Commit History
models : add `q8_0` models to `download-ggml-model.sh` (#2589)
7feeb43
ruby : Follow source tree change (#2580)
7895d75
whisper : use backend registry (#0)
b9f5e40
ggml/sched : do not skip views in pre-assignments
b1eba61
slaren
whisper : adapt to new ggml (wip)
ec6f374
talk-llama : sync llama.cpp
1568fc8
sync : ggml
e3c317a
ggml : sync resolve (skip) (#0)
d4d67dc
Add required ggml-base and backend libs to cmake pkg (llama/10407)
8fdd994
bandoti
cuda : fix CUDA_FLAGS not being applied (llama/10403)
22e1593
Diego Devesa
sycl : Add option to set the SYCL architecture for all targets (llama/10266)
0d836df
Romain Biessy
vulkan: Optimize soft_max (llama/10301)
5cb851d
sycl: Revert MUL_MAT_OP support changes (llama/10385)
6df9941
Alberto Cabrera Pérez
cuda : only use native when supported by cmake (llama/10389)
24d2e82
Diego Devesa
vulkan: remove use of null initializer (llama/10372)
dacdc69
metal : fix offset integer overflows in im2col (ggml/1015)
efbd100
Vulkan: Fix device info output format specifiers (llama/10366)
8000df9
metal : add `GGML_UNARY_OP_ELU` kernel (ggml/1018)
5959420
CUDA: fix MMV kernel being used for FP16 src1 (llama/10357)
af4dff1
CMake: fix typo in comment [no ci] (llama/10360)
d324d0b
llama : only use default buffer types for the KV cache (llama/10358)
9e9c0ad
Diego Devesa
metal : refactor kernel args into structs (llama/10238)
15659b4
ggml : fix undefined reference to 'getcpu' (llama/10354)
2f9b147
FirstTimeEZ
CUDA: remove DMMV, consolidate F16 mult mat vec (llama/10318)
e446f60
CMake: default to -arch=native for CUDA build (llama/10320)
66edfb6
ggml : fix possible buffer use after free in sched reserve (llama/9930)
4703ea3
Diego Devesa
ggml : inttypes.h -> cinttypes (llama/0)
6ba2c8f
ggml : adapt AMX to tensor->grad removal (llama/0)
8a67e9f
ggml : fix compile warnings (llama/0)
80d6ec0
llamafile : fix include path (llama/0)
e443f89
vulkan: Optimize some mat-vec mul quant shaders (llama/10296)
dc0e685
ggml : optimize Q4_0 into Q4_0_X_Y repack (llama/10324)
abf6f22
Dan Johansson
Fix issues with clang-cl builds when using AVX512 flags (llama/10314)
2868c2b
Srihari-mcw
ggml: new optimization interface (ggml/988)
dd33ace
ggml : remove duplicated sources from the last sync (ggml/1017)
026d20b
ggml : fix some build issues
c5ba1d1
slaren
sync : leftovers (ggml/0)
0f6c498
cmake : restore CMakeLists.txt (llama/10256)
51a70ff
AVX BF16 and single scale quant optimizations (llama/10212)
e6ffed3
Eve
sycl: Use syclcompat::dp4a (llama/10267)
ce0dc30
Romain Biessy
backend cpu: add online flow for aarch64 Q4_0 GEMV/GEMM kernels (llama/9921)
3541ee8
Charles Xu
Diego Devesa
ggml : build backends as libraries (llama/10256)
3dc93f3
scripts : update sync
1741306
release : v1.7.2
414329d
sycl: fix example build (#2570)
a0dcffc
Stefan Sydow