Commit History
backend_sched : fix assignments cb91db5 unverified
slaren committed on
llama : ggml-backend integration (llama/4766) 362430b unverified
CUDA: fix softmax compile for old CUDA versions (llama/4862) 5eda533 unverified
models : make all scripts to be POSIX Compliant (#1725) f7aef3e unverified
ggml : fix 32-bit ARM compat for IQ2_XS (#1758) d5836c9 unverified
go : add SetInitialPrompt method to bindings (#1753) 5fd6678 unverified
server : add more parameters to server api (#1754) cb0cf7b unverified
George Hindle committed on
whisper : fix segment length with params.no_timestamps == true 720d738 unverified
params : don't compute timestamps when not printing them (#1755) 251825e unverified
George Hindle committed on
talk-llama : sync llama.cpp f33490f unverified
swift : remove local ggml.h reference 98b68e8 unverified
swift : track ggml release branch ece2b9d unverified
sync : ggml 9af4c11 unverified
sync : llama.cpp 569565f unverified
ggml : SOTA 2-bit quants (add IQ2_XS) (llama/4856) 5e827d5 unverified
metal : put encoder debug group behind a define (llama/4873) 6e822b8 unverified
Paul Tsochantaris committed on
metal : improve dequantize precision to match CPU (llama/4836) f2da2a4 unverified
ggml : fix vld1q_s8_x4 32-bit compat (llama/4828) efed5ba unverified
CUDA: faster softmax via shared memory + fp16 math (llama/4742) 52c45b9 unverified
metal : fix deprecation warning (ggml/690) b1e29bc unverified
ggml : remove ggml_cpy_inplace and ggml_cont_inplace (ggml/693) 6469bfe unverified
Timothy Cronin committed on
metal : wrap each operation in debug group (ggml/690) b5e360f unverified
Jack Mousseau committed on
ggml : change GGML_MAX_NAME at compile time (ggml/682) ded2b1a unverified
Fix execlp call (ggml/689) abda16e unverified
Halalaluyafail3 committed on
SOTA 2-bit quants (llama/4773) 75de5bf unverified
CUDA: fixed redundant value dequantization (llama/4809) 70c8d60 unverified
ggml : use __builtin_amdgcn_sudot4 in __dp4a for gfx11 (llama/4787) f391d7a unverified
Konstantin Zhuravlyov committed on
ggml : do not sched_yield when calling BLAS (llama/4761) 5d1dffc unverified
ggml : include stdlib.h before intrin.h (llama/4736) 743cace unverified
swift : checkout ggml commit instead of branch (#1750) 6ab88cc unverified
Alexandru Mariuti committed on
talk-llama : add optional Piper TTS support (#1749) fb92e62 unverified
RhinoDevel committed on
server : add request path option (#1741) 6c319ac unverified
main : add cli option to disable system prints (#1740) 97e710a unverified
server : fix server temperature + add temperature_inc (#1729) 8a648fc unverified
talk-llama : sync latest llama.cpp 42123fc unverified
release : v1.5.4 96799a3 unverified
fix : cuda order of synchronization when setting a buffer (ggml/679) e48c553 unverified
Erik Scholz, slaren committed on