Commit History

sync : ggml
2ed0a44
unverified

ggerganov HF Staff committed on

backend_sched : fix assignments
cb91db5
unverified

slaren committed on

CUDA: fix softmax compile for old CUDA versions (llama/4862)
5eda533
unverified

JohannesGaessler committed on

Importance Matrix calculation (llama/4861)
c0b17f1
unverified

Kawrakow ikawrakow ggerganov HF Staff committed on

models : make all scripts to be POSIX Compliant (#1725)
f7aef3e
unverified

sonphantrung committed on

ggml : fix 32-bit ARM compat for IQ2_XS (#1758)
d5836c9
unverified

ggerganov HF Staff committed on

go : add SetInitialPrompt method to bindings (#1753)
5fd6678
unverified

blib321 committed on

server : add more parameters to server api (#1754)
cb0cf7b
unverified

George Hindle committed on

whisper : fix segment length with params.no_timestamps == true
720d738
unverified

ggerganov HF Staff committed on

params : don't compute timestamps when not printing them (#1755)
251825e
unverified

George Hindle committed on

talk-llama : sync llama.cpp
f33490f
unverified

ggerganov HF Staff committed on

swift : remove local ggml.h reference
98b68e8
unverified

ggerganov HF Staff committed on

swift : track ggml release branch
ece2b9d
unverified

ggerganov HF Staff committed on

sync : ggml
9af4c11
unverified

ggerganov HF Staff committed on

sync : llama.cpp
569565f
unverified

ggerganov HF Staff committed on

ggml : SOTA 2-bit quants (add IQ2_XS) (llama/4856)
5e827d5
unverified

Kawrakow ikawrakow committed on

metal : put encoder debug group behind a define (llama/4873)
6e822b8
unverified

Paul Tsochantaris committed on

metal : improve dequantize precision to match CPU (llama/4836)
f2da2a4
unverified

ggerganov HF Staff committed on

ggml : fix vld1q_s8_x4 32-bit compat (llama/4828)
efed5ba
unverified

ggerganov HF Staff committed on

CUDA: faster softmax via shared memory + fp16 math (llama/4742)
52c45b9
unverified

JohannesGaessler committed on

metal : fix deprecation warning (ggml/690)
b1e29bc
unverified

ggerganov HF Staff committed on

ggml : remove ggml_cpy_inplace and ggml_cont_inplace (ggml/693)
6469bfe
unverified

Timothy Cronin committed on

metal : wrap each operation in debug group (ggml/690)
b5e360f
unverified

Jack Mousseau committed on

ggml : change GGML_MAX_NAME at compile time (ggml/682)
ded2b1a
unverified

leejet committed on

Fix execlp call (ggml/689)
abda16e
unverified

Halalaluyafail3 committed on

SOTA 2-bit quants (llama/4773)
75de5bf
unverified

Kawrakow ikawrakow committed on

CUDA: fixed redundant value dequantization (llama/4809)
70c8d60
unverified

JohannesGaessler committed on

ggml : use __builtin_amdgcn_sudot4 in __dp4a for gfx11 (llama/4787)
f391d7a
unverified

Konstantin Zhuravlyov committed on

ggml : do not sched_yield when calling BLAS (llama/4761)
5d1dffc
unverified

ggerganov HF Staff committed on

ggml : include stdlib.h before intrin.h (llama/4736)
743cace
unverified

ggerganov HF Staff committed on

swift : checkout ggml commit instead of branch (#1750)
6ab88cc
unverified

Alexandru Mariuti committed on

talk-llama : add optional Piper TTS support (#1749)
fb92e62
unverified

RhinoDevel committed on

server : add request path option (#1741)
6c319ac
unverified

eschmidbauer committed on

main : add cli option to disable system prints (#1740)
97e710a
unverified

ggerganov HF Staff committed on

server : fix server temperature + add temperature_inc (#1729)
8a648fc
unverified

ggerganov HF Staff committed on

talk-llama : sync latest llama.cpp
42123fc
unverified

ggerganov HF Staff committed on

release : v1.5.4
96799a3
unverified

ggerganov HF Staff committed on

fix : cuda order of synchronization when setting a buffer (ggml/679)
e48c553
unverified

Erik Scholz slaren committed on

metal : switch back to default.metallib (ggml/681)
b945a8f
unverified

ggerganov HF Staff committed on

ggml : fix q2_k bpw in comments (ggml/680)
269f9a0
unverified

ggerganov HF Staff committed on

coreml : fix ANE optimized encoder (#1716)
a75904e
unverified

philloooo committed on

whisper.swiftui : add .gitignore
8061081
unverified

ggerganov HF Staff committed on

whisper : reset the "batched" timings (#1721)
f02be35
unverified

ggerganov HF Staff committed on

release : v1.5.3
1f8a047
unverified

ggerganov HF Staff committed on

swift : update Package.swift to use ggml as package dependency (#1701)
77f731f
unverified

1-ashraful-islam committed on

ggml : add error handling to graph_compute (#1714)
92f24ee
unverified

finnvoorhees committed on

cuda : simplify expression
cda4a91

ggerganov HF Staff slaren committed on

cuda : mark I16 and I32 ops as unsupported
cec288d

ggerganov HF Staff committed on

metal : add kernel_get_rows_i32
459dd87

ggerganov HF Staff committed on