whisper.cpp / ggml-quants.h

Commit History

ggml : reuse quantum structs across backends (llama/5943)
bb0625f
unverified

ggerganov committed on

Better 1.5 bit quantization (llama/5971)
f3a62cc
unverified

Kawrakow ikawrakow committed on

ggml : remove old quantization functions (llama/5942)
11a2545
unverified

ggerganov committed on

ggml : add ggml-common.h to deduplicate shared code (llama/5940)
0a37735
unverified

ggerganov committed on

ggml : make i-quants work with super-blocks of 64 (CPU,Metal) (llama/5760)
9a07f42
unverified

Kawrakow ikawrakow committed on

IQ4_XS: a 4.25 bpw quantization (llama/5747)
0ee1bfb
unverified

Kawrakow ikawrakow committed on

Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (llama/5721)
2b9bb9e
unverified

Kawrakow ikawrakow ggerganov committed on

IQ3_S: a much better alternative to Q3_K (llama/5676)
32589c9
unverified

Kawrakow ikawrakow committed on

sync : llama.cpp (ggml/0)
f8e8d34
unverified

ggerganov committed on

1.5 bit quantization (llama/5453)
9c3aa6a
unverified

Kawrakow ikawrakow committed on

ggml : add mmla kernels for quantized GEMM (llama/4966)
0d50a29
unverified

snadampal committed on

ggml : make use of ggml-quants.h possible in C++ code (llama/5338)
963ade6
unverified

Kawrakow ikawrakow committed on

SOTA 3-bit quants (llama/5196)
4649943
unverified

Kawrakow ikawrakow committed on

ggml : add IQ2 to test-backend-ops + refactoring (llama/4990)
227f2ae
unverified

ggerganov committed on

ggml : importance matrix support for legacy quants (llama/4969)
d8bb9d8
unverified

Kawrakow ikawrakow committed on

Add ability to use importance matrix for all k-quants (llama/4930)
7032309
unverified

Kawrakow ikawrakow committed on

2-bit quantizations (llama/4897)
8a399ab
unverified

Kawrakow ikawrakow committed on

ggml : SOTA 2-bit quants (add IQ2_XS) (llama/4856)
5e827d5
unverified

Kawrakow ikawrakow committed on

SOTA 2-bit quants (llama/4773)
75de5bf
unverified

Kawrakow ikawrakow committed on

ggml : fix q2_k bpw in comments (ggml/680)
269f9a0
unverified

ggerganov committed on

sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422)
7006035
unverified

ggerganov Chris Raethke committed on