Commits · Xenobd/whisper.cpp

ggml : reuse quantum structs across backends (llama/5943)

bb0625f
unverified

ggerganov committed on Mar 12, 2024

Better 1.5 bit quantization (llama/5971)

f3a62cc
unverified

Kawrakow

ikawrakow committed on Mar 11, 2024

ggml : remove old quantization functions (llama/5942)

11a2545
unverified

ggerganov committed on Mar 9, 2024

ggml : add ggml-common.h to deduplicate shared code (llama/5940)

0a37735
unverified

ggerganov committed on Mar 9, 2024

ggml : make i-quants work with super-blocks of 64 (CPU,Metal) (llama/5760)

9a07f42
unverified

Kawrakow

ikawrakow committed on Feb 28, 2024

IQ4_XS: a 4.25 bpw quantization (llama/5747)

0ee1bfb
unverified

Kawrakow

ikawrakow committed on Feb 27, 2024

Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (llama/5721)

2b9bb9e
unverified

Kawrakow

ikawrakow

ggerganov committed on Feb 26, 2024

IQ3_S: a much better alternative to Q3_K (llama/5676)

32589c9
unverified

Kawrakow

ikawrakow committed on Feb 24, 2024

sync : llama.cpp (ggml/0)

f8e8d34
unverified

ggerganov committed on Feb 21, 2024

1.5 bit quantization (llama/5453)

9c3aa6a
unverified

Kawrakow

ikawrakow committed on Feb 18, 2024

ggml : add mmla kernels for quantized GEMM (llama/4966)

0d50a29
unverified

snadampal committed on Feb 11, 2024

ggml : make use of ggml-quants.h possible in C++ code (llama/5338)

963ade6
unverified

Kawrakow

ikawrakow committed on Feb 5, 2024

SOTA 3-bit quants (llama/5196)

4649943
unverified

Kawrakow

ikawrakow committed on Jan 30, 2024

ggml : add IQ2 to test-backend-ops + refactoring (llama/4990)

227f2ae
unverified

ggerganov committed on Jan 17, 2024

ggml : importance matrix support for legacy quants (llama/4969)

d8bb9d8
unverified

Kawrakow

ikawrakow committed on Jan 16, 2024

Add ability to use importance matrix for all k-quants (llama/4930)

7032309
unverified

Kawrakow

ikawrakow committed on Jan 14, 2024

2-bit quantizations (llama/4897)

8a399ab
unverified

Kawrakow

ikawrakow committed on Jan 14, 2024

ggml : SOTA 2-bit quants (add IQ2_XS) (llama/4856)

5e827d5
unverified

Kawrakow

ikawrakow committed on Jan 11, 2024

SOTA 2-bit quants (llama/4773)

75de5bf
unverified

Kawrakow

ikawrakow committed on Jan 8, 2024

ggml : fix q2_k bpw in comments (ggml/680)

269f9a0
unverified

ggerganov committed on Jan 5, 2024

sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422)

7006035
unverified

ggerganov, Chris Raethke committed on Nov 3, 2023

Space: Xenobd/whisper.cpp (duplicated from natasa365/whisper.cpp)
