Commits · natasa365/whisper.cpp

Basic Vulkan Multi-GPU implementation (llama/5321)

5d130aa
unverified

OccamRazor slaren committed on Feb 7, 2024

CUDA: mul_mat_vec_q max. batch size 8 -> 4 (llama/5370)

7aa3216
unverified

JohannesGaessler committed on Feb 6, 2024

Slight quantization improvement for Q4_K and Q5_K (llama/5361)

e3cd020
unverified

Kawrakow

ikawrakow committed on Feb 6, 2024

CUDA: mul_mat_vec_q for batch sizes > 1 (llama/5351)

ae45b38
unverified

JohannesGaessler committed on Feb 6, 2024

ggml : make use of ggml-quants.h possible in C++ code (llama/5338)

963ade6
unverified

Kawrakow

ikawrakow committed on Feb 5, 2024

ggml : avoid duplicating function calls using MIN/MAX macros (llama/5325)

9bb2b0a
unverified

Dr. Tom Murphy VII Ph.D

ggerganov committed on Feb 5, 2024

iq2_xxs: tune quantization (llama/5320)

11e5f6b
unverified

Kawrakow

ikawrakow committed on Feb 5, 2024

cuda : fix LLAMA_CUDA_F16 (llama/5262)

5fd8fb7
unverified

slaren committed on Feb 1, 2024

metal : add im2col F32 dst support (llama/5132)

26aec77
unverified

ggerganov committed on Jan 31, 2024

llava : add MobileVLM support (llama/5132)

f17a416
unverified

JidongZhang-THU slaren committed on Jan 31, 2024

ggml : limit n_threads to the max n_tasks (llama/5238)

2645c33
unverified

slaren committed on Jan 31, 2024

kompute : llama-bench support and ggml_cpu_has_kompute() (llama/5226)

0c9c434
unverified

Cebtenzzre committed on Jan 31, 2024

ggml : add abort_callback for cpu backend (ggml/725)

a8ea91b
unverified

Michael Podvitskiy committed on Feb 9, 2024

extra : update sync scripts

d99e873
unverified

ggerganov committed on Feb 10, 2024

server : allow CORS request with authorization headers (#1850)

16a6639
unverified

Valentin Gosu committed on Feb 9, 2024

whisper.android : how to build with CLBlast (#1809)

eea7f53
unverified

lcfrs

ggerganov committed on Feb 9, 2024

whisper : expose CUDA device setting in public API (#1840)

d13ee66
unverified

Didzis Gosko committed on Feb 9, 2024

make : add macOS deployment target option (#1839)

9c90601
unverified

Didzis Gosko committed on Feb 9, 2024

talk-llama : stream response (#1121)

2193f2b
unverified

ggerganov committed on Feb 6, 2024

sync : ggml (#0)

fded75b
unverified

ggerganov committed on Jan 30, 2024

ggml : fix IQ3_XXS on Metal (llama/5219)

f066321
unverified

Kawrakow

ikawrakow committed on Jan 30, 2024

sync : ggml (llama/0)

cdb7964
unverified

ggerganov committed on Jan 30, 2024

Faster AVX2 dot product for IQ2_XS (llama/5187)

187ae44
unverified

Kawrakow

ikawrakow

PeterReid committed on Jan 30, 2024

SOTA 3-bit quants (llama/5196)

4649943
unverified

Kawrakow

ikawrakow committed on Jan 30, 2024

ggml alloc: Fix for null dereference on alloc failure (llama/5200)

8181686
unverified

Paul Tsochantaris committed on Jan 29, 2024

Nomic Vulkan backend (llama/4456)

f5fd92d
unverified

Cebtenzzre niansa

manyoso

apage43 ToKiNoBug

ggerganov slaren committed on Jan 29, 2024

ggml : add max buffer sizes to opencl and metal backends (llama/5181)

3d354d0
unverified

slaren committed on Jan 29, 2024

metal : free metal objects (llama/5161)

ea7167a
unverified

Paul Tsochantaris committed on Jan 28, 2024

gguf : fix comparison (ggml/715)

80cfca4
unverified

ggerganov committed on Jan 29, 2024

`ggml_cuda_cpy` support for 4d tensors and float16->float32 upcasting (ggml/686)

75d438c
unverified

John Balis slaren committed on Jan 29, 2024

gguf : add input validation, prevent integer overflows (ggml/709)

5bf1614
unverified

ggerganov committed on Jan 29, 2024

ci : fix yolo URLs + fix metal capture (ggml/712)

588f789
unverified

ggerganov committed on Jan 29, 2024

metal : add debug capture backend function (ggml/694)

ece88c3
unverified

Jack Mousseau

ggerganov committed on Jan 29, 2024

common : fix wav buffer detection (#1819)

bc84057
unverified

JacobLinCool committed on Jan 30, 2024

server : add fields to `verbose_json` response (#1802)

763d09d
unverified

JacobLinCool committed on Jan 30, 2024

make : update MSYS_NT (#1813)

587152f
unverified

jwijffels committed on Jan 30, 2024

talk-llama : sync llama.cpp

1453539
unverified

ggerganov committed on Jan 28, 2024

sync : ggml

278a9b3
unverified

ggerganov committed on Jan 28, 2024

ggml : add Vulkan backend (llama/2059)

5a97aba
unverified

OccamRazor

SlyEcho Concedo slaren

ggerganov committed on Jan 28, 2024

ggml : add unified SYCL backend for Intel GPUs (llama/2690)

01169e0
unverified

Abhilash Majumder jianyuzh

KevinLy

hengyu

ggerganov committed on Jan 28, 2024

ggml : minor type fix (int64_t -> size_t)

1bbb1a9
unverified

ggerganov committed on Jan 28, 2024

common : fix input buffer check (#1812)

6c38a7f
unverified

ggerganov committed on Jan 27, 2024

talk-llama : sync llama.cpp

92cfd93
unverified

ggerganov committed on Jan 27, 2024

sync : ggml

5a9540e
unverified

ggerganov committed on Jan 27, 2024

Add OpenCL add kernel (llama/5151)

f833987
unverified

OccamRazor committed on Jan 26, 2024

cuda : fix tensor size calculation for non-split buffer (llama/5145)

8f3eb65
unverified

slaren committed on Jan 26, 2024

ggml-alloc : add 10% margin to the buffer sizes (llama/5149)

c55bdf8
unverified

slaren committed on Jan 26, 2024

ggml : update softmax n_task calculation (llama/5126)

3a3eb8e
unverified

snadampal committed on Jan 26, 2024

metal : remove unused `n_buffers` and `buffers` (llama/5129)

a3e87d3
unverified

Paul Tsochantaris committed on Jan 26, 2024

metal : show compile log messages

ae08f31
unverified

ggerganov committed on Jan 25, 2024
