ggml : add abort_callback for cpu backend (ggml/725) a8ea91b unverified Michael Podvitskiy committed on Feb 9, 2024
server : allow CORS request with authorization headers (#1850) 16a6639 unverified Valentin Gosu committed on Feb 9, 2024
whisper.android : how to build with CLBlast (#1809) eea7f53 unverified lcfrs ggerganov committed on Feb 9, 2024
whisper : expose CUDA device setting in public API (#1840) d13ee66 unverified Didzis Gosko committed on Feb 9, 2024
make : add macOS deployment target option (#1839) 9c90601 unverified Didzis Gosko committed on Feb 9, 2024
ggml : fix IQ3_XXS on Metal (llama/5219) f066321 unverified Kawrakow ikawrakow committed on Jan 30, 2024
Faster AVX2 dot product for IQ2_XS (llama/5187) 187ae44 unverified Kawrakow ikawrakow PeterReid committed on Jan 30, 2024
ggml alloc: Fix for null dereference on alloc failure (llama/5200) 8181686 unverified Paul Tsochantaris committed on Jan 29, 2024
Nomic Vulkan backend (llama/4456) f5fd92d unverified Cebtenzzre niansa manyoso apage43 ToKiNoBug ggerganov slaren committed on Jan 29, 2024
ggml : add max buffer sizes to opencl and metal backends (llama/5181) 3d354d0 unverified slaren committed on Jan 29, 2024
metal : free metal objects (llama/5161) ea7167a unverified Paul Tsochantaris committed on Jan 28, 2024
`ggml_cuda_cpy` support for 4d tensors and float16->float32 upcasting (ggml/686) 75d438c unverified John Balis slaren committed on Jan 29, 2024
gguf : add input validation, prevent integer overflows (ggml/709) 5bf1614 unverified ggerganov committed on Jan 29, 2024
ci : fix yolo URLs + fix metal capture (ggml/712) 588f789 unverified ggerganov committed on Jan 29, 2024
metal : add debug capture backend function (ggml/694) ece88c3 unverified Jack Mousseau ggerganov committed on Jan 29, 2024
server : add fields to `verbose_json` response (#1802) 763d09d unverified JacobLinCool committed on Jan 30, 2024
ggml : add Vulkan backend (llama/2059) 5a97aba unverified OccamRazor SlyEcho Concedo slaren ggerganov committed on Jan 28, 2024
ggml : add unified SYCL backend for Intel GPUs (llama/2690) 01169e0 unverified Abhilash Majumder jianyuzh KevinLy hengyu ggerganov committed on Jan 28, 2024
cuda : fix tensor size calculation for non-split buffer (llama/5145) 8f3eb65 unverified slaren committed on Jan 26, 2024
ggml-alloc : add 10% margin to the buffer sizes (llama/5149) c55bdf8 unverified slaren committed on Jan 26, 2024
ggml : update softmax n_task calculation (llama/5126) 3a3eb8e unverified snadampal committed on Jan 26, 2024
metal : remove unused `n_buffers` and `buffers` (llama/5129) a3e87d3 unverified Paul Tsochantaris committed on Jan 26, 2024
cuda : fix 2-bit quants on amd hip (llama/5105) aadbd67 unverified Engininja2 committed on Jan 24, 2024
llama : pre-allocate input tensors in a separate buffer (llama/5100) 20a4ca1 unverified slaren committed on Jan 24, 2024
CUDA: more info when no device code (llama/5088) e96ba7d unverified JohannesGaessler committed on Jan 23, 2024
minor : clean-up some warnings and style (llama/5094) 7df090b unverified ggerganov committed on Jan 23, 2024
ggml : parallelize FP32 conversion when using BLAS (llama/5045) 7bf2c87 unverified reinforce20001 ggerganov committed on Jan 22, 2024
llava : MobileVLM support (llama/4954) dc8f956 unverified cxt123 Chenxiaotao03 committed on Jan 22, 2024
llama : run all KQV ops on the CPU with no KV offload (llama/5049) 97ce95c unverified slaren committed on Jan 20, 2024
cuda : fix compile error in jetson platform (llama/4975) 0935414 unverified Kylin committed on Jan 20, 2024
docs : make model options / model install methods clearer (#1806) a2bec1d unverified mikey-rrr committed on Jan 26, 2024
cmake : make libwhisper.so position independent (#1792) 1cf1553 unverified trixirt committed on Jan 22, 2024