ruby : bug fix on callbacks and no_speech_prob (#2656) 39a4558 KitaitiMakoto committed on Dec 21, 2024
server : add no-speech threshold parameter and functionality (#2654) 8e40db9 sachaarbonel committed on Dec 21, 2024
whisper : rename suppress_non_speech_tokens to suppress_nst (#2653) 5b0631d ggerganov committed on Dec 21, 2024
server : add option to suppress non-speech tokens (#2649) 647c7e7 sachaarbonel committed on Dec 21, 2024
whisper : rename binaries + fix install (#2648) 30197de ggerganov committed on Dec 21, 2024
ggml : update ggml_backend_cpu_device_supports_op (llama/10867) 2f11d1e ggerganov committed on Dec 17, 2024
vulkan: bugfixes for small subgroup size systems + llvmpipe test (llama/10809) 9220b51 Eve committed on Dec 17, 2024
rwkv6: add wkv6 support for Vulkan backend (llama/10829) c7285d6 Zhiyuan Li, mollysama committed on Dec 16, 2024
llama : add Qwen2VL support + multimodal RoPE (llama/10361) 219d12b RzZ, ggerganov committed on Dec 14, 2024
Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (llama/10693) 83a0899 lhez, Skyler Szot, Shangqing Gu, Alexander Angus, Hongqiang Wang, Max Krasnyansky committed on Dec 13, 2024
Fix crash caused by ggml_backend_load_all when launching on Android Activity (llama/10812) e1df33d 谢乃闻, Diego Devesa committed on Dec 13, 2024
SYCL: Reduce most of the compiler warnings (llama/10748) 050e6ce qnixsynapse, Abhilash Majumder committed on Dec 13, 2024
ggml : Fix compilation issues on ARM platform when building without fp16 (llama/10811) f76ba41 Karol Kontny committed on Dec 13, 2024
Vulkan: Use improved q4_k and q5_k dequant code in dequant shaders (llama/10798) a812efc OccamRazor committed on Dec 12, 2024
Vulkan: Add VK_EXT_subgroup_size_control support to ensure full subgroups for coopmats (llama/10721) 488f19e OccamRazor committed on Dec 12, 2024
ggml: load all backends from a user-provided search path (llama/10699) c6de218 Gilad S, Diego Devesa committed on Dec 11, 2024
vulkan: request round-to-even for fp16 in im2col/rope_head (llama/10767) 461484c jeffbolznv committed on Dec 10, 2024
vulkan: dynamic subgroup size for the remaining k quants (llama/10745) 1bbdb81 Eve committed on Dec 10, 2024
CUDA: rename macros to avoid conflicts with WinAPI (llama/10736) 8544072 Andreas Kieslinger committed on Dec 10, 2024
vulkan: disable spirv-opt for coopmat shaders (llama/10763) 2ac53b2 jeffbolznv committed on Dec 10, 2024
ggml : remove return from ggml_gallocr_allocate_node (ggml/1048) f9d4408 danbev committed on Dec 14, 2024
CUDA: fix shared memory access condition for mmv (llama/10740) 99a4546 JohannesGaessler committed on Dec 9, 2024
Vulkan: fix NaN in tanh.comp with AMD proprietary driver on Windows (llama/10723) a618c84 stduhpf committed on Dec 8, 2024
vulkan: compile a test shader in cmake to check for coopmat2 support (llama/10713) 980eeb3 jeffbolznv committed on Dec 8, 2024
Vulkan: VK_KHR_cooperative_matrix support to speed up prompt processing (llama/10597) 9a4de04 OccamRazor committed on Dec 7, 2024
metal : Extend how Llama.cpp locates metal resources (llama/10676) 44e7250 Robert Ormandi, ggerganov committed on Dec 7, 2024
vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (llama/10206) d10b47b jeffbolznv committed on Dec 5, 2024