sycl: refactor quantization to q8_1 (llama/14815) 31edd77 Alberto Cabrera Pérez committed on Jul 28, 2025
node : add win platform check for require path (#3363) 29b8653 unverified danbev committed on Aug 15, 2025
whisper : fixed crash in GPU device selection on multi-GPU systems (#3372) 0869200 unverified Dw9 committed on Aug 12, 2025
stream.wasm : add language selection support (#3354) e8933c1 unverified danbev committed on Aug 2, 2025
whisper : reset conv scheduler when CoreML is used (#3350) f425556 unverified ggerganov committed on Jul 30, 2025
vulkan : add fp16 support for the conv_2d kernel (llama/14872) 48e92ad Erik Scholz committed on Jul 27, 2025
vulkan: skip empty set_rows to avoid invalid API usage (llama/14860) 22fb24a jeffbolznv committed on Jul 27, 2025
HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3 (llama/14624) 5422b31 deepsek committed on Jul 26, 2025
ggml-cpu : disable GGML_NNPA by default due to instability (llama/14880) cac085c taronaeo committed on Jul 25, 2025
ggml : remove invalid portPos specifiers from dot files (llama/14838) a91e2f3 ORippler committed on Jul 25, 2025
rpc : check for null buffers in get/set/copy tensor endpoints (llama/14868) 9a5c3ef ChrisRohlf committed on Jul 25, 2025
sched : fix multiple evaluations of the same graph with pipeline parallelism (llama/14855) e9f5612 Diego Devesa committed on Jul 25, 2025
sycl: fixed semantics of block offset calculation (llama/14814) d3d52a4 Alberto Cabrera Pérez committed on Jul 24, 2025
metal : fix fusion across different encoders (llama/14849) 17d67da ggerganov committed on Jul 24, 2025
sycl: fix undefined variable in work group size check (llama/14843) bcbbf47 Donghyeon Jeong committed on Jul 24, 2025
CUDA: fix overflow in FA, tune performance (llama/14840) 10ac92f JohannesGaessler committed on Jul 23, 2025
CUDA: fix compilation with GGML_CUDA_F16 (llama/14837) 2746afd JohannesGaessler committed on Jul 23, 2025
CUDA: fix quantized KV cache + multiple sequences (llama/14822) 88864af JohannesGaessler ggerganov committed on Jul 23, 2025
ggml: fix loongarch quantize_row_q8_1 error (llama/14827) 0bd2be3 lixing-star committed on Jul 23, 2025
vulkan: fix rms_norm_mul to handle broadcasting dim0 (llama/14817) 0c16b60 jeffbolznv committed on Jul 22, 2025
cuda : implement bf16 cpy ops and enable bf16 cont (llama/14763) b54b644 Sigbjørn Skjæret committed on Jul 22, 2025
ggml: adds CONV_2D op and direct GEMM Vulkan implementation (llama/14316) 5885084 etasnadi committed on Jul 19, 2025
vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#13274) (llama/14707) 0855a18 Peter0x44 committed on Jul 19, 2025
Vulkan: Fix fprintf format-security warning (llama/14770) 77a1c11 OccamRazor committed on Jul 19, 2025