whisper.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp

Commit History

vulkan: Use larger workgroups for mul_mat_vec when M is small (llama/15355)
054584a

jeffbolznv, OccamRazor committed on

vulkan: support sqrt (llama/15370)
e5406c0

Dong Won Kim committed on

vulkan: Optimize argsort (llama/15354)
80a188c

jeffbolznv committed on

vulkan: fuse adds (llama/15252)
ad199b1

jeffbolznv committed on

vulkan: Support mul_mat_id with f32 accumulators (llama/15337)
41a76e6

jeffbolznv committed on

vulkan : fix out-of-bounds access in argmax kernel (llama/15342)
78a1865

ggerganov committed on

vulkan : fix compile warnings on macos (llama/15340)
e3107ff

ggerganov committed on

vulkan: perf_logger improvements (llama/15246)
d48d508

jeffbolznv committed on

ggml : fix field name when new ggml_backend (llama/14944)
685748d

AN Long committed on

vulkan: support fattn sinks (llama/15126)
d7e9115

jeffbolznv committed on

vulkan: Add env var to disable host visible vidmem (llama/15109)
5ec4382

jeffbolznv committed on

llama : add gpt-oss (llama/15091)
bf225d6

ggerganov, ngxson, slaren committed on

vulkan: fix build when using glslang that does not support coopmat2 (llama/15062)
863e083

jeffbolznv committed on

vulkan: Use coopmat2 for conv2d (llama/14982)
6df82f4

jeffbolznv committed on

vulkan: coopmat2 mul_mat optimizations (llama/14934)
ca86566

jeffbolznv committed on

vulkan: Support ne[3]>1 in noncontig matrix-vector multiply (llama/15015)
d4c4115

jeffbolznv committed on

Vulkan: Fix minor debug mode issues (llama/14899)
a81bc86

OccamRazor committed on

vulkan : fix 32-bit builds (ggml/1313)
96b66fd

Kai Pastor committed on

vulkan : add fp16 support for the conv_2d kernel (llama/14872)
48e92ad

Erik Scholz committed on

vulkan: skip empty set_rows to avoid invalid API usage (llama/14860)
22fb24a

jeffbolznv committed on

vulkan: fix rms_norm_mul to handle broadcasting dim0 (llama/14817)
0c16b60

jeffbolznv committed on

ggml: adds CONV_2D op and direct GEMM Vulkan implementation (llama/14316)
5885084

etasnadi committed on

vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#13274) (llama/14707)
0855a18

Peter0x44 committed on

vulkan: fix noncontig check for mat_mul_id splitting (llama/14683)
4d0d8b8

jeffbolznv committed on

vulkan: add RTE variants for glu/add/sub/mul/div (llama/14653)
bac21a7

jeffbolznv committed on

sync : resolve conflicts (ggml/0)
497add0

ggerganov committed on

vulkan: support SET_ROWS (llama/14587)
9821f43

jeffbolznv committed on

vulkan: optimizations for deepseek prompt processing (llama/14555)
04b631e

jeffbolznv committed on

ggml : add ggml_scale_bias (llama/14417)
573d50a

ngxson committed on

vulkan: optimize flash attention split_k_reduce (llama/14554)
45fbb42

jeffbolznv committed on

vulkan: fix rms_norm+mul fusion (llama/14545)
0791e65

jeffbolznv committed on

vulkan: Handle updated FA dim2/3 definition (llama/14518)
d1e619e

jeffbolznv committed on

ggml : implement GEGLU_ERF and GEGLU_QUICK ops (llama/14445)
f798922

Sigbjørn Skjæret committed on

vulkan: support mixed/deepseekR1 FA head sizes (llama/14509)
90cefa0

jeffbolznv committed on

kv-cache : use ggml_set_rows (llama/14285)
7d6d9e8

ggerganov committed on

ggml : fix FA mask dim 2 and 3 (llama/14505)
a89dc81

ggerganov committed on

vulkan: support softmax/FA batch and broadcast (llama/14449)
f6b0b76

jeffbolznv committed on

ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (llama/14435)
ebacb3e

ggerganov committed on

vulkan: Split large mul_mat_id to fit in shared memory (llama/14451)
bf678f0

jeffbolznv committed on

add GELU_ERF (llama/14455)
235ebf7

Sigbjørn Skjæret committed on

vulkan : implement bilinear interpolation for ggml_upscale/ggml_interpolate (ggml/1291)
666e65b

Acly committed on

vulkan : implement ggml_roll (ggml/1290)
968f9e8

Acly committed on

vulkan: Add fusion support for RMS_NORM+MUL (llama/14366)
737f12d

jeffbolznv, slaren committed on

vulkan: handle noncontig in the final case of ggml_vk_get_cpy_pipeline (llama/14378)
1c3b94c

jeffbolznv committed on

vulkan: lock accesses of pinned_memory vector (llama/14333)
59dca4f

jeffbolznv committed on

Add support for VK_EXT_debug_utils to add labels to Vulkan objects. (llama/13792)
2c3741a

Markus Tavenrath committed on

Vulkan: Set device max size for host memory to avoid OOM warning and fallback to CPU buffer (llama/14249)
08debcd

OccamRazor committed on