Commit History
vulkan: support sqrt (llama/15370)
e5406c0
Dong Won Kim
committed on
vulkan: Optimize argsort (llama/15354)
80a188c
vulkan: fuse adds (llama/15252)
ad199b1
vulkan: Support mul_mat_id with f32 accumulators (llama/15337)
41a76e6
vulkan : fix out-of-bounds access in argmax kernel (llama/15342)
78a1865
vulkan : fix compile warnings on macos (llama/15340)
e3107ff
vulkan: perf_logger improvements (llama/15246)
d48d508
finetune: SGD optimizer, more CLI args (llama/13873)
f585fe7
ggml : fix field name when new ggml_backend (llama/14944)
685748d
AN Long
committed on
vulkan: support fattn sinks (llama/15126)
d7e9115
vulkan: Add env var to disable host visible vidmem (llama/15109)
5ec4382
vulkan: fix build when using glslang that does not support coopmat2 (llama/15062)
863e083
vulkan: Use coopmat2 for conv2d (llama/14982)
6df82f4
vulkan: coopmat2 mul_mat optimizations (llama/14934)
ca86566
vulkan: Support ne[3]>1 in noncontig matrix-vector multiply (llama/15015)
d4c4115
vulkan: optimizations for direct convolution (llama/14933)
215f463
Vulkan: Fix minor debug mode issues (llama/14899)
a81bc86
vulkan : fix 32-bit builds (ggml/1313)
96b66fd
Kai Pastor
committed on
vulkan : add fp16 support for the conv_2d kernel (llama/14872)
48e92ad
Erik Scholz
committed on
vulkan: skip empty set_rows to avoid invalid API usage (llama/14860)
22fb24a
vulkan: fix rms_norm_mul to handle broadcasting dim0 (llama/14817)
0c16b60
ggml: adds CONV_2D op and direct GEMM Vulkan implementation (llama/14316)
5885084
vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#13274) (llama/14707)
0855a18
Peter0x44
committed on
vulkan: fix noncontig check for mat_mul_id splitting (llama/14683)
4d0d8b8
vulkan: add RTE variants for glu/add/sub/mul/div (llama/14653)
bac21a7
sync : resolve conflicts (ggml/0)
497add0
vulkan: support SET_ROWS (llama/14587)
9821f43
vulkan: optimizations for deepseek prompt processing (llama/14555)
04b631e
ggml : add ggml_scale_bias (llama/14417)
573d50a
vulkan: optimize flash attention split_k_reduce (llama/14554)
45fbb42
vulkan: fix rms_norm+mul fusion (llama/14545)
0791e65
vulkan: Handle updated FA dim2/3 definition (llama/14518)
d1e619e
ggml : implement GEGLU_ERF and GEGLU_QUICK ops (llama/14445)
f798922
Sigbjørn Skjæret
committed on
vulkan: support mixed/deepseekR1 FA head sizes (llama/14509)
90cefa0
kv-cache : use ggml_set_rows (llama/14285)
7d6d9e8
ggml : fix FA mask dim 2 and 3 (llama/14505)
a89dc81
vulkan: support softmax/FA batch and broadcast (llama/14449)
f6b0b76
ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (llama/14435)
ebacb3e
vulkan: Split large mul_mat_id to fit in shared memory (llama/14451)
bf678f0
add GELU_ERF (llama/14455)
235ebf7
Sigbjørn Skjæret
committed on
vulkan : implement bilinear interpolation for ggml_upscale/ggml_interpolate (ggml/1291)
666e65b
vulkan : implement ggml_roll (ggml/1290)
968f9e8
ggml : implement REGLU/GEGLU/SWIGLU ops (llama/14158)
add5c0f
vulkan: Add fusion support for RMS_NORM+MUL (llama/14366)
737f12d
vulkan: handle noncontig in the final case of ggml_vk_get_cpy_pipeline (llama/14378)
1c3b94c
vulkan: lock accesses of pinned_memory vector (llama/14333)
59dca4f
Add support for VK_EXT_debug_utils to add labels to Vulkan objects. (llama/13792)
2c3741a
Markus Tavenrath
committed on