Commit History
vulkan: Use larger workgroups for mul_mat_vec when M is small (llama/15355)
054584a
vulkan: support sqrt (llama/15370)
e5406c0
Committed by Dong Won Kim
vulkan: Optimize argsort (llama/15354)
80a188c
vulkan: fuse adds (llama/15252)
ad199b1
vulkan: Support mul_mat_id with f32 accumulators (llama/15337)
41a76e6
vulkan: Add missing bounds checking to scalar/coopmat1 mul_mat_id (llama/15334)
a6fa78e
vulkan : fix out-of-bounds access in argmax kernel (llama/15342)
78a1865
vulkan : fix compile warnings on macos (llama/15340)
e3107ff
vulkan: perf_logger improvements (llama/15246)
d48d508
finetune: SGD optimizer, more CLI args (llama/13873)
f585fe7
ggml : fix field name when new ggml_backend (llama/14944)
685748d
Committed by AN Long
vulkan: support fattn sinks (llama/15126)
d7e9115
vulkan: Add env var to disable host visible vidmem (llama/15109)
5ec4382
vulkan: fix build when using glslang that does not support coopmat2 (llama/15062)
863e083
vulkan: Use coopmat2 for conv2d (llama/14982)
6df82f4
vulkan: coopmat2 mul_mat optimizations (llama/14934)
ca86566
vulkan: Support ne[3]>1 in noncontig matrix-vector multiply (llama/15015)
d4c4115
vulkan: optimizations for direct convolution (llama/14933)
215f463
Vulkan: Fix minor debug mode issues (llama/14899)
a81bc86
vulkan : fix 32-bit builds (ggml/1313)
96b66fd
Committed by Kai Pastor
vulkan : add fp16 support for the conv_2d kernel (llama/14872)
48e92ad
Committed by Erik Scholz
vulkan: skip empty set_rows to avoid invalid API usage (llama/14860)
22fb24a
vulkan: fix rms_norm_mul to handle broadcasting dim0 (llama/14817)
0c16b60
vulkan/cuda: Fix im2col when KW!=KH (llama/14789)
0be0329
ggml: adds CONV_2D op and direct GEMM Vulkan implementation (llama/14316)
5885084
vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#13274) (llama/14707)
0855a18
Committed by Peter0x44
Vulkan: Fix fprintf format-security warning (llama/14770)
77a1c11
vulkan: fix noncontig check for mat_mul_id splitting (llama/14683)
4d0d8b8
vulkan: add RTE variants for glu/add/sub/mul/div (llama/14653)
bac21a7
sync : resolve conflicts (ggml/0)
497add0
vulkan: support SET_ROWS (llama/14587)
9821f43
vulkan: optimizations for deepseek prompt processing (llama/14555)
04b631e
ggml : add ggml_scale_bias (llama/14417)
573d50a
vulkan: optimize flash attention split_k_reduce (llama/14554)
45fbb42
vulkan : fix rope with partial rotation and non-cont src (llama/14582)
367fa85
vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (llama/14485)
effd61f
Committed by Eve and Rémy Oudompheng
vulkan: fix rms_norm+mul fusion (llama/14545)
0791e65
vulkan: Handle updated FA dim2/3 definition (llama/14518)
d1e619e
ggml : implement GEGLU_ERF and GEGLU_QUICK ops (llama/14445)
f798922
Committed by Sigbjørn Skjæret
vulkan: support mixed/deepseekR1 FA head sizes (llama/14509)
90cefa0
kv-cache : use ggml_set_rows (llama/14285)
7d6d9e8
ggml : fix FA mask dim 2 and 3 (llama/14505)
a89dc81
vulkan: support softmax/FA batch and broadcast (llama/14449)
f6b0b76
ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (llama/14435)
ebacb3e
vulkan: Split large mul_mat_id to fit in shared memory (llama/14451)
bf678f0
add GELU_ERF (llama/14455)
235ebf7
Committed by Sigbjørn Skjæret