Commit History
cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129)
f959b90
cmdr2
committed on
vulkan: implement several ops relevant for ggml_opt (llama/11769)
3c2171d
Rémy O
committed on
vulkan: support multi/vision rope, and noncontiguous rope (llama/11902)
1c7a669
vulkan: initial support for IQ1_S and IQ1_M quantizations (llama/11528)
0d2e888
Rémy O
committed on
vulkan: linux builds + small subgroup size fixes (llama/11767)
e3f0e78
Eve
committed on
vulkan: Make Vulkan optional at runtime (ggml/11493). (llama/11494)
762f497
vulkan: add environment variable GGML_VK_PREFER_HOST_MEMORY to avoid VRAM allocation (llama/11592)
f9fd130
Wagner Bruna
committed on
vulkan: account for lookup tables when checking shared memory size (llama/11502)
758970f
vulkan: print shared memory size (llama/11719)
fb33a94
vulkan: initial support for IQ4_XS quantization (llama/11501)
ed46ad5
Rémy O
committed on
vulkan: use smaller combined allocations to avoid fragmentation (llama/11551)
1b7672d
CUDA: non-contiguous (RMS) norm support (llama/11659)
4c2e171
vulkan: implement initial support for IQ2 and IQ3 quantizations (llama/11360)
bd93c1b
vulkan: Catch pipeline creation failure and print an error message (llama/11436)
d4f6b2c
vulkan: compile shaders on-demand (llama/11406)
5c008f7
Vulkan-run-test: fix mmq_wg_denoms (llama/11343)
133a580
amd-dwang
committed on
vulkan: fix diag_mask_inf (llama/11323)
f76204e
vulkan: fix coopmat2 validation failures (llama/11284)
f2cc7e9
vulkan: fix coopmat2 flash attention for non-contiguous inputs (llama/11281)
e0e73fa
vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (llama/11166)
3bb9e77
Vulkan: Fix float16 use on devices without float16 support + fix subgroup_size_control validation error (llama/11161)
5ad3f1d
Disable GL_KHR_cooperative_matrix Vulkan extension if not available. (llama/11117)
623b74d
Vulkan: Add device-specific blacklist for coopmat for the AMD proprietary driver (llama/11074)
4d90c3d
vulkan: optimize mul_mat for small values of N (llama/10991)
5fc8eea
vulkan: im2col and matmul optimizations for stable diffusion (llama/10942)
beef268
vulkan: Use push constant offset to handle misaligned descriptors (llama/10987)
04e729a
vulkan: multi-row k quants (llama/10846)
3bf5be1
Eve
committed on
vulkan: build fixes for 32b (llama/10927)
f1e76ce
vulkan: bugfixes for small subgroup size systems + llvmpipe test (llama/10809)
9220b51
Eve
committed on
rwkv6: add wkv6 support for Vulkan backend (llama/10829)
c7285d6
vulkan: small mul_mat_vec optimizations (llama/10665)
ec98109
Eve
committed on
Vulkan: Add VK_EXT_subgroup_size_control support to ensure full subgroups for coopmats (llama/10721)
488f19e
vulkan: request round-to-even for fp16 in im2col/rope_head (llama/10767)
461484c
vulkan: dynamic subgroup size for the remaining k quants (llama/10745)
1bbdb81
Eve
committed on
vulkan: disable spirv-opt for coopmat shaders (llama/10763)
2ac53b2
vulkan: fix compile warnings (llama/10731)
cdcb67c
vulkan: compile a test shader in cmake to check for coopmat2 support (llama/10713)
980eeb3
Vulkan: VK_KHR_cooperative_matrix support to speed up prompt processing (llama/10597)
9a4de04
vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (llama/10206)
d10b47b
vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (llama/10642)
e9ee893
vulkan: optimize and reenable split_k (llama/10637)
bca95f5
vulkan: Dynamic subgroup size support for Q6_K mat_vec (llama/10536)
59600b5
Eve
committed on