Commits · Xenobd/whisper.cpp

vulkan: implement initial support for IQ2 and IQ3 quantizations (llama/11360)

bd93c1b

Rémy Oudompheng

jeffbolznv committed on Jan 29

vulkan: Catch pipeline creation failure and print an error message (llama/11436)

d4f6b2c

jeffbolznv committed on Jan 29

vulkan: compile shaders on-demand (llama/11406)

5c008f7

jeffbolznv committed on Jan 25

Vulkan-run-test: fix mmq_wg_denoms (llama/11343)

133a580

amd-dwang committed on Jan 23

vulkan: fix diag_mask_inf (llama/11323)

f76204e

jeffbolznv committed on Jan 23

vulkan: fix coopmat2 validation failures (llama/11284)

f2cc7e9

jeffbolznv committed on Jan 20

vulkan: fix coopmat2 flash attention for non-contiguous inputs (llama/11281)

e0e73fa

jeffbolznv committed on Jan 18

vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (llama/11166)

3bb9e77

jeffbolznv committed on Jan 16

Vulkan: Fix float16 use on devices without float16 support + fix subgroup_size_control validation error (llama/11161)

5ad3f1d

OccamRazor committed on Jan 10

llama: add support for QRWKV6 model architecture (llama/11001)

4a6b7e0

mollysama

ggerganov

compilade committed on Jan 10

Disable GL_KHR_cooperative_matrix Vulkan extension if not available. (llama/11117)

623b74d

mbaudier committed on Jan 8

Vulkan: Add device-specific blacklist for coopmat for the AMD proprietary driver (llama/11074)

4d90c3d

OccamRazor committed on Jan 4

vulkan: optimize mul_mat for small values of N (llama/10991)

5fc8eea

jeffbolznv committed on Dec 30, 2024

vulkan: im2col and matmul optimizations for stable diffusion (llama/10942)

beef268

jeffbolznv committed on Dec 29, 2024

vulkan: Use push constant offset to handle misaligned descriptors (llama/10987)

04e729a

jeffbolznv committed on Dec 29, 2024

vulkan: multi-row k quants (llama/10846)

3bf5be1

Eve committed on Dec 26, 2024

vulkan: build fixes for 32b (llama/10927)

f1e76ce

jeffbolznv committed on Dec 22, 2024

vulkan: bugfixes for small subgroup size systems + llvmpipe test (llama/10809)

9220b51

Eve committed on Dec 17, 2024

rwkv6: add wkv6 support for Vulkan backend (llama/10829)

c7285d6

Zhiyuan Li

mollysama committed on Dec 16, 2024

llama : add Qwen2VL support + multimodal RoPE (llama/10361)

219d12b

RzZ

ggerganov committed on Dec 14, 2024

vulkan: small mul_mat_vec optimizations (llama/10665)

ec98109

Eve committed on Dec 13, 2024

Vulkan: Add VK_EXT_subgroup_size_control support to ensure full subgroups for coopmats (llama/10721)

488f19e

OccamRazor committed on Dec 12, 2024

vulkan: request round-to-even for fp16 in im2col/rope_head (llama/10767)

461484c

jeffbolznv committed on Dec 10, 2024

vulkan: dynamic subgroup size for the remaining k quants (llama/10745)

1bbdb81

Eve committed on Dec 10, 2024

vulkan: disable spirv-opt for coopmat shaders (llama/10763)

2ac53b2

jeffbolznv committed on Dec 10, 2024

vulkan: fix compile warnings (llama/10731)

cdcb67c

jeffbolznv committed on Dec 9, 2024

vulkan: compile a test shader in cmake to check for coopmat2 support (llama/10713)

980eeb3

jeffbolznv committed on Dec 8, 2024

Vulkan: VK_KHR_cooperative_matrix support to speed up prompt processing (llama/10597)

9a4de04

OccamRazor committed on Dec 7, 2024

vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (llama/10206)

d10b47b

jeffbolznv committed on Dec 5, 2024

vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (llama/10642)

e9ee893

jeffbolznv committed on Dec 4, 2024

vulkan: optimize and reenable split_k (llama/10637)

bca95f5

jeffbolznv committed on Dec 3, 2024

vulkan: Dynamic subgroup size support for Q6_K mat_vec (llama/10536)

59600b5

Eve committed on Nov 30, 2024

vulkan: get the first command buffer submitted sooner (llama/10499)

e1c1e73

jeffbolznv committed on Nov 29, 2024

vulkan: Handle GPUs with less shared memory (llama/10468)

18a0ad1

jeffbolznv committed on Nov 27, 2024

vulkan: fix group_norm (llama/10496)

8f5eeb8

jeffbolznv committed on Nov 26, 2024

ggml : add support for dynamic loading of backends (llama/10469)

b73266f

Diego Devesa

ggerganov committed on Nov 25, 2024

vulkan: further optimize mul_mat_vec using larger loads (llama/10387)

50a2978

jeffbolznv committed on Nov 20, 2024

vulkan: Optimize soft_max (llama/10301)

5cb851d

jeffbolznv committed on Nov 19, 2024

Vulkan: Fix device info output format specifiers (llama/10366)

8000df9

OccamRazor committed on Nov 18, 2024

vulkan: Optimize some mat-vec mul quant shaders (llama/10296)

dc0e685

jeffbolznv committed on Nov 16, 2024

sync : leftovers (ggml/0)

0f6c498

ggerganov committed on Nov 15, 2024

ggml : build backends as libraries (llama/10256)

3dc93f3

Diego Devesa

ggerganov R0CKSTAR committed on Nov 14, 2024
