Commits · Xenobd/whisper.cpp

vulkan : sync (llama/0)

4c17fa1

ggerganov committed on Mar 4

ggml : upgrade init_tensor API to return a ggml_status (llama/11854)

d6b6852

William Tambellini, slaren committed on Feb 28

vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (llama/11595)

d7d82b9

Rémy O committed on Feb 28

vulkan: matmul dequantization improvements (llama/12015)

ffdf466

Eve committed on Feb 28

vulkan: improve im2col (llama/11826)

f6cff0a

Daniele committed on Feb 28

vulkan: fix assertion when qy_needs_dequant (llama/12068)

271c7e4

jeffbolznv committed on Feb 25

cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129)

f959b90

cmdr2 committed on Feb 28

vulkan: implement several ops relevant for ggml_opt (llama/11769)

3c2171d

Rémy O committed on Feb 17

vulkan: support multi/vision rope, and noncontiguous rope (llama/11902)

1c7a669

jeffbolznv committed on Feb 16

vulkan: initial support for IQ1_S and IQ1_M quantizations (llama/11528)

0d2e888

Rémy O committed on Feb 15

vulkan: linux builds + small subgroup size fixes (llama/11767)

e3f0e78

Eve committed on Feb 14

vulkan: Make Vulkan optional at runtime (ggml/11493). (llama/11494)

762f497

Danny Milosavljevic, jeffbolznv committed on Feb 10

vulkan: add environment variable GGML_VK_PREFER_HOST_MEMORY to avoid VRAM allocation (llama/11592)

f9fd130

Wagner Bruna committed on Feb 10

vulkan: account for lookup tables when checking shared memory size (llama/11502)

758970f

jeffbolznv committed on Feb 9

vulkan: print shared memory size (llama/11719)

fb33a94

jeffbolznv committed on Feb 7

vulkan: optimize coopmat2 iq2/iq3 callbacks (llama/11521)

3731f13

jeffbolznv committed on Feb 6

vulkan: initial support for IQ4_XS quantization (llama/11501)

ed46ad5

Rémy O committed on Feb 6

vulkan: use smaller combined allocations to avoid fragmentation (llama/11551)

1b7672d

jeffbolznv committed on Feb 6

CUDA: non-contiguous (RMS) norm support (llama/11659)

4c2e171

JohannesGaessler, ggerganov committed on Feb 4

vulkan: implement initial support for IQ2 and IQ3 quantizations (llama/11360)

bd93c1b

Rémy Oudompheng, jeffbolznv committed on Jan 29

vulkan: Catch pipeline creation failure and print an error message (llama/11436)

d4f6b2c

jeffbolznv committed on Jan 29

vulkan: compile shaders on-demand (llama/11406)

5c008f7

jeffbolznv committed on Jan 25

Vulkan-run-test: fix mmq_wg_denoms (llama/11343)

133a580

amd-dwang committed on Jan 23

vulkan: sort shaders for more deterministic binary (llama/11315)

d7c0046

jeffbolznv committed on Jan 23

vulkan: fix diag_mask_inf (llama/11323)

f76204e

jeffbolznv committed on Jan 23

vulkan: fix coopmat2 validation failures (llama/11284)

f2cc7e9

jeffbolznv committed on Jan 20

vulkan: fix coopmat2 flash attention for non-contiguous inputs (llama/11281)

e0e73fa

jeffbolznv committed on Jan 18

vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (llama/11166)

3bb9e77

jeffbolznv committed on Jan 16

vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (llama/11206)

ee122d3

jeffbolznv committed on Jan 16

vulkan: optimize coopmat2 q2_k dequant function (llama/11130)

d49a569

jeffbolznv committed on Jan 16

vulkan: scale caching for k quants + misc fixes (llama/11081)

03ab36f

Eve committed on Jan 15

fix: ggml: fix vulkan-shaders-gen build (llama/10448)

ad8f031

Sparkleholic committed on Jan 15

Vulkan: Fix float16 use on devices without float16 support + fix subgroup_size_control validation error (llama/11161)

5ad3f1d

OccamRazor committed on Jan 10

llama: add support for QRWKV6 model architecture (llama/11001)

4a6b7e0

mollysama, ggerganov, compilade committed on Jan 10

Disable GL_KHR_cooperative_matrix Vulkan extension if not available. (llama/11117)

623b74d

mbaudier committed on Jan 8

fix: Vulkan shader gen binary path when Cross-compiling (llama/11096)

966a7bb

ag2s20150909 committed on Jan 8

Vulkan: Add device-specific blacklist for coopmat for the AMD proprietary driver (llama/11074)

4d90c3d

OccamRazor committed on Jan 4

fix: Vulkan shader gen binary path (llama/11037)

7008fb8

Gilad S. committed on Jan 4

vulkan: optimize mul_mat for small values of N (llama/10991)

5fc8eea

jeffbolznv committed on Dec 30, 2024

vulkan: im2col and matmul optimizations for stable diffusion (llama/10942)

beef268

jeffbolznv committed on Dec 29, 2024

vulkan: Use push constant offset to handle misaligned descriptors (llama/10987)

04e729a

jeffbolznv committed on Dec 29, 2024

vulkan: multi-row k quants (llama/10846)

3bf5be1

Eve committed on Dec 26, 2024

examples, ggml : fix GCC compiler warnings (llama/10983)

d7cf559

Peter committed on Dec 26, 2024

vulkan: build fixes for 32b (llama/10927)

f1e76ce

jeffbolznv committed on Dec 22, 2024

vulkan: optimize coopmat2 dequant functions (llama/10855)

5e70c43

jeffbolznv committed on Dec 21, 2024

vulkan: bugfixes for small subgroup size systems + llvmpipe test (llama/10809)

9220b51

Eve committed on Dec 17, 2024

rwkv6: add wkv6 support for Vulkan backend (llama/10829)

c7285d6

Zhiyuan Li, mollysama committed on Dec 16, 2024

llama : add Qwen2VL support + multimodal RoPE (llama/10361)

219d12b

RzZ, ggerganov committed on Dec 14, 2024

vulkan: small mul_mat_vec optimizations (llama/10665)

ec98109

Eve committed on Dec 13, 2024

Vulkan: Use improved q4_k and q5_k dequant code in dequant shaders (llama/10798)

a812efc

OccamRazor committed on Dec 12, 2024
