Commits · natasa365/whisper.cpp

cuda : optimize argmax (llama/10441)

69ae50d

Diego Devesa

JohannesGaessler committed on Nov 21, 2024

vulkan: predicate max operation in soft_max shaders/soft_max (llama/10437)

0a14325

jeffbolznv committed on Nov 20, 2024

vulkan: copy iq4_nl LUT into shared memory (llama/10409)

c31abdb

jeffbolznv committed on Nov 20, 2024

vulkan: further optimize mul_mat_vec using larger loads (llama/10387)

50a2978

jeffbolznv committed on Nov 20, 2024

add cmake rvv support (llama/10411)

e0bf47c

haopeng committed on Nov 19, 2024

CUDA: remove unnecessary warp reduce in FA (ggml/1032)

9a8c238

mahorozte

mahorozte committed on Dec 3, 2024

feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019)

c7e59ef

PABannier and Diego Devesa committed on Dec 2, 2024

metal : add `GGML_OP_CONV_TRANSPOSE_1D` kernels (ggml/1026)

9c845f4

PABannier committed on Nov 28, 2024

Do not include arm_neon.h when compiling CUDA code (ggml/1028)

80663f4

Frankie Robertson committed on Nov 26, 2024

ggml-opt: fix data corruption (ggml/1022)

a916e92

JohannesGaessler committed on Nov 20, 2024

ruby : Add low-level methods to transcribe (#2585)

4bf69ed
unverified

KitaitiMakoto committed on Nov 28, 2024

models : add `q8_0` models to `download-ggml-model.sh` (#2589)

7feeb43
unverified

mikey-rrr committed on Nov 28, 2024

ruby : Follow source tree change (#2580)

7895d75
unverified

KitaitiMakoto committed on Nov 21, 2024

whisper : use backend registry (#0)

b9f5e40

ggerganov committed on Nov 20, 2024

ggml/sched : do not skip views in pre-assignments

b1eba61

slaren committed on Nov 20, 2024

whisper : adapt to new ggml (wip)

ec6f374

ggerganov committed on Nov 19, 2024

talk-llama : sync llama.cpp

1568fc8

ggerganov committed on Nov 19, 2024

sync : ggml

e3c317a

ggerganov committed on Nov 19, 2024

ggml : sync resolve (skip) (#0)

d4d67dc

ggerganov committed on Nov 19, 2024

Add required ggml-base and backend libs to cmake pkg (llama/10407)

8fdd994

bandoti committed on Nov 19, 2024

cuda : fix CUDA_FLAGS not being applied (llama/10403)

22e1593

Diego Devesa committed on Nov 19, 2024

sycl : Add option to set the SYCL architecture for all targets (llama/10266)

0d836df

Romain Biessy committed on Nov 19, 2024

vulkan: Optimize soft_max (llama/10301)

5cb851d

jeffbolznv committed on Nov 19, 2024

sycl: Revert MUL_MAT_OP support changes (llama/10385)

6df9941

Alberto Cabrera Pérez committed on Nov 19, 2024

cuda : only use native when supported by cmake (llama/10389)

24d2e82

Diego Devesa committed on Nov 18, 2024

vulkan: remove use of null initializer (llama/10372)

dacdc69

jeffbolznv committed on Nov 18, 2024

metal : fox offset integer overflows in im2col (ggml/1015)

efbd100

pacominev committed on Nov 18, 2024

Vulkan: Fix device info output format specifiers (llama/10366)

8000df9

OccamRazor committed on Nov 18, 2024

metal : add `GGML_UNARY_OP_ELU` kernel (ggml/1018)

5959420

PABannier committed on Nov 18, 2024

CUDA: fix MMV kernel being used for FP16 src1 (llama/10357)

af4dff1

JohannesGaessler committed on Nov 17, 2024

CMake: fix typo in comment [no ci] (llama/10360)

d324d0b

JohannesGaessler committed on Nov 17, 2024

llama : only use default buffer types for the KV cache (llama/10358)

9e9c0ad

Diego Devesa committed on Nov 17, 2024

metal : refactor kernel args into structs (llama/10238)

15659b4

ggerganov committed on Nov 17, 2024

ggml : fix undefined reference to 'getcpu' (llama/10354)

2f9b147

FirstTimeEZ committed on Nov 17, 2024

CUDA: remove DMMV, consolidate F16 mult mat vec (llama/10318)

e446f60

JohannesGaessler committed on Nov 17, 2024

CMake: default to -arch=native for CUDA build (llama/10320)

66edfb6

JohannesGaessler committed on Nov 17, 2024

ggml : fix possible buffer use after free in sched reserve (llama/9930)

4703ea3

Diego Devesa committed on Nov 17, 2024

ggml : inttypes.h -> cinttypes (llama/0)

6ba2c8f

ggerganov committed on Nov 16, 2024

ggml : adapt AMX to tensor->grad removal (llama/0)

8a67e9f

ggerganov committed on Nov 16, 2024

ggml : fix compile warnings (llama/0)

80d6ec0

ggerganov committed on Nov 16, 2024

llamafile : fix include path (llama/0)

e443f89

ggerganov committed on Nov 16, 2024

vulkan: Optimize some mat-vec mul quant shaders (llama/10296)

dc0e685

jeffbolznv committed on Nov 16, 2024

ggml : optimize Q4_0 into Q4_0_X_Y repack (llama/10324)

abf6f22

Dan Johansson committed on Nov 16, 2024

Make updates to fix issues with clang-cl builds while using AVX512 flags (llama/10314)

2868c2b

Srihari-mcw committed on Nov 15, 2024

ggml: new optimization interface (ggml/988)

dd33ace

JohannesGaessler committed on Nov 16, 2024

ggml : remove duplicated sources from the last sync (ggml/1017)

026d20b

ggerganov committed on Nov 15, 2024

ggml : fix some build issues

c5ba1d1

slaren committed on Nov 15, 2024

sync : leftovers (ggml/0)

0f6c498

ggerganov committed on Nov 15, 2024

cmake : restore CMakeLists.txt (llama/10256)

51a70ff

ggerganov committed on Nov 15, 2024

AVX BF16 and single scale quant optimizations (llama/10212)

e6ffed3

Eve committed on Nov 15, 2024
