Commit History

cuda : optimize argmax (llama/10441)
69ae50d

Diego Devesa and JohannesGaessler committed

vulkan: predicate max operation in soft_max shaders (llama/10437)
0a14325

jeffbolznv committed

vulkan: copy iq4_nl LUT into shared memory (llama/10409)
c31abdb

jeffbolznv committed

vulkan: further optimize mul_mat_vec using larger loads (llama/10387)
50a2978

jeffbolznv committed

add cmake rvv support (llama/10411)
e0bf47c

haopeng committed

CUDA: remove unnecessary warp reduce in FA (ggml/1032)
9a8c238

mahorozte committed

feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019)
c7e59ef

PABannier and Diego Devesa committed

metal : add `GGML_OP_CONV_TRANSPOSE_1D` kernels (ggml/1026)
9c845f4

PABannier committed

Do not include arm_neon.h when compiling CUDA code (ggml/1028)
80663f4

Frankie Robertson committed

ggml-opt: fix data corruption (ggml/1022)
a916e92

JohannesGaessler committed

ruby : Add low-level methods to transcribe (#2585)
4bf69ed

KitaitiMakoto committed

models : add `q8_0` models to `download-ggml-model.sh` (#2589)
7feeb43

mikey-rrr committed

ruby : Follow source tree change (#2580)
7895d75

KitaitiMakoto committed

whisper : use backend registry (#0)
b9f5e40

ggerganov committed

ggml/sched : do not skip views in pre-assignments
b1eba61

slaren committed

whisper : adapt to new ggml (wip)
ec6f374

ggerganov committed

talk-llama : sync llama.cpp
1568fc8

ggerganov committed

sync : ggml
e3c317a

ggerganov committed

ggml : sync resolve (skip) (#0)
d4d67dc

ggerganov committed

Add required ggml-base and backend libs to cmake pkg (llama/10407)
8fdd994

bandoti committed

cuda : fix CUDA_FLAGS not being applied (llama/10403)
22e1593

Diego Devesa committed

sycl : Add option to set the SYCL architecture for all targets (llama/10266)
0d836df

Romain Biessy committed

vulkan: Optimize soft_max (llama/10301)
5cb851d

jeffbolznv committed

sycl: Revert MUL_MAT_OP support changes (llama/10385)
6df9941

Alberto Cabrera Pérez committed

cuda : only use native when supported by cmake (llama/10389)
24d2e82

Diego Devesa committed

vulkan: remove use of null initializer (llama/10372)
dacdc69

jeffbolznv committed

metal : fix offset integer overflows in im2col (ggml/1015)
efbd100

pacominev committed

Vulkan: Fix device info output format specifiers (llama/10366)
8000df9

OccamRazor committed

metal : add `GGML_UNARY_OP_ELU` kernel (ggml/1018)
5959420

PABannier committed

CUDA: fix MMV kernel being used for FP16 src1 (llama/10357)
af4dff1

JohannesGaessler committed

CMake: fix typo in comment [no ci] (llama/10360)
d324d0b

JohannesGaessler committed

llama : only use default buffer types for the KV cache (llama/10358)
9e9c0ad

Diego Devesa committed

metal : refactor kernel args into structs (llama/10238)
15659b4

ggerganov committed

ggml : fix undefined reference to 'getcpu' (llama/10354)
2f9b147

FirstTimeEZ committed

CUDA: remove DMMV, consolidate F16 mult mat vec (llama/10318)
e446f60

JohannesGaessler committed

CMake: default to -arch=native for CUDA build (llama/10320)
66edfb6

JohannesGaessler committed

ggml : fix possible buffer use after free in sched reserve (llama/9930)
4703ea3

Diego Devesa committed

ggml : inttypes.h -> cinttypes (llama/0)
6ba2c8f

ggerganov committed

ggml : adapt AMX to tensor->grad removal (llama/0)
8a67e9f

ggerganov committed

ggml : fix compile warnings (llama/0)
80d6ec0

ggerganov committed

llamafile : fix include path (llama/0)
e443f89

ggerganov committed

vulkan: Optimize some mat-vec mul quant shaders (llama/10296)
dc0e685

jeffbolznv committed

ggml : optimize Q4_0 into Q4_0_X_Y repack (llama/10324)
abf6f22

Dan Johansson committed

Make updates to fix issues with clang-cl builds while using AVX512 flags (llama/10314)
2868c2b

Srihari-mcw committed

ggml: new optimization interface (ggml/988)
dd33ace

JohannesGaessler committed

ggml : remove duplicated sources from the last sync (ggml/1017)
026d20b

ggerganov committed

ggml : fix some build issues
c5ba1d1

slaren committed

sync : leftovers (ggml/0)
0f6c498

ggerganov committed

cmake : restore CMakeLists.txt (llama/10256)
51a70ff

ggerganov committed

AVX BF16 and single scale quant optimizations (llama/10212)
e6ffed3

Eve committed