Commits · Xenobd/whisper.cpp

metal : optimize MoE for large batches (llama/13388)

d51c0d3

ggerganov committed on May 13

llama/ggml: add LLM training support (llama/10544)

8d3b3c1

JohannesGaessler committed on May 12

CUDA: fix bad asserts for partial offload (llama/13337)

23e676b

JohannesGaessler committed on May 6

ggml: move fp16/bf16 conversion optimizations to CPU backend + export conversion APIs (llama/13107)

c47823e

sxx-404 committed on Apr 26

ggml : fix trailing whitespaces (llama/0)

5d27bbf

ggerganov committed on Apr 24

ggml : Depthwise 2D convolution (ggml/1152)

0c950d5

Acly committed on Apr 17

ggml : add bilinear upscale support (ggml/1185)

4c5e449

Diego Devesa committed on Apr 9

ggml : add more generic custom op, remove deprecated custom ops (ggml/1183)

ba7a5f8

Diego Devesa committed on Apr 9

llama : add option to override model tensor buffers (llama/11397)

3d000b6

Diego Devesa committed on Apr 2

metal : improve FA + improve MoE (llama/12612)

04a3389

ggerganov committed on Mar 28

llama: Add support for RWKV v7 architecture (llama/12412)

727de7e

mollysama committed on Mar 17

ggml : ggml_compute_forward_concat() for arbitrary tensor type (ggml/1118)

c9a49f9

vmobilis committed on Mar 7

ggml : portability fixes for VS 2017 (llama/12150)

49e3343

mgroeber9110 Marcus Groeber committed on Mar 4

ggml-cpu: Support s390x SIMD Instruction Set (llama/12019)

4aa54ec

Aaron Teo Jinyang He junchao-zhao committed on Feb 22

fix: typos in documentation files (llama/11791)

5c6d350

Maxim Evtush committed on Feb 10

CPU/CUDA: fix (GQA) mul mat back, add CUDA support (llama/11380)

855a9fe

JohannesGaessler committed on Jan 24

CUDA: backwards pass for misc. ops, add tests (llama/11257)

2fbcec1

JohannesGaessler committed on Jan 16

RoPE: fix back, CUDA support for back + noncont. (llama/11240)

131a21e

JohannesGaessler committed on Jan 15

ggml : add option to not print stack on abort (ggml/1081)

9b2706e

William Tambellini Diego Devesa committed on Jan 23

llama: add support for QRWKV6 model architecture (llama/11001)

4a6b7e0

mollysama, ggerganov, compilade committed on Jan 10

GGUF: C++ refactor, backend support, misc fixes (llama/11030)

21c5b64

JohannesGaessler committed on Jan 7

tts : add OuteTTS support (llama/10784)

8d0f0ac

ggerganov committed on Dec 18, 2024

tests: add tests for GGUF (llama/10830)

e7722cb

JohannesGaessler committed on Dec 17, 2024

llama : add Qwen2VL support + multimodal RoPE (llama/10361)

219d12b

RzZ, ggerganov committed on Dec 14, 2024

ggml : add check for grad_accs (ggml/1046)

eacc95c

danbev committed on Dec 13, 2024

ggml : refactor online repacking (llama/10446)

163128e

Djip007, ggerganov committed on Dec 7, 2024

ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)

154bbc0

PABannier committed on Dec 3, 2024

ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)

bf73242

shupeif committed on Nov 28, 2024

ggml : add support for dynamic loading of backends (llama/10469)

b73266f

Diego Devesa, ggerganov committed on Nov 25, 2024

cuda : optimize argmax (llama/10441)

69ae50d

Diego Devesa, JohannesGaessler committed on Nov 21, 2024

ggml-opt: fix data corruption (ggml/1022)

a916e92

JohannesGaessler committed on Nov 20, 2024

ggml : fix compile warnings (llama/0)

80d6ec0

ggerganov committed on Nov 16, 2024

ggml: new optimization interface (ggml/988)

dd33ace

JohannesGaessler committed on Nov 16, 2024

ggml : fix some build issues

c5ba1d1

slaren committed on Nov 15, 2024

ggml : build backends as libraries (llama/10256)

3dc93f3

Diego Devesa, ggerganov, R0CKSTAR committed on Nov 14, 2024

metal : optimize FA kernels (llama/10171)

44ff932

ggerganov committed on Nov 8, 2024

Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration (llama/10133)

f58e658

Zhiyuan Li

ggerganov Diego Devesa

pacominev Yuri Khrustalev Meng, Hengyu committed on Nov 7, 2024

ggml : adjust is_first_call init value (llama/10193)

7e2b09b

ggerganov committed on Nov 6, 2024

ggml : fix arch check in bf16_to_fp32 (llama/10164)

09e4a9b

Diego Devesa committed on Nov 4, 2024

ggml : fix q4xx mat mul, increase ggml_aligned_malloc alignment (llama/10167)

ba20d5c

Diego Devesa committed on Nov 4, 2024

ggml : move CPU backend to a separate file (llama/10144)

0f447f2

Diego Devesa committed on Nov 3, 2024

ggml : remove ggml_scratch (llama/10121)

3f0b7ba

ggerganov committed on Nov 1, 2024

llama : fix buffer checks for mamba and rwk (llama/10111)

9df9767

Diego Devesa committed on Oct 31, 2024

ggml : check tensor name lengths in gguf files (llama/10100)

0b78224

Diego Devesa committed on Oct 31, 2024

ggml : fix memory leaks when loading invalid gguf files (llama/10094)

f9baffc

Diego Devesa committed on Oct 30, 2024

llama : refactor model loader with backend registry (llama/10026)

582a21e

Diego Devesa committed on Oct 30, 2024

CUDA: fix MMQ for non-contiguous src0, add tests (llama/10021)

bcbaad3

JohannesGaessler committed on Oct 24, 2024

ggml : add asserts for type conversion in fattn kernels (llama/9971)

9542e42

ggerganov committed on Oct 21, 2024

add amx kernel for gemm (llama/8998)

db52137

mingfeima committed on Oct 18, 2024

fix: use `vm_allocate` to allocate CPU backend buffer on macOS (llama/9875)

cf75979

Gilad S committed on Oct 16, 2024
