Commits · Xenobd/whisper.cpp

llama/ggml: add LLM training support (llama/10544)

8d3b3c1

JohannesGaessler committed on May 12, 2025

CUDA: fix bad asserts for partial offload (llama/13337)

23e676b

JohannesGaessler committed on May 6, 2025

CUDA: fix q_nope_absorbed prec for DS 2 Lite f16 (llama/13137)

e9c9d4b

JohannesGaessler committed on Apr 28, 2025

ggml : Depthwise 2D convolution (ggml/1152)

0c950d5

Acly committed on Apr 17, 2025

ggml : add bilinear upscale support (ggml/1185)

4c5e449

Diego Devesa committed on Apr 9, 2025

ggml : add more generic custom op, remove deprecated custom ops (ggml/1183)

ba7a5f8

Diego Devesa committed on Apr 9, 2025

metal : improve FA + improve MoE (llama/12612)

04a3389

ggerganov committed on Mar 28, 2025

llama: Add support for RWKV v7 architecture (llama/12412)

727de7e

mollysama committed on Mar 17, 2025

ggml : portability fixes for VS 2017 (llama/12150)

49e3343

mgroeber9110 Marcus Groeber committed on Mar 4, 2025

cleanup: fix compile warnings associated with gnu_printf (llama/11811)

ef6a968

bandoti committed on Feb 12, 2025

CUDA: use mma PTX instructions for FlashAttention (llama/11583)

f328957

JohannesGaessler Diego Devesa committed on Feb 2, 2025

CUDA: backwards pass for misc. ops, add tests (llama/11257)

2fbcec1

JohannesGaessler committed on Jan 16, 2025

RoPE: fix back, CUDA support for back + noncont. (llama/11240)

131a21e

JohannesGaessler committed on Jan 15, 2025

llama: add support for QRWKV6 model architecture (llama/11001)

4a6b7e0

mollysama

ggerganov

compilade committed on Jan 10, 2025

GGUF: C++ refactor, backend support, misc fixes (llama/11030)

21c5b64

JohannesGaessler committed on Jan 7, 2025

tts : add OuteTTS support (llama/10784)

8d0f0ac

ggerganov committed on Dec 18, 2024

llama : add Qwen2VL support + multimodal RoPE (llama/10361)

219d12b

RzZ

ggerganov committed on Dec 14, 2024

ggml : refactor online repacking (llama/10446)

163128e

Djip007

ggerganov committed on Dec 7, 2024

ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)

154bbc0

PABannier committed on Dec 3, 2024

ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)

bf73242

shupeif committed on Nov 28, 2024

ggml : add support for dynamic loading of backends (llama/10469)

b73266f

Diego Devesa

ggerganov committed on Nov 25, 2024

ggml: new optimization interface (ggml/988)

dd33ace

JohannesGaessler committed on Nov 16, 2024

ggml : build backends as libraries (llama/10256)

3dc93f3

Diego Devesa

ggerganov R0CKSTAR committed on Nov 14, 2024

metal : optimize FA kernels (llama/10171)

44ff932

ggerganov committed on Nov 8, 2024

Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration (llama/10133)

f58e658

Zhiyuan Li

ggerganov Diego Devesa

pacominev Yuri Khrustalev Meng, Hengyu committed on Nov 7, 2024

ggml : move CPU backend to a separate file (llama/10144)

0f447f2

Diego Devesa committed on Nov 3, 2024

llama : add simple-chat example (llama/10124)

41ff26f

Diego Devesa Xuan Son Nguyen committed on Nov 1, 2024

ggml : remove ggml_scratch (llama/10121)

3f0b7ba

ggerganov committed on Nov 1, 2024

add amx kernel for gemm (llama/8998)

db52137

mingfeima committed on Oct 18, 2024

ggml : fix BLAS with unsupported types (llama/9775)

0a93e1b

Diego Devesa committed on Oct 8, 2024

ggml : alloc ggml_contexts on the heap (#2525)

3ccf40a
unverified

ggerganov committed on Oct 31, 2024

ggml-backend : add device and backend reg interfaces (llama/9707)

9d74d85

Diego Devesa committed on Oct 3, 2024

ggml-backend : add device and backend reg interfaces (llama/9707)

1bdb50a

Diego Devesa

JohannesGaessler committed on Oct 2, 2024

ggml/ex: calculate accuracy in graph, adapt MNIST (ggml/980)

52069b8

JohannesGaessler committed on Oct 3, 2024

test: fix OPT_STEP_ADAMW for test-backend-ops (ggml/974)

76aa810

JohannesGaessler committed on Sep 30, 2024

ggml: fix gradient allocation logic (ggml/966)

ad3f29d

JohannesGaessler committed on Sep 29, 2024

ggml : add run-time detection of neon, i8mm and sve (llama/9331)

12c0e23

Dan Johansson committed on Sep 28, 2024

ggml : fix GGML_MAX_N_THREADS + improve formatting (ggml/969)

ad34655

ggerganov committed on Sep 24, 2024

log : add CONT level for continuing previous log entry (llama/9610)

a29a4c5

ggerganov committed on Sep 24, 2024

examples : adapt to ggml.h changes (ggml/0)

91c7734

ggerganov committed on Sep 20, 2024

ggml : refactoring (llama/#0)

1b62c96

ggerganov committed on Sep 20, 2024

common : reimplement logging (llama/9418)

e893c97

ggerganov committed on Sep 15, 2024

riscv : modify Makefile and add a RISCV_VECT to print log info (llama/9442)

f77ad34

Ahmad Tameem committed on Sep 12, 2024

ggml/examples: add backend support for numerical optimization (ggml/949)

5c178b0

JohannesGaessler

ggerganov slaren committed on Sep 20, 2024

ggml-quants : ternary packing for TriLMs and BitNet b1.58 (llama/8151)

d1c244a

compilade committed on Sep 6, 2024

llama : support RWKV v6 models (llama/8980)

bd4f5ec

mollysama Layl Bongers

compilade

ggerganov committed on Sep 1, 2024

Threadpool: take 2 (llama/8672)

e3e9ca4

Faisal Zaghloul Max Krasnyansky

quic-fzaghlou Max Krasnyansky slaren committed on Aug 29, 2024

tests: add gradient tests for all backends (ggml/932)

4751b2f

JohannesGaessler committed on Sep 3, 2024

ggml: fix ggml_graph_cpy undefined behavior (ggml/943)

9202e70

JohannesGaessler committed on Aug 31, 2024

CPU/CUDA: Gemma 2 FlashAttention support (llama/8542)

fb8ae8b

JohannesGaessler committed on Aug 24, 2024
