Commits · Xenobd/whisper.cpp

vulkan: Make Vulkan optional at runtime (ggml/11493). (llama/11494)

762f497

Danny Milosavljevic jeffbolznv committed on Feb 10, 2025

CUDA: use mma PTX instructions for FlashAttention (llama/11583)

f328957

JohannesGaessler Diego Devesa committed on Feb 2, 2025

rpc : early register backend devices (llama/11262)

4134077

rgerganov committed on Jan 17, 2025

CUDA: backwards pass for misc. ops, add tests (llama/11257)

2fbcec1

JohannesGaessler committed on Jan 16, 2025

RoPE: fix back, CUDA support for back + noncont. (llama/11240)

131a21e

JohannesGaessler committed on Jan 15, 2025

GGUF: C++ refactor, backend support, misc fixes (skip) (llama/11030)

92311a3

JohannesGaessler committed on Jan 14, 2025

llama: add support for QRWKV6 model architecture (llama/11001)

4a6b7e0

mollysama ggerganov compilade committed on Jan 10, 2025

GGUF: C++ refactor, backend support, misc fixes (llama/11030)

21c5b64

JohannesGaessler committed on Jan 7, 2025

tts : add OuteTTS support (llama/10784)

8d0f0ac

ggerganov committed on Dec 18, 2024

llama : add Qwen2VL support + multimodal RoPE (llama/10361)

219d12b

RzZ ggerganov committed on Dec 14, 2024

Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (llama/10693)

83a0899

lhez Skyler Szot Shangqing Gu Alexander Angus Hongqiang Wang Max Krasnyansky committed on Dec 13, 2024

ggml: load all backends from a user-provided search path (llama/10699)

c6de218

Gilad S Diego Devesa committed on Dec 11, 2024

ggml : refactor online repacking (llama/10446)

163128e

Djip007 ggerganov committed on Dec 7, 2024

ggml : remove old files (skip) (#0)

6284570
unverified

ggerganov committed on Dec 8, 2024

ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)

154bbc0

PABannier committed on Dec 3, 2024

ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)

bf73242

shupeif committed on Nov 28, 2024

ggml : add support for dynamic loading of backends (llama/10469)

b73266f

Diego Devesa ggerganov committed on Nov 25, 2024

ggml: new optimization interface (ggml/988)

dd33ace

JohannesGaessler committed on Nov 16, 2024

backend cpu: add online flow for aarch64 Q4_0 GEMV/GEMM kernels (llama/9921)

3541ee8

Charles Xu Diego Devesa committed on Nov 15, 2024

ggml : build backends as libraries (llama/10256)

3dc93f3

Diego Devesa ggerganov R0CKSTAR committed on Nov 14, 2024

metal : optimize FA kernels (llama/10171)

44ff932

ggerganov committed on Nov 8, 2024

Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration (llama/10133)

f58e658

Zhiyuan Li ggerganov Diego Devesa pacominev Yuri Khrustalev Meng, Hengyu committed on Nov 7, 2024

ggml : move CPU backend to a separate file (llama/10144)

0f447f2

Diego Devesa committed on Nov 3, 2024

llama : add simple-chat example (llama/10124)

41ff26f

Diego Devesa Xuan Son Nguyen committed on Nov 1, 2024

llama : use smart pointers for ggml resources (llama/10117)

6b82135

Diego Devesa committed on Nov 1, 2024

ggml : remove ggml_scratch (llama/10121)

3f0b7ba

ggerganov committed on Nov 1, 2024

kompute: add backend registry / device interfaces (llama/10045)

b612415

slpnix committed on Oct 30, 2024

llama : refactor model loader with backend registry (llama/10026)

582a21e

Diego Devesa committed on Oct 30, 2024

ggml : add AMX backend (llama/8998)

1152a79

mingfeima committed on Oct 26, 2024

Adapt to dynamically loadable backends mechanism (llama/9970)

f8d4728

leo-pony committed on Oct 22, 2024

Add SYCL Backend registry, device and Event Interfaces (llama/9705)

f35cae5

Ouadie EL FAROUKI committed on Oct 18, 2024

add amx kernel for gemm (llama/8998)

db52137

mingfeima committed on Oct 18, 2024

vulkan : add backend registry / device interfaces (llama/9721)

df2cb6e

Diego Devesa committed on Oct 17, 2024

rpc : add backend registry / device interfaces (llama/9812)

4ac768e

Diego Devesa committed on Oct 10, 2024

ggml : fix BLAS with unsupported types (llama/9775)

0a93e1b

Diego Devesa committed on Oct 8, 2024

ggml : add backend registry / device interfaces to BLAS backend (llama/9752)

7f269bb

Diego Devesa committed on Oct 7, 2024

ggml : add metal backend registry / device (llama/9713)

b6adf19

ggerganov slaren committed on Oct 7, 2024

ggml : alloc ggml_contexts on the heap (#2525)

3ccf40a
unverified

ggerganov committed on Oct 31, 2024

ggml : fix typo in example usage ggml_gallocr_new (ggml/984)

30a097b

danbev committed on Oct 4, 2024

ggml-backend : add device and backend reg interfaces (llama/9707)

9d74d85

Diego Devesa committed on Oct 3, 2024

ggml-backend : add device and backend reg interfaces (llama/9707)

1bdb50a

Diego Devesa JohannesGaessler committed on Oct 2, 2024

ggml/ex: calculate accuracy in graph, adapt MNIST (ggml/980)

52069b8

JohannesGaessler committed on Oct 3, 2024

ggml: refactor cross entropy loss CPU impl. (ggml/976)

2a0805f

JohannesGaessler committed on Oct 2, 2024

metal : reduce command encoding overhead (llama/9698)

43d5a06

ggerganov committed on Oct 2, 2024

test: fix OPT_STEP_ADAMW for test-backend-ops (ggml/974)

76aa810

JohannesGaessler committed on Sep 30, 2024

ggml: fix gradient allocation logic (ggml/966)

ad3f29d

JohannesGaessler committed on Sep 29, 2024

ggml : add run-time detection of neon, i8mm and sve (llama/9331)

12c0e23

Dan Johansson committed on Sep 28, 2024

ggml : fix GGML_MAX_N_THREADS + improve formatting (ggml/969)

ad34655

ggerganov committed on Sep 24, 2024

log : add CONT level for continuing previous log entry (llama/9610)

a29a4c5

ggerganov committed on Sep 24, 2024

examples : adapt to ggml.h changes (ggml/0)

91c7734

ggerganov committed on Sep 20, 2024
