Commits · natasa365/whisper.cpp

ggml : fix fallback to CPU for unsupported ops (llama/15118)

2b7ae5e

Diego Devesa committed on Aug 6

sched : fix multiple evaluations of the same graph with pipeline parallelism (llama/14855)

e9f5612

Diego Devesa committed on Jul 25

metal : fuse add, mul + add tests (llama/14596)

66ae493

ggerganov committed on Jul 18

vulkan: Add fusion support for RMS_NORM+MUL (llama/14366)

737f12d

jeffbolznv slaren committed on Jun 29

sched : avoid changing cur_copy when a graph is already allocated (llama/13922)

1c0a5c0

Diego Devesa committed on May 30

ggml : allow CUDA graphs when using pipeline parallelism (llama/13814)

b85e3c0

Diego Devesa committed on May 27

llama/ggml: add LLM training support (llama/10544)

8d3b3c1

JohannesGaessler committed on May 12

Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (llama/13386)

418769d

David Huang committed on May 11

CUDA: fix logic for clearing padding with -ngl 0 (llama/13320)

c3e51a2

JohannesGaessler committed on May 5

ggml : portability fixes for VS 2017 (llama/12150)

49e3343

mgroeber9110 Marcus Groeber committed on Mar 4

ggml : upgrade init_tensor API to return a ggml_status (llama/11854)

d6b6852

William Tambellini slaren committed on Feb 28

ggml-backend : only offload from host buffers (fix) (llama/11124)

9ac3c7e

Diego Devesa committed on Jan 7

ggml-backend : only offload from host buffers (llama/11120)

1ca87a8

Diego Devesa committed on Jan 7

ggml : improve inputs log sched_print_assignments (ggml/1053)

4427ede

danbev committed on Dec 19, 2024

ggml : move AMX to the CPU backend (llama/10570)

3732429

Diego Devesa committed on Dec 3, 2024

ggml-opt: fix data corruption (ggml/1022)

a916e92

JohannesGaessler committed on Nov 20, 2024

ggml/sched : do not skip views in pre-assignments

b1eba61

slaren committed on Nov 20, 2024

ggml : sync resolve (skip) (#0)

d4d67dc

ggerganov committed on Nov 19, 2024

llama : only use default buffer types for the KV cache (llama/10358)

9e9c0ad

Diego Devesa committed on Nov 17, 2024

ggml : fix possible buffer use after free in sched reserve (llama/9930)

4703ea3

Diego Devesa committed on Nov 17, 2024

ggml: new optimization interface (ggml/988)

dd33ace

JohannesGaessler committed on Nov 16, 2024

ggml : tmp workaround for whisper.cpp (skip) (#2565)

ef26f48

ggerganov committed on Nov 16, 2024

ggml : move CPU backend to a separate file (llama/10144)

0f447f2

Diego Devesa committed on Nov 3, 2024

llama : fix buffer checks for mamba and rwk (llama/10111)

9df9767

Diego Devesa committed on Oct 31, 2024

kompute: add backend registry / device interfaces (llama/10045)

b612415

slpnix committed on Oct 30, 2024

llama : refactor model loader with backend registry (llama/10026)

582a21e

Diego Devesa committed on Oct 30, 2024

Adapt to dynamically loadable backends mechanism (llama/9970)

f8d4728

leo-pony committed on Oct 22, 2024

Add SYCL Backend registry, device and Event Interfaces (llama/9705)

f35cae5

Ouadie EL FAROUKI committed on Oct 18, 2024

add amx kernel for gemm (llama/8998)

db52137

mingfeima committed on Oct 18, 2024

vulkan : add backend registry / device interfaces (llama/9721)

df2cb6e

Diego Devesa committed on Oct 17, 2024

fix: allocating CPU buffer with size `0` (llama/9917)

ae9a15f

Gilad S committed on Oct 16, 2024

fix: use `vm_allocate` to allocate CPU backend buffer on macOS (llama/9875)

cf75979

Gilad S committed on Oct 16, 2024

ggml : move more prints to the ggml log system (llama/9839)

98d1a6a

Diego Devesa committed on Oct 11, 2024

rpc : add backend registry / device interfaces (llama/9812)

4ac768e

Diego Devesa committed on Oct 10, 2024

ggml : fix BLAS with unsupported types (llama/9775)

0a93e1b

Diego Devesa committed on Oct 8, 2024

ggml : add backend registry / device interfaces to BLAS backend (llama/9752)

7f269bb

Diego Devesa committed on Oct 7, 2024

ggml : add metal backend registry / device (llama/9713)

b6adf19

ggerganov slaren committed on Oct 7, 2024

ggml-backend : add device and backend reg interfaces (llama/9707)

9d74d85

Diego Devesa committed on Oct 3, 2024

ggml-backend : add device and backend reg interfaces (llama/9707)

1bdb50a

Diego Devesa JohannesGaessler committed on Oct 2, 2024
