ggml : fix fallback to CPU for ununsupported ops (llama/15118) 2b7ae5e Diego Devesa committed on Aug 6
sched : fix multiple evaluations of the same graph with pipeline parallelism (llama/14855) e9f5612 Diego Devesa committed on Jul 25
vulkan: Add fusion support for RMS_NORM+MUL (llama/14366) 737f12d jeffbolznv slaren committed on Jun 29
sched : avoid changing cur_copy when a graph is already allocated (llama/13922) 1c0a5c0 Diego Devesa committed on May 30
ggml : allow CUDA graphs when using pipeline parallelism (llama/13814) b85e3c0 Diego Devesa committed on May 27
Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (llama/13386) 418769d David Huang committed on May 11
CUDA: fix logic for clearing padding with -ngl 0 (llama/13320) c3e51a2 JohannesGaessler committed on May 5
ggml : portability fixes for VS 2017 (llama/12150) 49e3343 mgroeber9110 Marcus Groeber committed on Mar 4
ggml : upgrade init_tensor API to return a ggml_status (llama/11854) d6b6852 William Tambellini slaren committed on Feb 28
ggml-backend : only offload from host buffers (fix) (llama/11124) 9ac3c7e Diego Devesa committed on Jan 7
ggml : improve inputs log sched_print_assignments (ggml/1053) 4427ede danbev committed on Dec 19, 2024
llama : only use default buffer types for the KV cache (llama/10358) 9e9c0ad Diego Devesa committed on Nov 17, 2024
ggml : fix possible buffer use after free in sched reserve (llama/9930) 4703ea3 Diego Devesa committed on Nov 17, 2024
ggml : tmp workaround for whisper.cpp (skip) (#2565) ef26f48 ggerganov committed on Nov 16, 2024
ggml : move CPU backend to a separate file (llama/10144) 0f447f2 Diego Devesa committed on Nov 3, 2024
llama : fix buffer checks for mamba and rwk (llama/10111) 9df9767 Diego Devesa committed on Oct 31, 2024
kompute: add backend registry / device interfaces (llama/10045) b612415 slpnix committed on Oct 30, 2024
llama : refactor model loader with backend registry (llama/10026) 582a21e Diego Devesa committed on Oct 30, 2024
Adapt to dynamically loadable backends mechanism (llama/9970) f8d4728 leo-pony committed on Oct 22, 2024
Add SYCL Backend registry, device and Event Interfaces (llama/9705) f35cae5 Ouadie EL FAROUKI committed on Oct 18, 2024
vulkan : add backend registry / device interfaces (llama/9721) df2cb6e Diego Devesa committed on Oct 17, 2024
fix: use `vm_allocate` to allocate CPU backend buffer on macOS (llama/9875) cf75979 Gilad S committed on Oct 16, 2024
ggml : move more prints to the ggml log system (llama/9839) 98d1a6a Diego Devesa committed on Oct 11, 2024
rpc : add backend registry / device interfaces (llama/9812) 4ac768e Diego Devesa committed on Oct 10, 2024
ggml : add backend registry / device interfaces to BLAS backend (llama/9752) 7f269bb Diego Devesa committed on Oct 7, 2024
ggml : add metal backend registry / device (llama/9713) b6adf19 ggerganov slaren committed on Oct 7, 2024
ggml-backend : add device and backend reg interfaces (llama/9707) 9d74d85 Diego Devesa committed on Oct 3, 2024
ggml-backend : add device and backend reg interfaces (llama/9707) 1bdb50a Diego Devesa JohannesGaessler committed on Oct 2, 2024