Commit History
Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (llama/13386)
418769d
David Huang
committed on
CUDA: fix logic for clearing padding with -ngl 0 (llama/13320)
c3e51a2
ggml : upgrade init_tensor API to return a ggml_status (llama/11854)
d6b6852
William Tambellini
slaren
committed on
rpc : early register backend devices (llama/11262)
4134077
ggml: load all backends from a user-provided search path (llama/10699)
c6de218
Gilad S
Diego Devesa
committed on
ggml : add support for dynamic loading of backends (llama/10469)
b73266f
ggml: new optimization interface (ggml/988)
dd33ace
ggml : build backends as libraries (llama/10256)
3dc93f3
ggml : move CPU backend to a separate file (llama/10144)
0f447f2
Diego Devesa
committed on
llama : refactor model loader with backend registry (llama/10026)
582a21e
Diego Devesa
committed on
ggml : add backend registry / device interfaces to BLAS backend (llama/9752)
7f269bb
Diego Devesa
committed on
ggml : add metal backend registry / device (llama/9713)
b6adf19
ggml-backend : add device and backend reg interfaces (llama/9707)
9d74d85
Diego Devesa
committed on