Commit History
CUDA: fix bad asserts for partial offload (llama/13337)
23e676b
CUDA: fix q_nope_absorbed prec for DS 2 Lite f16 (llama/13137)
e9c9d4b
ggml : Depthwise 2D convolution (ggml/1152)
0c950d5
ggml : add bilinear upscale support (ggml/1185)
4c5e449
Diego Devesa
ggml : add more generic custom op, remove deprecated custom ops (ggml/1183)
ba7a5f8
Diego Devesa
metal : improve FA + improve MoE (llama/12612)
04a3389
llama: Add support for RWKV v7 architecture (llama/12412)
727de7e
ggml : portability fixes for VS 2017 (llama/12150)
49e3343
mgroeber9110
Marcus Groeber
cleanup: fix compile warnings associated with gnu_printf (llama/11811)
ef6a968
bandoti
CUDA: use mma PTX instructions for FlashAttention (llama/11583)
f328957
CUDA: backwards pass for misc. ops, add tests (llama/11257)
2fbcec1
RoPE: fix back, CUDA support for back + noncont. (llama/11240)
131a21e
GGUF: C++ refactor, backend support, misc fixes (llama/11030)
21c5b64
tts : add OuteTTS support (llama/10784)
8d0f0ac
ggml : refactor online repacking (llama/10446)
163128e
ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)
154bbc0
ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)
bf73242
ggml : add support for dynamic loading of backends (llama/10469)
b73266f
ggml: new optimization interface (ggml/988)
dd33ace
ggml : build backends as libraries (llama/10256)
3dc93f3
metal : optimize FA kernels (llama/10171)
44ff932
ggml : move CPU backend to a separate file (llama/10144)
0f447f2
Diego Devesa
llama : add simple-chat example (llama/10124)
41ff26f
Diego Devesa
Xuan Son Nguyen
ggml : remove ggml_scratch (llama/10121)
3f0b7ba
add amx kernel for gemm (llama/8998)
db52137
ggml : fix BLAS with unsupported types (llama/9775)
0a93e1b
Diego Devesa
ggml : alloc ggml_contexts on the heap (#2525)
3ccf40a
ggml-backend : add device and backend reg interfaces (llama/9707)
9d74d85
Diego Devesa
ggml-backend : add device and backend reg interfaces (llama/9707)
1bdb50a
ggml/ex: calculate accuracy in graph, adapt MNIST (ggml/980)
52069b8
test: fix OPT_STEP_ADAMW for test-backend-ops (ggml/974)
76aa810
ggml: fix gradient allocation logic (ggml/966)
ad3f29d
ggml : add run-time detection of neon, i8mm and sve (llama/9331)
12c0e23
Dan Johansson
ggml : fix GGML_MAX_N_THREADS + improve formatting (ggml/969)
ad34655
log : add CONT level for continuing previous log entry (llama/9610)
a29a4c5
examples : adapt to ggml.h changes (ggml/0)
91c7734
ggml : refactoring (llama/#0)
1b62c96
common : reimplement logging (llama/9418)
e893c97
riscv : modify Makefile and add a RISCV_VECT to print log info (llama/9442)
f77ad34
Ahmad Tameem