Commit History
llama/ggml: add LLM training support (llama/10544)
8d3b3c1
CUDA: fix bad asserts for partial offload (llama/13337)
23e676b
ggml: move fp16/bf16 conversion optimizations to CPU backend + export conversion APIs (llama/13107)
c47823e
ggml : fix trailing whitespaces (llama/0)
5d27bbf
ggml : Depthwise 2D convolution (ggml/1152)
0c950d5
ggml : add bilinear upscale support (ggml/1185)
4c5e449
Diego Devesa
committed on
ggml : add more generic custom op, remove deprecated custom ops (ggml/1183)
ba7a5f8
Diego Devesa
committed on
llama : add option to override model tensor buffers (llama/11397)
3d000b6
Diego Devesa
committed on
metal : improve FA + improve MoE (llama/12612)
04a3389
llama: Add support for RWKV v7 architecture (llama/12412)
727de7e
ggml : ggml_compute_forward_concat() for arbitrary tensor type (ggml/1118)
c9a49f9
vmobilis
committed on
ggml : portability fixes for VS 2017 (llama/12150)
49e3343
Marcus Groeber (mgroeber9110)
committed on
ggml-cpu: Support s390x SIMD Instruction Set (llama/12019)
4aa54ec
Aaron Teo
Jinyang He
junchao-zhao
committed on
fix: typos in documentation files (llama/11791)
5c6d350
Maxim Evtush
committed on
CPU/CUDA: fix (GQA) mul mat back, add CUDA support (llama/11380)
855a9fe
CUDA: backwards pass for misc. ops, add tests (llama/11257)
2fbcec1
RoPE: fix back, CUDA support for back + noncont. (llama/11240)
131a21e
ggml : add option to not print stack on abort (ggml/1081)
9b2706e
William Tambellini
Diego Devesa
committed on
GGUF: C++ refactor, backend support, misc fixes (llama/11030)
21c5b64
tts : add OuteTTS support (llama/10784)
8d0f0ac
tests: add tests for GGUF (llama/10830)
e7722cb
ggml : add check for grad_accs (ggml/1046)
eacc95c
ggml : refactor online repacking (llama/10446)
163128e
ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)
154bbc0
ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)
bf73242
ggml : add support for dynamic loading of backends (llama/10469)
b73266f
cuda : optimize argmax (llama/10441)
69ae50d
ggml-opt: fix data corruption (ggml/1022)
a916e92
ggml : fix compile warnings (llama/0)
80d6ec0
ggml: new optimization interface (ggml/988)
dd33ace
ggml : fix some build issues
c5ba1d1
slaren
committed on
ggml : build backends as libraries (llama/10256)
3dc93f3
metal : optimize FA kernels (llama/10171)
44ff932
ggml : adjust is_first_call init value (llama/10193)
7e2b09b
ggml : fix arch check in bf16_to_fp32 (llama/10164)
09e4a9b
Diego Devesa
committed on
ggml : fix q4xx mat mul, increase ggml_aligned_malloc alignment (llama/10167)
ba20d5c
Diego Devesa
committed on
ggml : move CPU backend to a separate file (llama/10144)
0f447f2
Diego Devesa
committed on
ggml : remove ggml_scratch (llama/10121)
3f0b7ba
llama : fix buffer checks for mamba and rwk (llama/10111)
9df9767
Diego Devesa
committed on
ggml : check tensor name lengths in gguf files (llama/10100)
0b78224
Diego Devesa
committed on
ggml : fix memory leaks when loading invalid gguf files (llama/10094)
f9baffc
Diego Devesa
committed on
llama : refactor model loader with backend registry (llama/10026)
582a21e
Diego Devesa
committed on
CUDA: fix MMQ for non-contiguous src0, add tests (llama/10021)
bcbaad3
ggml : add asserts for type conversion in fattn kernels (llama/9971)
9542e42
add amx kernel for gemm (llama/8998)
db52137
fix: use `vm_allocate` to allocate CPU backend buffer on macOS (llama/9875)
cf75979
Gilad S
committed on