ggml-quants : ternary packing for TriLMs and BitNet b1.58 (llama/8151) d1c244a compilade committed on Sep 6, 2024
llama : support RWKV v6 models (llama/8980) bd4f5ec mollysama Layl Bongers compilade ggerganov committed on Sep 1, 2024
Threadpool: take 2 (llama/8672) e3e9ca4 Faisal Zaghloul quic-fzaghlou Max Krasnyansky slaren committed on Aug 29, 2024
tests: add gradient tests for all backends (ggml/932) 4751b2f JohannesGaessler committed on Sep 3, 2024
docker : add libsdl2-dev for container builds (#2424) aa93432 unverified JohnnyB committed on Sep 20, 2024
go : add tests and update bindings (#2425) c80d17a unverified Stavros Panakakis committed on Sep 20, 2024
server : use OS-generated temp file name for converted files (#2419) 04d9c8d unverified teejae committed on Sep 17, 2024
cmake: Fix libdir value in pkgconfig file (#2407) a048ef3 unverified Philippe Normand committed on Sep 7, 2024
revert : cmake : set MSVC to use UTF-8 on source files (#2346) 5e9ff52 ggerganov committed on Sep 2, 2024
ggml: fix ggml_graph_cpy undefined behavior (ggml/943) 9202e70 JohannesGaessler committed on Aug 31, 2024
ggml : fix cont with transposed tensors when one dimension is 1 (ggml/934) 33c59fc smeso ggerganov committed on Aug 28, 2024
cmake : set MSVC to use UTF-8 on source files (#2346) 9b3df8e unverified Tim Miller committed on Aug 30, 2024
readme : remove invalid flag from Python example (#2396) 5372e8b unverified UsernamesLame committed on Aug 30, 2024
go : add beamsize/entropythold/maxcontext to context interface (#2350) 7efcda7 unverified hsinhoyeh committed on Aug 28, 2024
ggml : do not crash when quantizing q4_x_x with an imatrix (llama/9192) d64f932 slaren committed on Aug 26, 2024
metal : separate scale and mask from QKT in FA kernel (llama/9189) 90cc3cd ggerganov committed on Aug 26, 2024
CPU/CUDA: Gemma 2 FlashAttention support (llama/8542) fb8ae8b JohannesGaessler committed on Aug 24, 2024
llama : simplify Mamba with advanced batch splits (llama/8526) f1abcb4 compilade ggerganov committed on Aug 21, 2024
Fix SYCL `im2col` and `convert` Overflow with Large Dims (llama/9052) 5f43886 zhentaoyu committed on Aug 20, 2024
rpc : print error message when failed to connect endpoint (llama/9042) d54b156 rgerganov committed on Aug 19, 2024
ggml : dynamic ggml_sched_max_splits based on graph_size (llama/9047) e0dc1ad nicoboss committed on Aug 16, 2024