Commit History
ggml : remove obsolete alibi code (skipme) (#0)
d25c1e3
ggml : full ALiBi support (llama/7192)
192bda4
CUDA: generalize FP16 fattn vec kernel (llama/7061)
ca79691
Introduction of CUDA Graphs to LLama.cpp (llama/6766)
08fc76d
agray3
slaren
committed on
CUDA: CUDART < 11.7 workaround for __hmax, __hmax2 (llama/7019)
4cf786d
ggml : add Flash Attention (llama/5021)
34d3b03
Fix more int overflow during quant (PPL/CUDA). (llama/6563)
531387f
ggml : group all experts in a single ggml_mul_mat_id (llama/6505)
f0b5c67
feat: implemented sigmoid function (ggml/806)
cd0c122
Justina Cho
committed on