Commit History
vulkan: use uint array index to avoid glslang bug (llama/13193) fd2d86d
ggml : fix ppc64le build (llama/13176) 07ec79f
feat(ggml-cpu): enable z17 compile (llama/13182) 10f7d18
Aaron Teo committed on
CUDA: fix non-cont. inputs for batched mat mul (llama/13155) d13b876
fix(rpc): Improve input validation and error handling (llama/13069) 9e9f2fe
Ville Vesilehto committed on
SYCL: Add all missing unary kernels (llama/13074) d2ce872
Akarshan Biswas committed on
musa: fix typo in cc control (llama/13144) 5fb7320
R0CKSTAR committed on
CUDA: fix q_nope_absorbed prec for DS 2 Lite f16 (llama/13137) e9c9d4b
musa: fix build warning (llama/13129) 3436ba4
R0CKSTAR committed on
ggml: move fp16/bf16 conversion optimizations to CPU backend + export conversion APIs (llama/13107) c47823e
change the reorder tensor from init to execute OP (llama/13003) 8614863
Neo Zhang Jianyu committed on
rpc : do not wait for response when sending RPC_CMD_SET_TENSOR (llama/12943) 691c071
ggml : fix ggml_gallocr_ptr type (ggml/1205) cf46d5c
Diego Devesa committed on
whisper : add check that target name exists (#3103) 60ff3ed unverified
server : add --no-gpu option to print usage output (#3098) 1eb0f64 unverified
ruby : ignore "Downloading" output in test_log_suppress (#3106) fdb6c7e unverified
make : fix samples glob pattern (#3100) 0a9e5b1 unverified
ggml : suppress Windows compiler warnings (#3075) 887f7a2 unverified
whisper : fix grammar advance stack warning (#3087) e4a0565 unverified
examples : expose language detection probabilities to server example (#3044) 6b8d348 unverified
whisper : remove empty .gitmodules file [no ci] (#3085) aa54166 unverified
talk-llama : sync llama.cpp (#3084) 511930c unverified
ci : disable publishing of java binding [no ci] (#3086) 4b6e041 unverified
build : Add Moore Threads GPU support and update GitHub workflow for MUSA build (#3069) 8ede9a1 unverified
R0CKSTAR committed on
examples : fix deprecated FFmpeg functions (#3073) 0aa41e8 unverified
ruby : add encoder begin callback related methods (#3076) 855927b unverified
ci : enable bindings java job (#3070) 469f43c unverified
ruby : add cmake option (#0) 4a21ad6
cuda : fix unused variable compile warning (#0) a1f4201
sync : ggml 5222212
opencl : remove obsolete files (skip) (ggml/1200) adc6542
sync : ggml cac9245
opencl: split ggml-opencl.cl into multiple files and cleanup (llama/12886) 291a5b7
lhez Shangqing Gu committed on
ggml : fix trailing whitespaces (llama/0) 5d27bbf
CUDA: use switch statements in constexpr functions (llama/13095) f5cd546
metal : fix floating-point range of attention scores in FA kernels (llama/13090) e093044
vulkan: matmul gcn tuning (llama/13016) ac537d2
CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (llama/13014) 285a334
ggml : add SSE 4.2 and x64 base variant for CPUs without AVX (llama/12871) f8795d3
Diego Devesa committed on
SYCL: Add non-contiguous support in ROPE (llama/12993) a29a2c3
Akarshan Biswas committed on
vulkan: support noncontiguous rms_norm (llama/13031) e4d1f59
metal: add neg operator (llama/13029) 42283e1
SYCL: Refactor and enable FP16 in binary broadcast OPs (llama/12975) 1377b05
Akarshan Biswas committed on
rpc : add RPC_CMD_HELLO (llama/12955) ff22836
graph : make FA compatible with MLA + add initial Metal kernels (llama/12953) fb0d243
ggml: Re-enable CUDA graphs in presence of CONT and DUP nodes (llama/12970) 3944ae5
Alan Gray committed on
CANN: Add support for async operator submission (llama/12864) 1b9d0f0
opencl: fix incorrect local_size index in profiling log (llama/12868) 8f5d919
kimminsu committed on