make/cmake: add missing force MMQ/cuBLAS for HIP (llama/8515) 5096c91 JohannesGaessler committed on Jul 16, 2024
Refactor lora adapter support (llama/8332) 76bcfc6 Xuan Son Nguyen, slaren, compilade committed on Jul 15, 2024
metal : template-ify some of the kernels (llama/8447) 3c3094f ggerganov committed on Jul 13, 2024
ggml : add NVPL BLAS support (ggml/8329) (llama/8425) 4816a87 ntukanov committed on Jul 11, 2024
cuda : suppress 'noreturn' warning in no_device_code (llama/8414) 13c1163 danbev committed on Jul 11, 2024
Use multi_ptr to clean up deprecated warnings (llama/8256) 6dbe297 AidanBeltonS committed on Jul 10, 2024
ggml : move sgemm sources to llamafile subfolder (llama/8394) 1554348 ggerganov committed on Jul 10, 2024
ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (llama/5780) 9509586 Dibakar Gope committed on Jul 10, 2024
sycl : re-enable mmvq path for the SYCL Nvidia backend (llama/8372) b969571 Alberto Cabrera Pérez committed on Jul 9, 2024
sycl : fix powf call in device code (llama/8368) 011fbfd Alberto Cabrera Pérez committed on Jul 8, 2024
ggml : loop tiling optimizations for scalar path (ggml/898) 1c4b0ca Mahesh Madhav committed on Jul 25, 2024
ggml : add support for float16 input tensors in pooling operations (ggml/895) 8248d8e Ivan Filipov committed on Jul 22, 2024
vulkan : initialize vk_buffer_struct members to VK_NULL_HANDLE (ggml/893) 8c409e3 Tony Wasserka committed on Jul 20, 2024
cmake : only enable GGML_NATIVE and x86 flags if not cross-compiling (ggml/885) 0456299 stanimirovb committed on Jul 12, 2024
whisper : use vulkan as gpu backend when available (#2302) 0755fa0 Matt Stephenson committed on Jul 16, 2024
cmake : use WHISPER_EXTRA_FLAGS (#2294) 81fa005 ggerganov committed on Jul 9, 2024
cmake : try to fix openvino build (#2281) 7b043ae ggerganov committed on Jul 8, 2024
cmake : remove install of llama convert script [no ci] (#2266) f73ff9a ggerganov committed on Jul 8, 2024
cmake : add GGML_BUILD and GGML_SHARED macro definitions (llama/8281) a8f9bda KafuuChino committed on Jul 5, 2024
Enabled more data types for oneMKL gemm_batch (llama/8236) 08501f8 Ouadie EL FAROUKI committed on Jul 5, 2024
CUDA: fix MMQ stream-k rounding if ne00 % 128 != 0 (llama/8311) 04d4209 JohannesGaessler committed on Jul 5, 2024
Replace get_work_group_size() with a local cache for performance (llama/8286) 08fd758 Neo Zhang Jianyu committed on Jul 5, 2024
Remove multiple newlines at the end of files that were breaking the editorconfig step of CI (llama/8258) cc49462 HanClinto committed on Jul 2, 2024
cuda : update supports_op for matrix multiplication (llama/8245) 2314334 slaren committed on Jul 2, 2024