Commit History

ci : disable FreeBSD builds [no ci]
feddf3c

ggerganov committed

readme : update build instructions
d1e543b

ggerganov committed

ci : disable CUDA and Android builds
fcafd21

ggerganov committed

ci : disable Obj-C build + fixes
3859606

ggerganov committed

make : shim cmake
15c1d58

ggerganov committed

talk-llama : sync llama.cpp
5908a19

ggerganov committed

sync : ggml
00d464f

ggerganov committed

ggml : add predefined list of CPU backend variants to build (llama/10626)
1794b43

Diego Devesa committed

ggml-cpu : fix HWCAP2_I8MM value (llama/10646)
b3e6ea8

Diego Devesa committed

vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (llama/10642)
e9ee893

jeffbolznv committed
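
The "fast divide" named in the commit above is the classic trick of replacing integer division by a loop-invariant divisor with one multiply and one shift, using a precomputed "magic" multiplier. Below is a minimal host-side C sketch of the general technique, not the actual Vulkan shader code; the names fastdiv_t/fastdiv_init and the n < 2^31 input bound are illustrative assumptions:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

// Magic constants for dividing by a fixed divisor d with a multiply + shift.
typedef struct {
    uint64_t m; // multiplier: floor(2^(32+p) / d) + 1
    uint32_t p; // p = ceil(log2(d))
} fastdiv_t;

static fastdiv_t fastdiv_init(uint32_t d) {
    assert(d > 0 && d <= (1u << 31));
    uint32_t p = 0;
    while ((1ULL << p) < d) {
        p++; // p = ceil(log2(d))
    }
    fastdiv_t fd = { ((1ULL << (32 + p)) / d) + 1, p };
    return fd;
}

// Computes floor(n / d) for n < 2^31 (enough for typical tensor indices)
// without an integer division instruction.
static uint32_t fastdiv(uint32_t n, fastdiv_t fd) {
    return (uint32_t)(((uint64_t)n * fd.m) >> (32 + fd.p));
}

int main(void) {
    fastdiv_t fd = fastdiv_init(7);
    for (uint32_t n = 0; n < 1u << 20; n++) {
        assert(fastdiv(n, fd) == n / 7); // sanity check against real division
    }
    printf("%u\n", fastdiv(22, fd)); // prints 3
    return 0;
}
```

In a shader, the divisor (for example a tensor dimension) is typically uniform across the whole dispatch, so the magic constants can be computed once on the host and passed in, turning a per-element division into a cheap mul+shift.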

SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (llama/10584)
385f335

Nicolò Scipione committed

Avoid using __fp16 on ARM with old nvcc (llama/10616)
19743b6

Frankie Robertson committed

vulkan: optimize and reenable split_k (llama/10637)
bca95f5

jeffbolznv committed

ggml: add `GGML_SET` Metal kernel + i32 CPU kernel (ggml/1037)
dd775d5

PABannier committed

ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)
154bbc0

PABannier committed

files : remove make artifacts
d3e3ea1

ggerganov committed

common : fix compile warning
6a0d528

ggerganov committed

ggml : move AMX to the CPU backend (llama/10570)
3732429

Diego Devesa committed

metal : small-batch mat-mul kernels (llama/10581)
58b0822

ggerganov committed

SYCL: Fix and switch to GGML_LOG system instead of fprintf (llama/10579)
f083887

qnixsynapse committed

ggml-cpu: replace AArch64 NEON assembly with intrinsics in ggml_gemv_q4_0_4x4_q8_0() (llama/10567)
1c781a8

Adrien Gallouët committed

vulkan: Dynamic subgroup size support for Q6_K mat_vec (llama/10536)
59600b5

Eve committed

ggml : fix I8MM Q4_1 scaling factor conversion (llama/10562)
664be9a

ggerganov committed

ggml-cpu: fix typo in gemv/gemm iq4_nl_4_4 (llama/10580)
c7a861a

shupeif committed

sycl : offload of get_rows set to 0 (llama/10432)
47b6bff

Alberto Cabrera Pérez committed

sycl : Reroute permuted mul_mats through oneMKL (llama/10408)
af13def

Alberto Cabrera Pérez committed

CANN: RoPE operator optimization (llama/10563)
3ad7b0a

Chenguang Li (noemotiovon) committed

vulkan: get the first command buffer submitted sooner (llama/10499)
e1c1e73

jeffbolznv committed

ggml : remove redundant copyright notice + update authors
c78cdd7

ggerganov committed

ggml : fix row condition for i8mm kernels (llama/10561)
01c713f

ggerganov committed

cmake : fix ARM feature detection (llama/10543)
c04a34f

ggerganov committed

ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)
bf73242

shupeif committed

kompute : improve backend to pass test_backend_ops (llama/10542)
c8008b8

slpnix committed

CANN: Fix SOC_TYPE compile bug (llama/10519)
7f24ebb

leo-pony committed

CANN: RoPE operator optimization (llama/10540)
63ee002

Chenguang Li (noemotiovon) committed

Add some minimal optimizations for CDNA (llama/10498)
bf49bbe

uvos committed

metal : fix group_norm support condition (llama/0)
20ee62d

ggerganov committed

vulkan: define all quant data structures in types.comp (llama/10440)
cea89af

jeffbolznv committed

vulkan: Handle GPUs with less shared memory (llama/10468)
18a0ad1

jeffbolznv committed

vulkan: further optimize q5_k mul_mat_vec (llama/10479)
cb018d4

jeffbolznv committed

vulkan: skip integer div/mod in get_offsets for batch_idx==0 (llama/10506)
c6d15e0

jeffbolznv committed

vulkan: optimize Q2_K and Q3_K mul_mat_vec (llama/10459)
c032c06

jeffbolznv committed

mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (llama/10516)
f2a87fc

R0CKSTAR committed

vulkan: fix group_norm (llama/10496)
8f5eeb8

jeffbolznv committed

cmake : enable warnings in llama (llama/10474)
26a670b

ggerganov committed

ggml-cpu: cmake add arm64 cpu feature check for macos (llama/10487)
6d586a0

Charles Xu committed

CANN: Improve the Inferencing Performance for Ascend NPU Device (llama/10454)
f9fd6d6

Shanshan Shen and Frank Mai committed

CANN: RoPE and CONCAT operator optimization (llama/10488)
b357ea7

Chenguang Li (noemotiovon) committed

vulkan: Fix a vulkan-shaders-gen argument parsing error (llama/10484)
6a4b6ae

Sparkleholic committed

metal : enable mat-vec kernels for bs <= 4 (llama/10491)
6d07dee

ggerganov committed