Commit History

sync : ggml
cb5b2be
unverified

ggerganov committed on

ggml : resolve merge conflicts (ggml/0)
7ee6ffa
unverified

ggerganov committed on

common : add IQ1_S (ggml/0)
39c054e
unverified

ggerganov committed on

ci : enable -Werror for CUDA builds (llama/5579)
df03a10
unverified

ggerganov committed on

cuda, metal : fix nans in soft_max (llama/5574)
44164ac
unverified

slaren ggerganov committed on

ggml : android and old glibc NUMA incompatibility bugfixes (llama/5557)
0206c2d
unverified

bmwl root committed on

ggml : restore vec dot stride arg names (llama/5453)
de4041f
unverified

ggerganov committed on

ci : fix wikitext url + compile warnings (llama/5569)
49f0106
unverified

ggerganov committed on

metal : fix unused warnings (llama/0)
d12cda5
unverified

ggerganov committed on

ggml, common, examples, tests : fixed type arguments in printf (llama/5528)
2f3a004
unverified

germanaizek committed on

1.5 bit quantization (llama/5453)
9c3aa6a
unverified

Kawrakow ikawrakow committed on

ggml : add ALiBi support for ggml_soft_max_ext (llama/5488)
26c019a
unverified

ggerganov committed on

ci : add an option to fail on compile warning (llama/3952)
b5903fc
unverified

abastola ggerganov committed on

cmake : fix VULKAN and ROCm builds (llama/5525)
ae570e4
unverified

ggerganov committed on

cuda : print message when initialization fails (llama/5512)
1f047ca
unverified

slaren committed on

vulkan: Find optimal memory type but with fallback (llama/5381)
24e2319
unverified

lcfrs committed on

Early return for zero size calls to get_tensor. (llama/5482)
f1f5c00
unverified

AT ggerganov committed on

ggml-quants : fix compiler warnings (shadow variable) (llama/5472)
e538f25
unverified

Kawrakow ikawrakow committed on

ggml-sycl: Replace 3d ops with macro (llama/5458)
12970f1
unverified

Abhilash Majumder committed on

build : update CBLAS flags + fix unused var warning (#0)
496c0f1
unverified

ggerganov committed on

main : check if input files exist before proceeding (#1872)
d625238
unverified

Theldus committed on

examples : clean up common code (#1871)
da3cdf4
unverified

felrock committed on

models : fix openvino setup info (#1874)
7d4b654
unverified

Jumper775 committed on

models : add update py requirements
a60f965
unverified

ggerganov committed on

swift : package no longer use ggml dependency (#1861)
df6227e
unverified

ggerganov committed on

whisper : fix external encoder (#1860)
3538ca9
unverified

ggerganov committed on

sync : ggml
f0a0087
unverified

ggerganov committed on

ggml-alloc : allocate all leafs as if they were inputs (ggml/731)
a512417
unverified

slaren committed on

talk-llama : sync llama.cpp
aa42df9
unverified

ggerganov committed on

sync : ggml
be7d266
unverified

ggerganov committed on

ggml-backend : sync remnant
3f5165f
unverified

ggerganov committed on

CUDA: mul_mat_vec_q tiling, refactor mul mat logic (llama/5434)
c0cfa9b
unverified

JohannesGaessler slaren committed on

vulkan: only use M-sized matmul on Apple GPUs (llama/5412)
350284e
unverified

Sergio López committed on

ggml : fix compile warnings (unused vars) (llama/4966)
97fa2e3
unverified

ggerganov committed on

ggml : add mmla kernels for quantized GEMM (llama/4966)
0d50a29
unverified

snadampal committed on

metal : use autoreleasepool to avoid memory leaks (llama/5437)
c276f12
unverified

irbull committed on

ggml-alloc : v3 (ggml/727)
5cffd6f
unverified

slaren committed on

examples : added audio_ctx argument to main and server (#1857)
469988b
unverified

dscripka ggerganov committed on

metal : option to embed MSL source into compiled binary (#1842)
a46b62a
unverified

Didzis Gosko committed on

examples : initialize context params properly (#1852)
3443ee7
unverified

ggerganov committed on

talk-llama : sync llama.cpp
e6d6e1d
unverified

ggerganov committed on

sync : ggml
94800c5
unverified

ggerganov committed on

src : relocate new backend sources
44cd2d4
unverified

ggerganov committed on

ggml : fix `error C2078: too many initializers` for MSVC ARM64 (llama/5404)
8ebb36c
unverified

Michael Podvitskiy committed on

CUDA: more warps for mmvq on NVIDIA (llama/5394)
7ab774c
unverified

JohannesGaessler committed on

CUDA: fixed mmvq kernel for bs 2,3,4 and -sm row (llama/5386)
3ff7660
unverified

JohannesGaessler committed on

Basic Vulkan Multi-GPU implementation (llama/5321)
5d130aa
unverified

OccamRazor slaren committed on

CUDA: mul_mat_vec_q max. batch size 8 -> 4 (llama/5370)
7aa3216
unverified

JohannesGaessler committed on

Slight quantization improvement for Q4_K and Q5_K (llama/5361)
e3cd020
unverified

Kawrakow ikawrakow committed on