
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
  Device 1: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes

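| model         |     size | params | backend | ngl |  test |           t/s |
| ------------- | -------: | -----: | ------- | --: | ----: | ------------: |
| qwen3 4B BF16 | 7.49 GiB | 4.02 B | CUDA    |  35 |   pp8 | 254.70 ± 2.70 |
| qwen3 4B BF16 | 7.49 GiB | 4.02 B | CUDA    |  35 | tg128 |  33.31 ± 0.12 |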

build: 92bb442ad (7040)