JohannesGaessler
CUDA: faster q8_0 -> f16 dequantization (llama/4895)
0a1a178 unverified