quantization
2 articles tagged with this topic
NVIDIA, Gemma
NVIDIA NVFP4 Puts 26B Model on Consumer GPU With Under 1% Accuracy Loss
NVIDIA's NVFP4 Gemma-4-26B shrinks to 18.8GB for consumer GPUs with <0.7% accuracy loss. 4-bit is now the optimal precision, but it also brings ecosystem lock-in.
May 1 · 2 min read
Qwen3.5, GGUF
Qwen3.5-9B GGUF Quant Rankings: Q8_0 Dominates KLD Scores
KLD benchmarks across community GGUF quants show Q8_0 variants cluster near 0.001 KLD, with quality degrading sharply below Q5.
Apr 14 · 3 min read