Quantization

2 articles tagged with this topic

llama.cpp · llama-bench

llama.cpp llama-bench Adds -fitc and -fitt Benchmark Flags

llama-bench gains -fitc and -fitt flags from build b4679, enabling finer control over benchmark timing output.

Apr 6 · 1 min read

llama.cpp · Qwen Coder

APEX Quantization vs K-Quants: Why MoE Coding Models Need Different Compression

APEX quantization targets MoE architecture coherence layers at Q8, outperforming generic K-quants for multi-file coding agents.

Apr 6 · 2 min read