Quantization

Found 2 articles with this tag

llama.cpp, llama-bench

llama.cpp llama-bench Adds -fitc and -fitt Benchmark Flags

As of build b4679, llama-bench gains the -fitc and -fitt flags, which give finer control over benchmark timing output.

llama.cpp, Qwen Coder

APEX Quantization vs. K-Quants: Why MoE Coding Models Need a Different Compression Strategy

APEX quantization keeps the coherence layers of MoE architectures at Q8 precision and outperforms general-purpose K-quants on cross-file coding-agent tasks.