Model Quantization
4 articles tagged with this topic
LoRA · LLM Fine-Tuning
Updating 1% Params: Fine-Tuning & Quantization Slash Custom LLM Deployment Barriers
Fine-tuning turns LLMs into specialists; quantization trims them down. LoRA updates just 1% of params, enabling SMEs to customize AI with consumer GPU
May 5 · 2 min read
APEX · Qwen
APEX Quantizes 25 Models: 10B-Param AI on Home GPUs Flattens Compute Barrier
APEX quantizes 25+ MoE models with new I-Nano tier. 10B-param AI now runs on single consumer GPUs, slashing local deployment costs.
May 5 · 1 min read
QAT · Model Quantization
AI Quantization Ditches Full Downgrades for Mixed-Precision Topology
Dropping a whole model from 16-bit to 8-bit can sharply cut precision. A new "equivalent topology" uses an 8-bit base and upgrades sensitive layers to 16-bit, balancing speed and preci
May 1 · 2 min read
Qwen · Unsloth
Qwen3.6-27B Quantized Fits Single Consumer GPU: Local Deployment Sweet Spot
Unsloth Q5-quantized Qwen3.6-27B runs stably on a single RTX 5090 across 19 rounds. Mid-size model local deployment is hitting the cost-capability swe
May 1 · 2 min read