Unsloth

7 articles tagged with this topic

IBM, Granite

IBM Open-Sources Granite 4.1: 21 Quantized Versions Prove Bottleneck Isn't Size

IBM open-sources Granite 4.1. A 21-version quantization test shows no quality difference: small models' bottleneck is base capability, not compression

May 5 · 2 min read
Mistral, Unsloth

Mistral Local GGUF Bug Fixed — Open Source QA Gaps Are Bigger Than You Think

Mistral Medium 3.5 GGUF files corrupted, community-fixed. Reveals open source QA gap: APIs tested, local formats not—impacts enterprise deployments.

May 2 · 2 min read
Mistral, Unsloth

Mistral 3.5 Inference Bug Fixed by Open-Source Team — LLM Delivery QA Flashing Red

Unsloth fixed a Mistral Medium 3.5 inference bug from a core config error, exposing absent QA in commercial LLMs. Beware the "community beta" business

May 2 · 2 min read
Qwen, Unsloth

Qwen3.6-27B Quantized Fits Single Consumer GPU: Local Deployment Sweet Spot

Unsloth Q5-quantized Qwen3.6-27B runs stably on a single RTX 5090 across 19 rounds. Mid-size model local deployment is hitting the cost-capability swe

May 1 · 2 min read
Unsloth, Qwen3.6

Qwen3.6 GGUF Benchmarks

Unsloth claims top KLD-vs-disk-space performance for Qwen3.6-35B-A3B quants in 21 of 22 Pareto frontier comparisons.

Apr 17 · 3 min read
llama.cpp, Qwen3

GPoUr with ~12gb vram and a 3080 getting 40tg/s on qwen3.6 35BA3B w/ 260k ctx

A llama.cpp fork with turbo3 KV cache quantization achieves ~40 tok/s on Qwen3-35B-A3B with only 12 GB VRAM.

Apr 16 · 3 min read
Unsloth, MiniMax-M2.7

Unsloth Releases Full GGUF Quant Suite for MiniMax M2.7

Unsloth uploads 22 GGUF quantizations of MiniMax M2.7, ranging from 1-bit (60.7 GB) to BF16 (457 GB).

Apr 12 · 3 min read