Unsloth
5 articles tagged with this topic
Unsloth · Qwen3.6
Qwen3.6 GGUF Benchmarks
Unsloth claims top KLD-vs-disk-space performance for Qwen3.6-35B-A3B quants in 21 of 22 Pareto frontier comparisons.
Apr 17 · 3 min read
llama.cpp · Qwen3
GPU poor with ~12GB VRAM on a 3080 getting 40 tg/s on Qwen3.6 35B-A3B w/ 260k ctx
A llama.cpp fork with turbo3 KV cache quantization achieves ~40 tok/s on Qwen3.6-35B-A3B with only 12GB of VRAM.
Apr 16 · 3 min read
Unsloth · MiniMax-M2.7
Unsloth Releases Full GGUF Quant Suite for MiniMax M2.7
Unsloth uploads 22 GGUF quantizations of MiniMax M2.7, ranging from 1-bit (60.7 GB) to BF16 (457 GB).
Apr 12 · 3 min read
llama.cpp · Distributed Training
Local Inference vs Distributed Training: Where the Real Gap Is
Indie devs can run models locally, but training still requires datacenter scale. Can distributed training ever close that gap?
Apr 5 · 2 min read
Gemma 4 · Qwen3.5
Gemma 4 27B vs Qwen 3.5 27B: SVG Generation Benchmark
Reddit users compare Gemma 4 27B and Qwen 3.5 27B Q4 quants on SVG creation, coding, and function calling tasks.
Apr 5 · 2 min read