Unsloth

5 articles tagged with this topic

Qwen3.6 GGUF Benchmarks

Un sloth claims top KLD-vs-disk-space performance for Qwen3.6-35B-A3B quants in 21 of 22 pareto frontier comparisons.

Apr 173 min read

GPoUr with ~12gb vram and a 3080 getting 40tg/s on qwen3.6 35BA3B w/ 260k ctx

A llama.cpp fork with turbo3 KV cache quantization achieves ~40 tok/s on Qwen3-35 B-A3B with only 12GB VRAM.

Apr 163 min read

UnslothMiniMax-M2.7

Unsloth Releases Full GGUF Quant Suite for MiniMax M2.7

Unsloth uploads 22 GGUF quantizations of MiniMax M2.7, ranging from 1-bit (60.7 GB) to BF16 (457 GB).

Apr 123 min read

llama.cppDistributed Training

Local Inference vs Distributed Training: Where the Real Gap Is

Indie devs run models locally, but training still requires datacenter scale. Can distributed training ever close that gap?

Apr 52 min read

Gemma 4 27B vs Qwen 3.5 27B: SVG Generation Benchmark

Reddit users compare Gemma 4 31B and Qwen 3.5 27B Q4 quants on SVG creation, coding, and function calling tasks.

Apr 52 min read