RTX-5090

1 article tagged with this topic

Qwen 3.6 is the first local model that actually feels worth the effort for me

Alibaba's Qwen3.6 35B-A3B runs at Q8 at 170 tokens/sec with the full 260K context on dual consumer GPUs.