Alibaba's Qwen3.6 35B-A3B runs Q8 at 170 tokens/sec with full 260K context on dual consumer GPUs.