Alibaba's Qwen team releases a 35B-parameter sparse MoE model with only 3B active parameters, licensed under Apache 2.0.