Found 1 article with this tag
Alibaba's Qwen team releases a 35B-parameter sparse MoE model with only 3B active parameters, under the Apache 2.0 license.