Apple-Silicon

Found 3 articles with this tag

DFlash speculative decoding lands on Apple Silicon: 4.1x speedup for Qwen3.5-9B, now open source (MLX, M5 Max)

The open-source project DFlash uses MLX to achieve a 4.13x inference speedup for Qwen3.5-9B on an M5 Max, with a token acceptance rate of 89.4%.

Full-stack guide: using Ollama, OrbStack, frp, and Nginx to expose local Gemma 4 inference to the public internet over HTTPS.

The first community-contributed GGUF quantizations of MiniMax-M2.7 (229B MoE) have been released in two variants, Q3_K_L (110GB) and Q8_0 (243GB), and are now available on HuggingFace.