local-LLM
Found 2 articles with this tag
Gemma 4 · llama.cpp
Fixing Gemma 4 Tool Calls in llama.cpp: Root Causes Explained
Four bugs in llama.cpp's Gemma 4 chat template handling caused tool call results to crash or loop.
Apr 8 · 3 min read
Qwen3-Coder · llama.cpp
Run Qwen3-Coder 80B Locally at 54GB With Apex Quantization
A community GGUF quantization shrinks Qwen3-Coder 80B to 54.1GB, making fast local coding inference practical.
Apr 5 · 2 min read