local-inference
2 articles tagged with this topic
llama.cpp · Gemma-4
llama.cpp Adds Audio Processing Support via Gemma-4 E2A/E4A Models
llama-server now supports speech-to-text via Google's Gemma-4 E2A and E4A multimodal models.
Apr 12 · 3 min read
Gemma 4 · llama.cpp
Gemma 4 Local CUDA Setup: Precision Traps and Real Benchmarks
Running Gemma 4 locally on CUDA requires strict dtype matching at KV cache boundaries, or output quality degrades silently.
Apr 7 · 2 min read