r/LocalLLaMA
4 articles tagged with this topic
TurboQuant · KV Cache
Independent KV Cache Evaluation SDK Signals Shift to Inference Infrastructure
KV cache dominates VRAM in long-context inference. An independent evaluation SDK for TurboQuant signals the shift from "can it run?" to "how to run st
May 5 · 2 min read
OpenAI · GPT-5.5
GPT-5.5 CoT Leak: OpenAI Uses 'Caveman Language' to Slash Inference Costs
GPT-5.5's internal CoT was intercepted—output is all telegraphic shorthand. Mirrors r/LocalLLaMA's 5-month-old "caveman CoT saves tokens" idea. OpenAI
May 3 · 2 min read
OpenCode · Ollama
Developers Hunt Fully Offline AI Coding Tools: Code Privacy Anxiety Spreads
OpenCode privacy risks spark r/LocalLLaMA rush for fully offline AI coding tools. Code privacy is now every developer's reality, not just a compliance
May 3 · 2 min read
r/LocalLLaMA · Reddit
r/LocalLLaMA's New Rules Work in a Week: Marketing Spam Finally Cleaned Up
r/LocalLLaMA's new karma thresholds and auto-mod slashed user reports in a week. Open-source AI is shifting from wild growth to governance: signal ove
May 2 · 2 min read