r/LocalLLaMA

4 articles tagged with this topic

Independent KV Cache Evaluation SDK Signals Shift to Inference Infrastructure

KV cache dominates VRAM in long-context inference. An independent evaluation SDK for TurboQuant signals the shift from "can it run?" to "how to run st

May 5 · 2 min read

OpenAI · GPT-5.5

GPT-5.5 CoT Leak: OpenAI Uses 'Caveman Language' to Slash Inference Costs

GPT-5.5's internal CoT was intercepted—output is all telegraphic shorthand. Mirrors r/LocalLLaMA's 5-month-old "caveman CoT saves tokens" idea. OpenAI

May 3 · 2 min read

OpenCode · Ollama

Developers Hunt Fully Offline AI Coding Tools: Code Privacy Anxiety Spreads

OpenCode privacy risks spark r/LocalLLaMA rush for fully offline AI coding tools. Code privacy is now every developer's reality, not just a compliance

May 3 · 2 min read

r/LocalLLaMA · Reddit

r/LocalLLaMA's New Rules Work in a Week: Marketing Spam Finally Cleaned Up

r/LocalLLaMA's new karma thresholds and auto-mod slashed user reports in a week. Open-source AI is shifting from wild growth to governance: signal ove

May 2 · 2 min read