Local LLM

8 articles tagged with this topic

AMD · Strix Halo

AMD Strix Halo Rumored at 192GB: Local LLM Hardware Bottleneck is Loosening

AMD's next-gen Strix Halo, rumored to ship with 192GB of unified memory, could run 122B LLMs locally. Breaking this memory bottleneck reshapes enterprise private AI

May 4 · 3 min read
AMD · Halo Box

AMD's 128GB Halo Box Prototype Challenges Apple Mac's Local LLM Dominance

AMD's Halo Box prototype (Ryzen 395 + 128GB) gives x86 Mac Studio-rivaling local LLM capacity. We see the local AI inference hardware landscape shifti

Apr 30 · 1 min read
AMD · LM Studio

What One Reddit Post Reveals: Running Large AI Models Locally Takes Far More Hardware Than Vendors Claim

A user's 24GB AMD mini PC could only allocate 8GB of VRAM to AI. The fix isn't simple, and that gap exposes a wider industry problem.

Apr 20 · 3 min read
Local LLM · Compute Cost

llama.cpp Tensor Parallelism Breakthrough: Local AI Compute Barrier Drops Another Level

Multi-GPU local inference enables enterprises to run LLMs without cloud dependency. Private deployment compute costs and technical barriers decline si

Apr 9 · 2 min read
llama.cpp · llama-bench

llama.cpp llama-bench Adds -fitc and -fitt Benchmark Flags

llama-bench gains -fitc and -fitt flags from build b4679, enabling finer control over benchmark timing output.

Apr 6 · 1 min read
MiniMax · MiniMax-M2.7

MiniMax-M2.7 Open-Source Release Delayed to This Weekend

MiniMax delays M2.7 open-source release due to infrastructure work, now targeting this weekend.

Apr 6 · 1 min read
llama.cpp · Qwen

37 LLMs Benchmarked on MacBook Air M5 32GB: Full Speed Results

Community benchmark of 37 local LLMs on M5 Air 32GB using llama-bench reveals MoE models as clear winners for speed-to-quality ratio.

Apr 6 · 2 min read
Qwen3.5 · Gemma4

Qwen3.5 vs Gemma4 vs Cloud LLMs: Python Turtle Drawing Benchmark

A Reddit user benchmarks local and cloud LLMs on Python turtle graphics, revealing that Gemma4 and Gemini share a visual style.

Apr 6 · 2 min read