NVIDIA
12 articles tagged with this topic
Hugging Face Top 100 Hardware: Local AI Still Runs on Consumer GPUs
Hugging Face reveals top 100 hardware configs for local AI. Consumer GPUs dominate, exposing the true AI deployment barrier better than vendor specs.
NVIDIA Proposes Extreme Co-Design for Agents: Infrastructure Must Be Rebuilt
NVIDIA's Extreme Co-Design: Agent complexity breaks legacy architecture. Full-stack optimization isn't technical—it's a play for infrastructure domina
NVIDIA Puts AI Agents in Cars: Smart Cockpits Shift From Commands to Thinking
NVIDIA's cloud-to-car in-vehicle AI Agent upgrades cockpits from voice commands to proactive planning, but cost and safety certs remain bottlenecks.
RTX 5000 48GB Unleashes Qwen3.6: The Sweet Spot for Local High-Precision AI
A 48GB RTX 5000 runs Qwen3.6 27B at 200k context and 80 TPS without heavy compression. For ~50,000 RMB, deploy a full-strength local AI, dodging cloud
APEX Quantizes 25 Models: 10B-Param AI on Home GPUs Flattens Compute Barrier
APEX quantizes 25+ MoE models with new I-Nano tier. 10B-param AI now runs on single consumer GPUs, slashing local deployment costs.
GPU Utilization for Agents at 30-40%: Purpose-Built Inference Chip Window Opens
YC finds GPU utilization for agent workloads at only 30-40%. Purpose-built inference chips offer an opportunity, but ecosystem lock-in and evolving demand remain ha
NVIDIA Uses AI Agent to Optimize Supply Chain: LLMs Start Replacing OR Experts
NVIDIA launches cuOpt Agent skills, using LLMs to address two traditional operations research pain points: slow modeling and difficult adjustment. AI Agents no
NVIDIA RTX A5000 Pro 48GB Arrives: Local LLMs No Longer Need Dual GPUs
NVIDIA's $4,500 RTX A5000 Pro 48GB runs quantized Qwen 27B on a single card. Simpler than dual-GPU setups for local AI, but value requires careful mat
NVIDIA Lyra2: Single Photo to Infinite 3D World, Gen AI Takes Over Scene Infra
NVIDIA's Lyra2 generates seamless 3D worlds from a single photo. Gen AI evolves from drawing images to building spaces, rewriting game and film asset
RTX 5080 Sparks Local Coding Debate: Consumer GPUs Start Taking Cloud AI's Jobs
r/LocalLLaMA debates an RTX 5080 with 64GB RAM for quantized coding. Moving AI off-cloud turns consumer hardware into AI coding infrastructure managers must w
NVIDIA NVFP4 Puts 26B Model on Consumer GPU With Under 1% Accuracy Loss
NVIDIA's NVFP4 Gemma-4-26B shrinks to 18.8GB for consumer GPUs with <0.7% accuracy loss. 4-bit is now optimal, but also an ecosystem lock-in.
NVIDIA Open-Sources AITune: Auto-Selects Fastest PyTorch Inference Backend
NVIDIA releases AITune, an open-source toolkit that benchmarks and auto-picks the optimal inference backend for PyTorch models.