RAG
20 articles tagged with this topic
Self-Attention Powers AI Context — But Few Firms Truly Understand It
Self-attention is the core of mainstream AI, letting a model weigh the relationships between all words at once. Understanding it is key to evaluating AI costs and ROI.
Stop Guessing RAG Quality: RAGAS Uses AI to Grade AI
RAG quality often relies on guesswork. RAGAS uses 4 metrics and LLM-as-Judge to turn gut feelings into engineering KPIs—vital for enterprise knowledge
RAG's Five Stages: LLMs Embrace Open-Book Exams as Enterprise Standard
RAG is the enterprise LLM standard, enabling "open-book exams" via knowledge retrieval. But accuracy, engineering complexity, and data cleaning remain
LLMs Keep Hallucinating: RAG Becomes the Enterprise Standard Config
RAG makes AI check external data before answering, fixing knowledge lag and hallucinations. It's core infrastructure, not a patch, for safe private da
RAG's Accuracy Flaw: Why Vector Databases Alone Fail Enterprise Knowledge Bases
RAG often misses the mark. Naive similarity search yields duplicates and ignores constraints. Retrieval strategy is the real watershed for viable know
HKU Open-Sources DeepTutor: AI Tutoring Deployment Barrier Drops Again
HKU open-sources DeepTutor with guided install and local RAG. AI tutoring deployment barriers drop, but hardware demands still block non-technical use
900K-Token RAG Test: Simplest Line Split Wins; Enterprise KBs Stop Overpaying
Most enterprise RAG projects fail at chunking. Latest 900K-token benchmark: simplest line splitting is most accurate. Chunking strategy > model choice
90% of Enterprise AI Knowledge Base Failures Lie in Retrieval, Not LLMs
When enterprise AI fails, most blame the LLM. The real bottleneck is retrieval. Vector similarity ≠ business relevance; optimizing retrieval is the cu
LangChain Breaks AI Into 4 Components: Orchestration Layer, Not Just Framework
LangChain splits AI into Chain, Agent, Memory, Tool. It's an orchestration layer shifting LLMs from "talking" to "doing"—crucial for anyone tracking A
Traditional DBs Fail at AI Semantics: Vector DB Selection Decides Knowledge Base Fate
Traditional DBs can't handle semantic search for AI. As RAG infrastructure, vector DB selection dictates enterprise knowledge base efficiency and long
RAG Architectures Split From 1 to 9: Production AI Ditches 'Good Enough'
9 RAG architectures signal enterprise AI's shift from answering to reliability. Wrong choices cause confident hallucinations and waste months.
40% RAG Retrieval Gap After Embedding Swap: The Semantic Engine is Everything
Embedding is RAG's semantic core. BGE beats OpenAI in Chinese. Model choice beats tuning, but benchmarks ≠ biz results, and over-optimizing is a resou
Archon Goes Viral: Ditch AI Free-Play, Deterministic Orchestration Is Endgame
Archon drops AI free-play for deterministic workflows. This "code does dirty work, AI thinks" hybrid is the sole fix for enterprise AI black-box chaos
LangChain Teaches AI to Take Notes: Memory Is Agent Deployment's Lifeline
LLMs are inherently amnesic. LangChain's two-layer memory scheme solves Agent amnesia, determining if AI apps evolve from toys into tools.
Document Chunking Dictates AI Quality: Get It Wrong, and the Best Model Fails
60% of RAG success hinges on document chunking. Four strategies range from crude to precise; costs match results. This is often the biggest enterprise
Deconstructing the LLM Lineage: From LLM to Agent, It's All Context Patching
From RAG to MCP, buzzwords overwhelm. We map the core logic: LLMs just predict text; later tech patches their gaps. Grasp this, and jargon won't fool
Building RAG in 30 Lines: AI Bottleneck Is Plumbing, Not Models
LangChain builds RAG in ~30 lines. Enterprise AI bottlenecks are the "plumbing," not models. Frameworks cut trial costs but obscure underlying details
Volcengine Launches Milvus Serverless With 3-Second Instance Creation
ByteDance's Volcengine ships Milvus Serverless with sub-3-second provisioning and scale-to-zero billing for AI Agent workloads.
RAG Migration From Self-Hosted to API Cuts Costs 97%
A Chinese SaaS firm cut monthly AI infra costs from ¥80,000 to under ¥2,000 by replacing four self-hosted A100s with the DeepSeek API.
LangChain-Chroma High-Concurrency Architecture: Beyond Basic RAG
How to fix write blocking, query latency spikes, and OOM errors when scaling Chroma from prototype to production.