RAG
12 articles tagged with this topic
Volcengine Launches Milvus Serverless With 3-Second Instance Creation
ByteDance's Volcengine ships Milvus Serverless with sub-3-second provisioning and scale-to-zero billing for AI Agent workloads.
RAG Migration From Self-Hosted to API Cuts Costs 97%
A Chinese SaaS firm cut monthly AI infrastructure costs from ¥80,000 to under ¥2,000 by replacing four self-hosted A100s with the DeepSeek API.
LangChain-Chroma High-Concurrency Architecture: Beyond Basic RAG
How to fix write blocking, query latency spikes, and OOM errors when scaling Chroma from prototype to production.
Perplexity Releases MIT-Licensed Embedding Models for Local Use
Perplexity AI has published several embedding models under the MIT license, enabling free commercial use in local deployments.
LLM Cognitive Architecture: From Rule-Based to Autonomous Agents
A technical breakdown of four-layer LLM agent architecture with Python code using LangChain and LangGraph.
LCEL in Practice: Engineering 4 Core LangChain Pipeline Patterns
How to structure LangChain LCEL chains for linear, routing, RAG, and agent workflows in production.
LangChain Vector Embeddings: From Basics to RAG Implementation
Practical guide to LangChain embeddings using OpenAI, HuggingFace, and local models for semantic search and RAG pipelines.
RAG Demystified: Baseline vs. Advanced Retrieval Pipelines
Community consensus on RAG's true baseline: retrieve, rerank, inject chunks, generate — everything else is an enhancement.
RAG vs. Agentic Retrieval: What Actually Counts as RAG?
A LocalLLaMA thread debates whether RAG is a precise term or marketing hype for any retrieval-based LLM system.
Building a Maintainable Prompt Layer for Enterprise RAG Systems
Replace string-concatenated prompts with LangChain's ChatPromptTemplate and PipelinePromptTemplate for scalable enterprise knowledge bases.
LangChain Document Loading and Text Splitting for RAG Pipelines
How to load PDFs, Word, HTML, and Markdown files in LangChain and split them for RAG applications.
LangChain Runnable: The Interface That Makes AI Pipelines Maintainable
Why LangChain's Runnable protocol transforms scattered model calls into composable, maintainable AI workflows.