RAG
12 articles found for this tag
Volcengine Launches Milvus Serverless: Instance Creation in Just 3 Seconds
ByteDance's Volcengine has launched Milvus Serverless, offering instance creation in about 3 seconds, pay-as-you-go billing, and scale-to-zero, designed for AI Agent and RAG workloads.
RAG Architecture Migration: From Self-Hosted GPU Cluster to API, Cutting Costs by 97%
A Chinese SaaS company migrated its AI infrastructure from four A100 GPUs to the DeepSeek API, dropping monthly costs from ¥80,000 to under ¥2,000, a 97.5% reduction.
LangChain-Chroma High-Concurrency Architecture: Beyond Basic RAG
A detailed look at fixing write blocking, query-latency spikes, and OOM errors when scaling Chroma from prototype to production, with four concrete optimizations.
Perplexity Releases MIT-Licensed Embedding Models for Local Deployment
Perplexity AI has released several embedding models under the MIT license, allowing free commercial use and local deployment while reducing reliance on paid APIs.
LLM Cognitive Architecture: From Rule-Based to Autonomous Agents
A technical breakdown of a four-layer LLM agent architecture, with Python code using LangChain and LangGraph.
LCEL in Practice: Engineering 4 Core LangChain Pipeline Patterns
How to structure LangChain LCEL chains for linear, routing, RAG, and agent workflows in production.
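The "routing" shape among those four patterns can be sketched in plain Python: inspect the input, then dispatch to a sub-chain. The keyword router and both sub-chains below are hypothetical stand-ins for LCEL runnables, not the article's code.

```python
# Hedged sketch of the routing pattern: classify the input, then dispatch.

def math_chain(q):
    return "math: " + q

def general_chain(q):
    return "general: " + q

def route(q):
    # A production router might use an LLM classifier; keyword matching
    # is a deliberately simple stand-in.
    return math_chain if any(t in q.lower() for t in ("sum", "integral", "+")) else general_chain

def invoke(q):
    return route(q)(q)
```

In LCEL proper, the same shape is expressed with a branch runnable piped between the prompt and the model.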
LangChain Vector Embeddings: From Basics to RAG Implementation
Practical guide to LangChain embeddings using OpenAI, HuggingFace, and local models for semantic search and RAG pipelines.
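The semantic-search pattern the guide applies can be sketched without any model: embed documents, embed the query, rank by cosine similarity. The bag-of-words "embedding" below is a toy stand-in for the OpenAI, HuggingFace, or local model vectors the article uses.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: a term-frequency vector keyed by word."""
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = ["vector databases store embeddings",
        "langchain builds llm pipelines",
        "embeddings power semantic search"]
index = [(doc, embed(doc)) for doc in docs]

def search(query, k=1):
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]
```

Swapping `embed` for a real embedding model is the only change needed to turn this into a genuine semantic index.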
RAG Demystified: Baseline vs. Advanced Retrieval Pipelines
Community clarifies RAG's true baseline: retrieve, rerank, inject chunks, generate — extras are enhancements.
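That baseline can be sketched end to end in a few functions: retrieve, rerank, inject chunks into the prompt, generate. Every function here is a hypothetical stand-in (lexical overlap for retrieval, position-based reranking), not any specific library's API.

```python
# Minimal sketch of the baseline RAG loop:
# retrieve -> rerank -> inject chunks into the prompt -> generate.

def retrieve(query, corpus, k=2):
    """Naive lexical retrieval: score documents by query-term overlap."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(doc.lower().split())), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

def rerank(query, chunks):
    """Toy reranker: order chunks by how early the first query term appears."""
    terms = query.lower().split()
    def first_hit(chunk):
        positions = [chunk.lower().find(t) for t in terms if t in chunk.lower()]
        return min(positions) if positions else len(chunk)
    return sorted(chunks, key=first_hit)

def build_prompt(query, chunks):
    """Inject retrieved chunks into the prompt as context."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Milvus is a vector database for similarity search.",
    "LangChain composes LLM calls into pipelines.",
    "RAG retrieves documents and injects them into the prompt.",
]
chunks = rerank("what is RAG", retrieve("what is RAG", corpus))
prompt = build_prompt("what is RAG", chunks)
# The final step would pass `prompt` to an LLM for generation.
```

Everything beyond these four steps (query rewriting, hybrid search, agentic loops) falls on the "enhancements" side of the thread's distinction.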
RAG vs. Agentic Retrieval: What Actually Counts as RAG?
A LocalLLaMA thread debates whether RAG is a precise term or marketing hype for any retrieval-based LLM system.
Building a Maintainable Prompt Layer for Enterprise RAG Systems
Replace string-concatenated prompts with LangChain's ChatPromptTemplate and PipelinePromptTemplate for scalable enterprise knowledge bases.
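The move the article recommends can be illustrated with plain `str.format` as a stand-in for LangChain's `ChatPromptTemplate`: declare named placeholders once instead of scattering string concatenations. The system/user messages below are hypothetical examples.

```python
# Declarative templates with named placeholders, standing in for
# ChatPromptTemplate; contrast with ad-hoc `"..." + var + "..."` code.

SYSTEM = "You answer using the {domain} knowledge base."
USER = "Context:\n{context}\n\nQuestion: {question}"

def render(domain, context, question):
    # Named placeholders make required inputs explicit and reusable.
    return [
        ("system", SYSTEM.format(domain=domain)),
        ("user", USER.format(context=context, question=question)),
    ]

messages = render("HR policy", "Vacation is 20 days/year.", "How many vacation days?")
```

The template lives in one place, so prompt changes no longer require hunting down every concatenation site.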
LangChain Document Loading and Text Splitting for RAG Pipelines
How to load PDFs, Word, HTML, and Markdown files in LangChain and split them for RAG applications.
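The splitting strategy behind LangChain's `RecursiveCharacterTextSplitter` can be sketched by hand: try coarse separators first (paragraphs), then fall back to finer ones until chunks fit. This is an illustration of the idea, not the library's actual code, and it omits the chunk-overlap feature.

```python
# Recursive character splitting: coarse separators first, finer on fallback.

def split_text(text, chunk_size=80, separators=("\n\n", "\n", " ")):
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    for sep in separators:
        if sep in text:
            parts = text.split(sep)
            chunks, buf = [], ""
            for part in parts:
                candidate = (buf + sep + part) if buf else part
                if len(candidate) <= chunk_size:
                    buf = candidate
                else:
                    if buf:
                        chunks.append(buf)
                    if len(part) > chunk_size:
                        # Recurse on an oversized part with finer separators.
                        chunks.extend(split_text(part, chunk_size, separators))
                        buf = ""
                    else:
                        buf = part
            if buf:
                chunks.append(buf)
            return chunks
    # No separator left: hard-cut the text.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

doc = ("RAG pipelines ingest documents in many formats.\n\n"
       "Each document is split into overlapping chunks small enough "
       "to embed and to fit into the model's context window.")
chunks = split_text(doc)
```

Keeping paragraph boundaries where possible is what makes this strategy preferable to fixed-width cuts for RAG retrieval quality.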
LangChain Runnable: The Interface That Makes AI Pipelines Maintainable
Why LangChain's Runnable protocol transforms scattered model calls into composable, maintainable AI workflows.
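The core idea of that protocol can be sketched in a few lines: every stage exposes `invoke()`, and the `|` operator chains stages into a pipeline. This toy class mirrors the interface's spirit only; the stand-in "LLM" is a plain function, not the library's implementation.

```python
# Pipe-composition sketch of the Runnable idea: invoke() plus `|` chaining.

class Runnable:
    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other):
        # Compose: output of this stage feeds the next stage.
        return Runnable(lambda x: other.invoke(self.invoke(x)))

build_prompt = Runnable(lambda q: f"Answer briefly: {q}")
fake_llm = Runnable(lambda p: p.upper())   # stand-in for a model call
parse = Runnable(lambda s: s.strip())

chain = build_prompt | fake_llm | parse
result = chain.invoke("what is RAG?")
```

Because every stage shares one interface, any step can be swapped or tested in isolation, which is the maintainability argument the article makes.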