RAG's Accuracy Flaw: Why Vector Databases Alone Fail Enterprise Knowledge Bases

In a test on a 100,000-document repository, the most basic vector search returned a completely off-topic cloud-native article as its top result. We believe that "how to search" determines the success or failure of RAG (Retrieval-Augmented Generation, where the model looks up material before answering) projects far more than "whether there is a database."

What this is

We note that many enterprises have built knowledge bases, yet their AI still hallucinates. The cause is often not a weak model but an overly simplistic retrieval strategy.

The most basic approach today is "similarity search" (fetch whatever looks semantically close to the query). Against a very large document set, it has three fatal flaws: duplicate results (five articles explaining the exact same concept), low-quality results mixed in (Go concurrency material forced onto someone asking about Python), and ignored explicit conditions (the user asks for "2024," and it returns 2022 results as well).
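To make these flaws concrete, here is a minimal sketch of plain top-k similarity search. It is an illustration only: the three-dimensional vectors, document IDs, and the `top_k` helper are made up for this example, and a real system would use a vector database and learned embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, docs, k=3):
    """Plain similarity search: rank by cosine score and nothing else."""
    scored = [(cosine(query_vec, d["vec"]), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy corpus (hypothetical): three near-identical asyncio articles,
# one Go article, and one distinct Python article.
docs = [
    {"id": "py-async-2024", "vec": [0.9, 0.1, 0.0], "year": 2024},
    {"id": "py-async-2022", "vec": [0.9, 0.12, 0.0], "year": 2022},
    {"id": "py-async-copy", "vec": [0.89, 0.1, 0.01], "year": 2022},
    {"id": "go-concurrency", "vec": [0.5, 0.5, 0.1], "year": 2024},
    {"id": "py-threads-2024", "vec": [0.7, 0.0, 0.3], "year": 2024},
]

# Stand-in embedding for "Python concurrency in 2024".
query = [1.0, 0.1, 0.0]

for score, doc in top_k(query, docs, k=3):
    print(f"{score:.4f}  {doc['id']}  ({doc['year']})")
```

In this toy corpus the three near-identical asyncio articles take all three slots, two of them from 2022, while the distinct threading article never surfaces. The ranking sees only vector similarity, so neither the duplication nor the year is visible to it.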

Consequently, the industry is shifting toward combining more sophisticated retrieval strategies: MMR (Maximal Marginal Relevance, which balances relevance against result diversity so that results do not cluster), threshold filtering (setting a minimum relevance score and returning fewer results rather than padding with filler), and Self-Query (having the model decompose the user's question and convert "2024" into a hard database filter condition). Moving from pure semantic matching to balancing diversity and hard conditions is the necessary path for AI retrieval to go from "barely usable" to "highly effective."
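MMR and threshold filtering can be sketched together in a few lines. The version below is a simplified illustration under stated assumptions, not a production implementation: the `mmr_search` function, its `lam` and `min_score` parameters, and the toy vectors are all hypothetical names and values chosen for this example. Each round picks the candidate that maximizes `lam * relevance - (1 - lam) * redundancy`, where redundancy is the highest similarity to anything already selected.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr_search(query_vec, docs, k=2, lam=0.5, min_score=0.5):
    """Threshold filter first, then greedy MMR selection."""
    # Threshold filtering: drop anything below the minimum relevance score.
    candidates = [(cosine(query_vec, d["vec"]), d) for d in docs]
    candidates = [(score, d) for score, d in candidates if score >= min_score]

    selected = []
    while candidates and len(selected) < k:
        def mmr_score(i):
            relevance, doc = candidates[i]
            # Redundancy: closest match among documents already chosen.
            redundancy = max(
                (cosine(doc["vec"], s["vec"]) for s in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(range(len(candidates)), key=mmr_score)
        selected.append(candidates.pop(best)[1])
    return selected

# Toy corpus (hypothetical): a guide, its mirror copy, a different but
# equally relevant guide, and an unrelated Go article.
DOCS = [
    {"id": "python-asyncio-guide", "vec": [0.95, 0.3, 0.0]},
    {"id": "python-asyncio-guide-mirror", "vec": [0.95, 0.31, 0.0]},
    {"id": "python-threading-guide", "vec": [0.94, -0.3, 0.0]},
    {"id": "go-goroutines", "vec": [0.2, 0.1, 0.97]},
]
QUERY = [1.0, 0.0, 0.0]

print([d["id"] for d in mmr_search(QUERY, DOCS, k=2)])
print(len(mmr_search(QUERY, DOCS, k=5)))
```

With `k=2`, the mirror copy loses to the threading guide because it is almost identical to the first pick. With `k=5`, only three documents come back: the Go article falls below `min_score`, and the function returns fewer results rather than padding the list. The values of `lam` and `min_score` are tuning knobs that would need to be calibrated on real data.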

Industry view

Our assessment is that the competitive focus of RAG has shifted from "using a vector database" to the refinement of "retrieval engineering." The shared realization is that without a good retrieval strategy, more data is just more noise.

However, this is not without controversy. Some architects point out that Self-Query relies heavily on the LLM's ability to decompose user intent; if the model extracts the wrong conditions (for example, misclassifying "Apple phone" under a fruit category), the hard filter can cause the query to return nothing from the entire knowledge base. Furthermore, combining multiple retrieval strategies significantly increases system response latency. The sharp rise in engineering complexity is shutting out small and mid-sized teams that lack technical depth.
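The Self-Query failure mode can be shown with a small sketch. In a real system the parsing step is an LLM call; here a regex-based `parse_query` stands in for it so the example runs on its own. All names (`parse_query`, `apply_filters`, `self_query`, `misparse`) and the toy metadata are hypothetical. The fallback to an unfiltered search is one possible mitigation that we assume for illustration; it is not a practice described by the architects quoted above.

```python
import re

# Toy metadata store (hypothetical). In practice the surviving documents
# would then be ranked by vector similarity.
DOCS = [
    {"id": "iphone-review-2024", "category": "electronics", "year": 2024},
    {"id": "iphone-review-2022", "category": "electronics", "year": 2022},
    {"id": "apple-orchard-2024", "category": "fruit", "year": 2024},
]

def parse_query(text):
    """Stand-in for the LLM step: extract hard filters from free text."""
    filters = {}
    year = re.search(r"\b(19|20)\d{2}\b", text)
    if year:
        filters["year"] = int(year.group())
    return filters

def apply_filters(docs, filters):
    """Keep only documents whose metadata matches every filter exactly."""
    return [d for d in docs if all(d.get(k) == v for k, v in filters.items())]

def self_query(text, docs, parser=parse_query):
    """Parse filters, apply them, and fall back if they match nothing."""
    filters = parser(text)
    hits = apply_filters(docs, filters)
    if not hits and filters:
        # Assumed mitigation: a wrong filter should not empty the results.
        return docs, {}
    return hits, filters

def misparse(text):
    """Simulates a model that files "Apple phone" under fruit."""
    return {"category": "fruit", "year": 2022}

hits, used = self_query("Apple phone reviews from 2024", DOCS)
print(used, [d["id"] for d in hits])

print(apply_filters(DOCS, misparse("Apple phone reviews from 2022")))
```

The first call turns "2024" into the filter `{"year": 2024}` and drops the 2022 review. The second shows the controversy: the misparsed filters match no document, so a pipeline without a fallback would answer from an empty result set. Note that the parsing step is also where the extra latency comes from, since a real parser adds a model call before retrieval even begins.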

Impact on regular people

For enterprise IT: The focus of knowledge base construction shifts from "piling up data volume" to "tuning the retrieval pipeline"; simply purchasing a vector database can no longer solve business pain points.

For the workplace: When asking an AI a question, stating qualifiers such as time and category explicitly makes it more likely that Self-Query will pick them up as filters, which yields more precise answers.

For the consumer market: Consumer-facing (C-end) AI products should ramble and veer off-topic less; users will noticeably feel that the materials the AI provides place more emphasis on diversity and accuracy.
