
Comparing two language editions of: LLMs Are Homogenizing Human Writing — The 'Delve' Spike Signals Real Risk

AEN
ChatGPT · language homogenization · model collapse

LLMs Are Homogenizing Human Writing — The 'Delve' Spike Signals Real Risk

A 2024 lexical analysis of academic papers shows the usage rate of the word 'delve' doubled in two years — large language models are quietly rewriting human expression habits.

What this is

This study tracked vocabulary changes in English writing before and after the popularization of large language models. Words like 'delve,' 'tapestry,' 'landscape,' and 'moreover' have seen their frequency in academic papers and business reports deviate significantly from historical trends — and they happen to be ChatGPT's preferred expressions. As more people use AI-assisted writing, these 'AI-speak' terms flow back into human language, forming a self-reinforcing loop. RAG (Retrieval-Augmented Generation, the technique that has LLMs search external sources before answering) makes the problem worse — what it retrieves first is often content AI has already generated.
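The kind of deviation the study measures can be sketched as a simple z-score against a historical frequency baseline. The per-million-word counts below are illustrative placeholders, not figures from the study:

```python
# Hypothetical yearly frequencies of 'delve' per million words in a corpus.
# All numbers are invented for illustration.
historical = [4.1, 3.9, 4.3, 4.0, 4.2]   # pre-LLM baseline years
observed_2024 = 9.6                       # post-LLM observation

mean = sum(historical) / len(historical)
# Sample variance and standard deviation of the baseline years.
var = sum((x - mean) ** 2 for x in historical) / (len(historical) - 1)
std = var ** 0.5

# How many baseline standard deviations the new observation sits above trend.
z = (observed_2024 - mean) / std
print(f"baseline mean = {mean:.2f}/M, observed = {observed_2024}/M, z = {z:.1f}")
```

A stable baseline makes even a modest absolute jump register as a large z-score, which is why a word doubling its rate stands out so sharply against decades of flat usage.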

Industry view

Linguists and AI researchers are sharply divided. Supporters argue that large models have lowered the writing barrier for non-native speakers, and that some convergence in word choice is a reasonable price for technological democratization. Critics counter that homogenization is eroding language's value as both an identity marker and a thinking tool. Other research warns that once AI-generated text exceeds a threshold share of the training corpus, the data used to train the next generation of models is polluted by their own output, a failure mode known as 'model collapse.' The practical risk is equally glaring: companies invest heavily in brand voice, yet what employees produce with AI reads exactly like their competitors'.
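The feedback loop behind model collapse can be shown with a toy simulation: each "generation" re-estimates word probabilities from a finite sample drawn from the previous generation's model, so rare words can drift to zero and, once gone, never return. The vocabulary and sample size here are invented for illustration:

```python
import random
from collections import Counter

random.seed(0)

# Toy vocabulary with an initially diverse usage distribution (illustrative).
vocab = ["delve", "explore", "probe", "examine", "investigate"]
weights = [0.30, 0.25, 0.20, 0.15, 0.10]

def retrain(weights, sample_size=20):
    """Draw a finite 'training corpus' from the current model, then
    re-estimate word probabilities from that sample alone."""
    sample = random.choices(vocab, weights=weights, k=sample_size)
    counts = Counter(sample)
    return [counts[w] / sample_size for w in vocab]

# Track how many distinct word types survive each generation.
surviving = []
w = weights
for gen in range(30):
    w = retrain(w)
    surviving.append(sum(1 for p in w if p > 0))

print("surviving word types per generation:", surviving)
```

Because a word with zero probability can never be sampled again, diversity can only stay flat or shrink; the small sample size exaggerates the drift so the loss shows up within a few generations.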

Impact on regular people

For enterprise IT: content moderation must evolve from 'detect AI-generated' to 'detect language homogenization' — brand compliance now has a new dimension. For individual careers: writing text that 'doesn't sound like AI' is becoming a scarce skill; 'AI-speak' in résumés and reports can now be a liability. For consumer markets: user patience for cookie-cutter marketing copy is declining, and 'authentic human' content is starting to command a premium.
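One crude form such homogenization checks could take is cosine similarity between word-frequency vectors: copy that clusters tightly with known AI phrasing, rather than with the brand's own baseline, is flagged. The snippets below are purely illustrative:

```python
import math
from collections import Counter

def freq_vector(text):
    """Normalized word-frequency vector for a text (naive whitespace tokens)."""
    words = text.lower().split()
    counts = Counter(words)
    return {w: n / len(words) for w, n in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

# Invented example texts: a brand baseline and two AI-flavored drafts.
brand = "we build tools that help teams ship faster"
ai_copy_a = "let us delve into the rich tapestry of the evolving landscape"
ai_copy_b = "we delve into the vibrant tapestry of a shifting landscape"

print("brand vs AI draft A:", round(cosine(freq_vector(brand), freq_vector(ai_copy_a)), 2))
print("AI draft A vs AI draft B:", round(cosine(freq_vector(ai_copy_a), freq_vector(ai_copy_b)), 2))
```

The two AI-flavored drafts score far more similar to each other than either does to the brand baseline, which is exactly the signature a brand-compliance check would look for.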

BZH
ChatGPT · language homogenization · model collapse

LLMs Are Homogenizing Human Writing — The 'Delve' Spike Signals Real Risk

A 2024 lexical analysis of academic papers shows that usage of the word 'delve' doubled within two years; large models are already quietly rewriting human expression habits.

What this is

This study tracked vocabulary changes in English writing before and after large language models became widespread. Words such as 'delve,' 'tapestry,' 'landscape,' and 'moreover' now appear in academic papers and business reports at frequencies that deviate significantly from historical trends, and they happen to be the expressions ChatGPT favors. As more people write with AI assistance, these 'AI-speak' words flow back into human language, forming a self-reinforcing loop. RAG (Retrieval-Augmented Generation, the technique that has a model search external sources before answering) makes the problem worse: what it retrieves first is often content AI has already generated.

Industry view

Linguists and AI researchers are clearly divided. Supporters argue that large models have lowered the writing barrier for non-native speakers, and that some convergence in word choice is a reasonable price for technological democratization. But critics point out that homogenization is eroding language's value as an identity marker and a thinking tool. Further research warns that once AI-generated text exceeds a threshold share, the data used to train the next generation of models is polluted by the models' own output, a failure mode known as 'model collapse.' The practical risk is just as glaring: companies invest heavily in brand voice, yet what employees produce with AI reads exactly like their competitors'.

Impact on regular people

For enterprise IT: content moderation must evolve from 'detect AI-generated text' to 'detect language homogenization'; brand compliance has gained a new dimension. For individual careers: writing text that 'doesn't sound like AI' is becoming a scarce skill, and 'AI-speak' in résumés and reports can count against you. For consumer markets: users are losing patience with cookie-cutter marketing copy, and 'authentic human' content is starting to command a premium.