Xiaomi's MiMo-V2.5-Pro averaged $0.99 per game in a complex social reasoning test, a little over one-third of Kimi K2.6's $2.65. LLM competition is shifting from "who is smarter" to "who is cheaper and good enough."

What this is

Recently, developers tested major LLMs in "Blood on the Clocktower," a highly complex social deduction game similar to "Werewolf." Xiaomi's MiMo-V2.5-Pro performed strongly, ranking in the top tier alongside Kimi K2.6, but it got there in a very different way. Kimi reasons meticulously but consumes an average of 580,000 tokens (the smallest unit of text a model processes) per game, takes 10-15 hours, and costs $2.65. MiMo consumes only about 180,000 tokens per game and finishes in 2-3 hours, which brings the cost down to $0.99, with a tool-calling error rate of only 0.4% (a tool call is a model's request to run an external function). We note that MiMo is emerging as the most practical choice among high-end models by being "smart enough and highly cost-efficient."
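The cost comparison above can be checked with a few lines of arithmetic. The sketch below uses only the per-game averages reported in the test; the dictionary and function names are illustrative, and the "dollars per million tokens" figure is simply what those averages imply, not a published price for either model.

```python
# Per-game averages as reported in the test (token counts are rounded).
KIMI = {"tokens": 580_000, "cost_usd": 2.65}
MIMO = {"tokens": 180_000, "cost_usd": 0.99}


def implied_usd_per_million_tokens(model):
    """Blended cost per million tokens implied by the per-game averages."""
    return model["cost_usd"] / model["tokens"] * 1_000_000


# MiMo's per-game cost and token use as a fraction of Kimi's.
cost_ratio = MIMO["cost_usd"] / KIMI["cost_usd"]
token_ratio = MIMO["tokens"] / KIMI["tokens"]

print(f"cost ratio (MiMo / Kimi):  {cost_ratio:.1%}")
print(f"token ratio (MiMo / Kimi): {token_ratio:.1%}")
print(f"Kimi implied $/1M tokens:  {implied_usd_per_million_tokens(KIMI):.2f}")
print(f"MiMo implied $/1M tokens:  {implied_usd_per_million_tokens(MIMO):.2f}")
```

Taking the rounded figures at face value, MiMo's cost is about 37% of Kimi's and its token use about 31%. The saving therefore comes from using fewer tokens per game rather than from a lower implied per-token rate (about $5.50 versus $4.57 per million tokens).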

Industry view

We believe this test exposes a real pain point in LLM deployment: reasoning redundancy. Kimi's approach of spending heavily on compute in pursuit of maximum accuracy is not cost-effective in commercial scenarios; verbose outputs and long response times will directly deter enterprise users. MiMo preserves core reasoning capability while compressing cost and response time into a usable range, and its 0.4% tool-calling error rate points to stable tool use. However, MiMo's win rate is severely lopsided: 88% when playing the good faction but only 48% when playing the evil faction. This exposes a weakness in strategic flexibility: when deception or non-logical maneuvering is required, it comes across as too "honest" and clumsy. Game test metrics also cannot be fully equated with reliability in enterprise production environments.

Impact on regular people

For enterprise IT: As the reasoning cost of a single complex task drops below $1, deploying Agents (AIs capable of independently executing multi-step tasks) at scale in long, multi-step business workflows becomes financially viable.
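To make the budget implication concrete, here is a rough projection. It assumes, purely for illustration, that one complex agent task costs about as much as one game in the test. The volume of 1,000 tasks per day, the 30-day month, and the function name are all hypothetical; real workloads will differ.

```python
def monthly_spend(cost_per_task_usd, tasks_per_day, days=30):
    """Projected spend in USD for a steady daily task volume."""
    return cost_per_task_usd * tasks_per_day * days


# Hypothetical workload, not a figure from the test.
TASKS_PER_DAY = 1_000

# Per-task costs borrowed from the reported per-game averages.
mimo_monthly = monthly_spend(0.99, TASKS_PER_DAY)
kimi_monthly = monthly_spend(2.65, TASKS_PER_DAY)

print(f"MiMo-priced tasks: ${mimo_monthly:,.0f} per month")
print(f"Kimi-priced tasks: ${kimi_monthly:,.0f} per month")
print(f"Difference:        ${kimi_monthly - mimo_monthly:,.0f} per month")
```

Under these assumptions the gap is roughly $29,700 versus $79,500 per month, which is why a per-task difference of under two dollars matters once agents run at volume.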

For the workplace: LLMs can now handle complex multi-person social reasoning and maneuvering. Negotiation and coordination roles that rely on information asymmetry and complex communication will face new pressure from automation.

For the consumer market: Hardware manufacturers like Xiaomi are using low-cost models to expand into the edge ecosystem. In the future, consumers may be able to access cheap, ultra-fast reasoning services directly on their phones and in their vehicles.