A developer's weekend project delivered results: Qwen3.6-Solidity-27B surpassed Claude Opus on the soleval benchmark's pass@1 (single-generation accuracy). A vertically fine-tuned small model matching top-tier general-purpose models on specific tasks — this signal deserves our attention.

What this is

Solidity is the programming language for Ethereum smart contracts — a sizable but relatively niche ecosystem. The developer fine-tuned Qwen (Alibaba's open-source large language model) with Solidity-related data, producing a 27B (27 billion parameter) specialized model. On soleval, a Solidity programming evaluation suite, its single-pass accuracy exceeded Claude Opus — Anthropic's current strongest general-purpose model. The model is now open-sourced on HuggingFace.

Industry view

We're hearing two perspectives. Optimists see this as the right path: scaling general-purpose model parameters delivers diminishing returns, while vertical scenarios using small base models + high-quality domain data for fine-tuning offers controllable costs and predictable results — the pragmatic choice for most enterprises. Skeptics counter equally clearly: benchmarks don't equal real development environments; soleval's task coverage is limited; a specialized model beating a general model on its home turf isn't surprising, and in cross-task engineering scenarios the gap remains obvious. More importantly, fine-tuning data quality and acquisition cost are the real barriers — not every domain can find sufficient high-quality code data like Solidity can.

Impact on regular people

For enterprise IT: Smart contract auditing and development now have a lighter-weight model option, with deployment costs far below calling top-tier APIs — suitable for finance scenarios sensitive to data privacy. For individual careers: Web3 developers gain another open-source, locally-runnable code assistant option, reducing dependence on commercial APIs; but this also means the barrier to "using AI tools" is lowering, and competitive advantage must shift from tool usage to business understanding. For the consumer market: Short-term impact is limited — this model targets developers, not end users; but the "small model + vertical data" path it represents could replicate to legal, medical, and other fields closer to the general public in the future.