Meta ProgramBench: AI Still Can't Build Large Programs from Scratch
Meta ProgramBench tests whether AI can build programs from scratch. Top models failed, cooling the 'AI builds software' hype and exposing benchmark score inflation.
2h ago·2 min read