
That reaction has happened with every model release for the past few years. Maybe they aren’t the same people, but it’s always “old model was terrible, new model gets it right” then “new model was terrible, newer model gets it right,” ad infinitum.


A large proportion of my professional network were in the "AI for code generation might just be a fad" camp pre-Opus 4.5 (and the Codex/Gemini models that came out shortly after that), and now almost everyone seems to think that AI will have at least some place in professional development environments on an ongoing basis.

I've recently given it a go myself, and it certainly doesn't get it right all the time. But I was able to generate AI-assisted code that met my quality standards at roughly the same speed as coding it by hand.


FWIW I am definitely someone who uses AI. I have been using it for a few years now. There's no question that models have improved. I'd say the biggest leap was the ChatGPT 3.5 -> 4.0 transition, which radically reduced hallucination problems. The big issue of "it just made up a module that doesn't exist" more or less went away at that point. This was the big leap from "spits out text that might help you" to "can produce value".

Since then it has been incremental. I would say the big win has been that models degrade more slowly as context grows. This means, especially for heavily vibecoded-from-scratch projects, that you hit the "I don't even know wtf this is anymore" wall way later, maybe never if you're steering things properly.

I think because you can avoid hitting that wall for longer, people see this as radically different. It's debatable whether that's true or not. But in terms of just what the model does, like how it responds to prompts, I genuinely think it is only marginally better. And again, I think benchmarks confirm this, and I quite like Fodor's analysis on benchmarking here[0].

I use these models daily and I try new models out. I think that when people switch to a new model, they over-interpret "model did something different" or "it got it right" as "this is radically better", which I believe is simply a result of cognitive bias / poor measurement.

[0] https://jamesfodor.com/2025/06/22/line-goes-up-large-languag...



