
> A study from METR found that when developers used AI tools, they estimated that they were working 20% faster, yet in reality they worked 19% slower. That is nearly a 40% difference between perceived and actual times!

It’s not. It’s either 33% slower than perceived or perception overestimates speed by 50%. I don’t know how to trust the author if stuff like this is wrong.




> I don’t know how to trust the author if stuff like this is wrong.

She's not wrong.

A good way to do this calculation is with the log-ratio, a centered measure of proportional difference. It's symmetric, and it's widely used in economics and statistics for exactly this reason:

ln(1.2/0.81) = ln(1.2) - ln(0.81) ≈ 0.393

That's nearly 40%, as the post says.
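
A quick sketch of that calculation in Python, assuming speed factors of 1.2 for "20% faster" and 0.81 for "19% slower":

    import math

    perceived_speed = 1.20  # self-reported: "20% faster"
    actual_speed = 0.81     # measured: "19% slower"

    # Log-ratio: a symmetric, centered measure of proportional difference
    log_ratio = math.log(perceived_speed / actual_speed)
    print(round(log_ratio, 3))  # 0.393 -> "nearly 40%"

    # Symmetry: swapping the two quantities only flips the sign
    print(round(math.log(actual_speed / perceived_speed), 3))  # -0.393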


So if the numbers were “99% slower than without AI, but they thought they would be 99% faster”, you’d call that “they were 529% slower”, even though it doesn’t make sense to be more than 100% slower? And you’d not only expect everyone to understand that, but you really think it’s more likely that a random person on the internet used a logarithmic scale than that they just did bad math?
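
For reference, that 529% does fall out of the same log-ratio convention, assuming speed factors of 1.99 for "99% faster" and 0.01 for "99% slower":

    import math
    print(math.log(1.99 / 0.01))  # ~5.29, i.e. "529%"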

Well, this random person we are referring to happens to have a PhD in math from Duke.

I find that satisfying.


I personally get caught up in this math as well. Is a charitable interpretation of the throwaway line that they were off by that many “percentage points”?

That would be correct, but also not very informative. A 50pp gap means something different depending on the base: 50% vs. 100%, 75% vs. 125%, and 100% vs. 150% are all 50pp apart but very different ratios.
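
A quick sketch of why the base matters:

    # A fixed 50-percentage-point gap corresponds to very different ratios
    for low, high in [(0.50, 1.00), (0.75, 1.25), (1.00, 1.50)]:
        print(round(high / low, 2))  # 2.0, 1.67, 1.5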

Can you elaborate? This seems like a simple mistake if they are incorrect; I'm not sure where the 33% or 50% come from here.

Their math is 120% - 80% = 40%, while the correct math is (80 - 120)/120 = -33% or (120 - 80)/80 = +50%.

It’s more obvious if you take more extreme numbers. Say they estimated it would take 99% less time with AI, but it took 99% more time: the difference is not 198%, but 19,800% (the actual time is 199x the estimate). Suddenly you’re off by two orders of magnitude.
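
The conventional relative-difference arithmetic, as a sketch (speed factors 1.2 and 0.8 for the main example; time factors 0.01 and 1.99 for the extreme one):

    # Main example: perceived speed 1.2x vs. actual speed 0.8x
    print((0.8 - 1.2) / 1.2)  # ~-0.33: 33% slower than perceived
    print((1.2 - 0.8) / 0.8)  # ~+0.50: perception overstates speed by 50%

    # Extreme example: estimated time 0.01x vs. actual time 1.99x
    print(1.99 / 0.01)           # ~199: actual time is 199x the estimate
    print((1.99 - 0.01) / 0.01)  # ~198: a 19,800% relative difference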


It's not a mistake. It's correct, and is an excellent way to present this information.

Isn't the study a year old by now? Things have evolved very quickly in the last few months.

Yes, and it was done with people using Cursor at the time, and it already had a few caveats back then about who was actually experienced with the tool, etc.

Still an interesting observation. It was also on brownfield open-source projects, which IMO explains a bit why people building new stuff have vastly different experiences.


The exact numbers would certainly be different today, but you would probably still see the same effect: an overestimation of productivity.

Yes. No agents, no deep research, no tools, and just Sonnet 3.5 and 3.7. I’d love to see the same study today with Opus 4.6 and Codex 5.3.

Probably 38% slower now...

Please don’t project. :)


