
I read the entire thing fwiw (pseudo-retired life helps with time here).

It looks like it was a collaborative effort across multiple teams, where each team (research, security, psychology, etc.) submitted ~10 pages or so. It doesn't feel like slop.



Did anything stand out across those 244 pages? Perhaps you have some of your takeaways written up somewhere?


Sorry for the very late reply to this, but yeah. I posted here: https://x.com/pwnies/status/2041658034087457236

I'll copy the highlights here, but the tweets have imagery as well:

> The obvious hype - It crushes benchmarks across the board, and it does so with fewer tokens per task.

> Despite this, they don’t think it can self-improve on its own. There are still areas your average engineer does better with, and despite it accelerating tasks by 4x, that only translates to <2x increase in overall progress.

> They’re probably right to hold this back - its ability to exploit things is unprecedented. Any site running on an old stack right now or any traditional industry with outdated software should be terrified if this becomes accessible.

> Counterintuitively, while it’s the most dangerous model, it’s also the safest. They’ve also seen significant additional improvements in safety between their early versions of Mythos and the preview version.

> Anthropic does a really good job of documenting some of the rare dangerous behaviors the early models had.

> Interestingly, Mythos itself leaked a recent internal “code related artifact” on github.

> Mythos is also RUTHLESS in Vending Bench. Agent-as-a-CEO might be viable?

> The last thing: Mythos has emergent humor. One of the first models I’ve seen that’s witty. The examples are puns it came up with and witty slack responses it had when operating as a bot.


AI writing has stopped feeling like slop around Opus 4.5, though.



