Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> But on a tangent, why do you believe in mixture of experts

In a hardware inference approach you can do tens of thousands tokens per second and run your agents in a breadth first style. It is all very simply conceptually, and not more than a few years away.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: