Encode 20.5 tok/s and decode 6.6 tok/s on my 14 Pro (non-Max)
It did pretty well on an okonomiyaki recipe.
Edit: the coding capabilities are pretty hilarious. I asked about partials in Turbo Streams and it answered correctly, but when I then asked for a code sample it gave me a PHP+MySQL query? Who knows what happened there.
13 Pro Max here: no crashes, but it definitely went OOM and slowed to a crawl. My phone also got very warm.
I wasn't able to get much useful output for the things I normally use ChatGPT to help with. It insisted on giving general steps instead of code for the basic tasks I threw at it (certificate generation using openssl or Python).
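For reference, this is the sort of concrete answer I was hoping for instead of general steps — a minimal sketch in Python that shells out to the openssl CLI to produce a self-signed cert (filenames and the CN are just placeholders; it assumes openssl is on your PATH):

```python
import subprocess

# Generate a 2048-bit RSA key and a self-signed certificate in one shot.
# -nodes leaves the private key unencrypted; -subj skips the interactive prompts.
subprocess.run(
    [
        "openssl", "req", "-x509",
        "-newkey", "rsa:2048",
        "-keyout", "key.pem",      # private key output (placeholder name)
        "-out", "cert.pem",        # certificate output (placeholder name)
        "-days", "365",            # validity period
        "-nodes",
        "-subj", "/CN=localhost",  # placeholder subject
    ],
    check=True,
)
```

The equivalent one-liner is `openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem -days 365 -nodes -subj "/CN=localhost"` — either would have been a more useful reply than a bulleted list of steps.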
Fantastic effort. I'm on holiday, but I'm going to burn through a significant chunk of my mobile data allowance to test this... I seriously thought it would take at least until summer for someone to get LLMs running on mobile devices.
Edit: too bad, it crashes every time without generating any output.