Hacker News

No, you are probably overestimating the cost by 1-2 orders of magnitude. GPT-3 probably cost under $5 million to train, this model is smaller, and there have been algorithmic improvements to training transformers since then.
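A rough sanity check on that figure, using the commonly cited public estimates (175B parameters, ~300B training tokens, the ~6 FLOPs per parameter per token rule of thumb) and an assumed cloud price; the GPU throughput and $/hour numbers here are illustrative assumptions, not actual OpenAI costs:

```python
# Back-of-envelope GPT-3 training cost; all inputs are rough public
# estimates or assumptions, not insider numbers.
params = 175e9          # GPT-3 parameter count
tokens = 300e9          # approximate training tokens
train_flops = 6 * params * tokens  # ~6 FLOPs per parameter per token

# Assumed cloud economics: ~30 TFLOP/s sustained per GPU at ~$1.50/hr.
flops_per_dollar = 30e12 * 3600 / 1.50

cost_usd = train_flops / flops_per_dollar
print(f"~{train_flops:.2e} FLOPs -> about ${cost_usd / 1e6:.1f}M")
# prints: ~3.15e+23 FLOPs -> about $4.4M
```

Under these assumptions a single training run lands in the low single-digit millions, consistent with the "under $5 million" claim; different hardware-utilization or pricing assumptions move it by a small factor, not an order of magnitude.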


In February 2018, OpenAI signed a two-year compute contract with Google that had a $63M minimum spend.

See last page of their most recent available audited financials. https://rct.doj.ca.gov/Verification/Web/Download.aspx?saveas...


So do they estimate how much computing power/time they will need, and then pick an upper-tier minimum $ commitment to get the maximum possible discount, or to get a certain resource-availability commitment from Google? That's an interesting accounting problem.


> No you are probably overestimating the cost by 1-2 orders of magnitude.

You are right! Wow. Thank you for correcting me.

> GPT-3 probably cost under $5 million,

Is that for one training run, or does it include all the fiddling to find the right hyperparameters? Or is there not much of that in this kind of training, or are the models just not that sensitive to it?


I think they probably did a lot of hyperparameter searching to train the smaller models and then extrapolated for the largest model, but I'm just guessing. OpenAI had a finite amount of money when they were training GPT-3; they likely do it differently now that inference costs are significant compared to training costs.
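The extrapolation step above can be sketched as fitting a power law (loss ≈ a·C^-b) to cheap small-model runs and projecting it to the full compute budget. The loss/compute numbers below are made up for illustration; only the fit-then-extrapolate mechanic is the point:

```python
import math

# Hypothetical (made-up) final losses for small models trained at
# increasing compute budgets C, in PF-days; illustrative only.
runs = [(0.1, 4.10), (1.0, 3.32), (10.0, 2.69), (100.0, 2.18)]

# Fit log(loss) = log(a) - b * log(C) by ordinary least squares.
xs = [math.log(c) for c, _ in runs]
ys = [math.log(loss) for _, loss in runs]
n = len(runs)
mx, my = sum(xs) / n, sum(ys) / n
b = -sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = math.exp(my + b * mx)

# Extrapolate to a much larger budget without ever running it.
big_c = 3640.0  # hypothetical full-scale compute budget, PF-days
predicted_loss = a * big_c ** (-b)
print(f"fit: loss ~= {a:.2f} * C^-{b:.3f}; predicted at {big_c:.0f} PF-days: {predicted_loss:.2f}")
```

The appeal for a budget-constrained lab is that the four cheap runs cost a tiny fraction of the big one, so you only pay for the expensive run once, with a predicted target loss in hand.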



