They state in their report that they filter evaluation data out of their training data; see p. 3, "Filtering":

"Further, we filter all evaluation sets from our pre-training data mixture, run targeted contamination analyses to check against evaluation set leakage, and reduce the risk of recitation by minimizing proliferation of sensitive outputs."


