
Only 8K context as well, like Mistral.

Also, as always, take these benchmarks with a huge grain of salt. Even base model releases are frequently (seemingly) contaminated these days.



Mistral Instruct v0.2 is 32K.


Mixtral (8x7B) is 32K.

Mistral 7B Instruct v0.2 is just a fine-tune of Mistral 7B.
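
For what it's worth, an easy way to check these numbers yourself, assuming the model is on Hugging Face, is to read max_position_embeddings from its config:

    # Sketch: read a model's declared context length from its Hugging Face
    # config. The repo id below is the real Mistral Instruct v0.2 repo.
    from transformers import AutoConfig

    cfg = AutoConfig.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
    print(cfg.max_position_embeddings)  # should print 32768 (32K)

(Note that this only reflects what the config declares; effective quality at long context can be another matter.)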


The original Mistral, or the GGUF one?


Agreed: it will be interesting to see how Gemma does on Chatbot Arena.


They state in their report that they filter evaluation data out of their training data; see p. 3, "Filtering":

"Further, we filter all evaluation sets from our pre-training data mixture, run targeted contamination analyses to check against evaluation set leakage, and reduce the risk of recitation by minimizing proliferation of sensitive outputs."



