They state in their report that they filter evaluation data out of their training data; see p. 3, "Filtering":
"Further, we filter all evaluation sets from our pre-training data mixture, run targeted contamination analyses to check against evaluation set leakage, and reduce the risk of recitation by minimizing proliferation of sensitive
outputs."
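The report doesn't say how those contamination analyses actually work, but the usual approach in pre-training papers is n-gram overlap: index every n-gram in the evaluation sets, then drop (or flag) any training document that shares one. A minimal sketch of that idea; the n-gram size, helper names, and toy data below are my assumptions, not anything from the report:

```python
import re

NGRAM_SIZE = 13  # assumed; published decontamination setups typically use ~8-13 token windows

def ngrams(text: str, n: int = NGRAM_SIZE):
    """Yield word-level n-grams from lightly normalized text."""
    tokens = re.findall(r"\w+", text.lower())
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i : i + n])

def build_eval_index(eval_examples):
    """Collect every n-gram appearing in any evaluation example."""
    index = set()
    for example in eval_examples:
        index.update(ngrams(example))
    return index

def is_contaminated(document: str, eval_index: set) -> bool:
    """Flag a training document that shares any n-gram with the eval set."""
    return any(gram in eval_index for gram in ngrams(document))

# Toy usage: filter a (hypothetical) pre-training corpus against one eval question.
eval_index = build_eval_index(
    ["Which planet in the solar system is known as the red planet? Mars."]
)
corpus = [
    "An unrelated web page about gardening and composting in raised beds.",
    "Quiz dump: which planet in the solar system is known as the red planet? Mars.",
]
clean_corpus = [doc for doc in corpus if not is_contaminated(doc, eval_index)]
print(len(clean_corpus))  # 1 -- the verbatim quiz dump gets dropped
```

The catch, and part of why contamination keeps slipping through anyway, is that exact n-gram matching misses paraphrases, translations, and reformatted versions of the same question.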
Also, as always, take these benchmarks with a huge grain of salt. Even base model releases are frequently (seemingly) contaminated these days.