Hi! This is such an exciting release. Congratulations!
I work on Ollama and used the provided GGUF files to quantize the model. As mentioned by a few people here, the 4-bit integer quantized models (which Ollama defaults to) seem to have strange output with non-existent words and funny use of whitespace.
Do you have a link or reference describing how the models were converted to GGUF format? And is it expected that quantizing the models might cause this issue?
Thanks so much!