Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Hi! This is such an exciting release. Congratulations!

I work on Ollama and used the provided GGUF files to quantize the model. As mentioned by a few people here, the 4-bit integer quantized models (which Ollama defaults to) seem to have strange output with non-existent words and funny use of whitespace.

Do you have a link /reference as to how the models were converted to GGUF format? And is it expected that quantizing the models might cause this issue?

Thanks so much!



As a data point, using the Huggingface Transformers 4-bit quantization yields reasonable results: https://twitter.com/espadrine/status/1760355758309298421




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: