Hacker News | new | past | comments | ask | show | jobs | submit | login

Interesting take. Have you benchmarked models on your own data? Because at this point everything is contaminated, so I find it impossible to tell what the proper SOTA is. Also, most folks still just use OpenAI. Last time I checked, reranking always performs better than pure vector search. And to my knowledge it's still the superior fusion method for combining keyword and vector results.
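To make the "reranking as fusion" point concrete, here's a minimal sketch: take the union of keyword and vector candidates, then let a single cross-encoder-style scorer put them all on one comparable scale. The function names (`rerank_fuse`, `overlap_score`) are made up for illustration; in practice the scorer would be a real cross-encoder model, not this toy token-overlap stand-in.

```python
def rerank_fuse(query, keyword_hits, vector_hits, score_fn, top_k=5):
    # Deduplicate the union of both candidate lists, preserving order,
    # then sort everything by one shared score_fn(query, doc) scale.
    candidates = list(dict.fromkeys(keyword_hits + vector_hits))
    return sorted(candidates, key=lambda doc: score_fn(query, doc), reverse=True)[:top_k]

def overlap_score(query, doc):
    # Toy stand-in for a cross-encoder: fraction of query tokens in the doc.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q) or 1)

query = "how do rerankers work"
keyword_hits = ["rerankers score query document pairs",
                "vector search uses embeddings"]
vector_hits = ["embeddings capture meaning",
               "rerankers score query document pairs"]
fused = rerank_fuse(query, keyword_hits, vector_hits, overlap_score)
```

Because every candidate gets scored against the query by the same model, there's no need to reconcile BM25 scores with cosine similarities at all.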


In my experience, storing RAG chunks with a little bit of surrounding context helps retrieval a lot. Then you can skip the whole "rerank" step and halve your cost and latency.
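A minimal sketch of what I mean by "chunks with a little context": prepend lightweight document metadata to each chunk before embedding, so the chunk is self-describing at retrieval time. The helper name `contextualize` and the `"title > section: chunk"` format are my own choices, not any standard API.

```python
def contextualize(chunks, doc_title, section):
    # Prefix each chunk with where it came from, so the embedded text
    # carries the context the raw chunk alone would lose.
    return [f"{doc_title} > {section}: {chunk}" for chunk in chunks]

chunks = ["Refunds are issued within 14 days.",
          "Store credit is available as an alternative."]
embed_texts = contextualize(chunks, "Returns Policy", "Refunds")
# These prefixed strings are what you'd feed to the embedding model.
```

The point is that a query like "returns policy refund window" now matches the first chunk even though the raw chunk text never mentions "returns policy".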

As embedding and generative models get better over time, the need for a rerank step will be optimized away.


Huh? Reranking is always a boost on top of retrieval. So regardless of the chunking method or model you use, reranking with a good model will always result in higher MRR. And improvements in embedding models will never solve the problem of merging lexical and vector search results either. Rank- and score-based fusion are flawed since lexical and vector scores are hardly comparable, and boosting only works sometimes, whereas rerankers generally do a pretty good job at this. Performance is indeed the biggest issue here: rerankers are slow as hell and simply not feasible for some use cases.
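For reference, the standard rank-based fusion being criticized here is reciprocal rank fusion (RRF): each result list contributes 1/(k + rank) per document, sidestepping incomparable raw scores by using only positions. A minimal implementation (k=60 is the commonly used constant from the original RRF paper):

```python
def rrf(rankings, k=60):
    # rankings: list of ranked doc-id lists (e.g. one from BM25, one from
    # vector search). Each list contributes 1/(k + rank) per document.
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf([["a", "b", "c"],   # lexical ranking
             ["b", "c", "a"]])  # vector ranking
# → ["b", "a", "c"]: "b" wins by placing 2nd and 1st.
```

This is exactly the weakness in question: RRF only sees positions, so a document that is marginally ranked in both lists can beat one that is a decisive top hit in one list, which is the gap a cross-encoder reranker closes.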



