
This is the idea behind non-negative matrix formulation (NMF). As the name implies, it forces the entries of the embedding matrices (for both the reduced document and term matrix) to be nonnegative, which results in a more interpretable “sum of parts” representation. You can really see the difference (compared to LSA/SVD/PCA, which do not have this constraint) when it’s applied to images of faces. Also, NMF has been shown to be equivalent to word2vec. The classic paper is here: http://www.cs.columbia.edu/~blei/fogm/2019F/readings/LeeSeun...
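For anyone curious what the factorization looks like in practice, here is a minimal sketch of the multiplicative-update algorithm from that Lee & Seung paper, in plain numpy (toy random data, hyperparameters are just illustrative):

```python
import numpy as np

def nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Lee–Seung multiplicative updates for X ≈ W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        # Each update multiplies by a ratio of nonnegative quantities,
        # so W and H can never go negative -- that is the whole trick.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy nonnegative "document-term" matrix
X = np.random.default_rng(1).random((30, 12))
W, H = nmf(X, k=4)
err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
assert (W >= 0).all() and (H >= 0).all()  # the "sum of parts" property
```

Because both factors are nonnegative, each row of X is reconstructed by only adding components, never canceling them, which is where the interpretability comes from.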

PS—There should be a negative sign on the (2,2) entry of the first matrix.



> non-negative matrix formulation (NMF)

*factorization ;)

Also, PCA follows a similar idea (I mean, rotating vectors), but it's usually done in a much lower-dimensional space
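The "rotating vectors" view is easy to check numerically: the principal directions from an SVD form an orthonormal basis, i.e. a rotation of the original axes (toy random data below, just to illustrate):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((50, 5))
Xc = X - X.mean(axis=0)                      # center before PCA
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
# Rows of Vt are the principal directions; Vt @ Vt.T == I means they
# form an orthonormal basis -- PCA is a rotation of the feature axes.
assert np.allclose(Vt @ Vt.T, np.eye(5), atol=1e-8)
# Unlike NMF, the loadings mix signs, so components can cancel as
# well as add -- that's why NMF reads as "parts" and PCA doesn't.
```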


Ugh, that one was auto-correct, I swear. I have no idea what’s going on at Apple’s NLP department.



