You might want to have a look at the Morse paper. It describes an algorithm which worked well, significantly better than KNN, at least on movies, using, by today's standards, a small data set. Unfortunately, when I inquired about open sourcing the code, I was told that BT had lost it. But the algorithm is described in detail in the paper.
Looks good, I'll keep an eye on this as development goes on. One thing I'm particularly curious about: the project page emphasizes optimization through SIMD instructions and multi-threading. Are there any benchmarks versus other recommendation engines to demonstrate what type of performance improvement one should expect?