Matrix Factorization
The latent factor model that dominated the Netflix Prize
The most powerful approach during the 2006-2009 Netflix Prize. Variants include SVD, ALS, and BPR, but the core idea is the same.
Decompose the user-item matrix R (m×n) into a user matrix P (m×k) and an item matrix Q (n×k). k is the number of latent factors. With k=50, each user and item gets a 50-dimensional vector.
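As a quick shape sketch in numpy (the sizes m=3, n=4, k=2 here are made up for illustration):

```python
import numpy as np

# Hypothetical sizes: m users, n items, k latent factors
m, n, k = 3, 4, 2

P = np.random.rand(m, k)  # user factor matrix (m x k)
Q = np.random.rand(n, k)  # item factor matrix (n x k)

R_hat = P @ Q.T           # reconstructed rating matrix (m x n)
print(R_hat.shape)        # (3, 4)
```

Each row of P is one user's k-dimensional vector, each row of Q one item's; their product approximates the full rating matrix.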
Why it works
It automatically discovers hidden factors like "degree of action preference" or "romance affinity." No explicit genre labels needed; it finds the patterns from data.
This is essentially embedding learning. You could call it embeddings before Word2Vec existed.
How It Works
Build user-item matrix R (sparse)
Initialize user matrix P and item matrix Q randomly
Optimize via SGD or ALS so that R ≈ P × Qᵀ
Predict empty cells via the dot product P_u · Q_i
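The four steps above can be sketched in plain numpy with SGD (a minimal illustration, not a production recommender; the toy ratings and hyperparameters are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: sparse user-item matrix R (0 = unrated), made-up toy data
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)
m, n = R.shape
k = 2                       # number of latent factors

# Step 2: random initialization of P (m x k) and Q (n x k)
P = 0.1 * rng.standard_normal((m, k))
Q = 0.1 * rng.standard_normal((n, k))

# Step 3: SGD on observed entries only, with L2 regularization
lr, reg = 0.01, 0.01
observed = [(u, i) for u in range(m) for i in range(n) if R[u, i] > 0]
for epoch in range(500):
    for u, i in observed:
        err = R[u, i] - P[u] @ Q[i]          # prediction error on one rating
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * P[u] - reg * Q[i])

# Step 4: predict every cell, including the previously empty ones
R_hat = P @ Q.T
print(np.round(R_hat, 1))
```

Note that the loop only touches observed ratings; the empty cells are never trained on directly, yet the learned factors still produce predictions for them.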
Pros
- ✓ Learns latent factors well even from sparse data
- ✓ Relatively easy to interpret results
Cons
- ✗ Hard to incorporate side information about users/items
- ✗ May require full retraining for new data