
Embeddings in Recommendation Systems

Mapping users, items, and context into one vector space

The most fundamental question in RecSys is "will this user like this item?" Embeddings turn this into a distance problem in vector space.

If a user vector and an item vector are close, the user likely prefers that item. This is the core idea behind Two-Tower models, and classic Matrix Factorization is the same idea restricted to ID embeddings.
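The "preference as distance" idea can be shown with a minimal numpy sketch. The vectors here are hand-picked toy values, not learned embeddings:

```python
import numpy as np

# Toy 4-dimensional embeddings (illustrative values, not learned).
user = np.array([0.9, 0.1, 0.0, 0.4])
item_a = np.array([0.8, 0.2, 0.1, 0.5])    # similar taste profile
item_b = np.array([-0.7, 0.9, 0.8, -0.2])  # dissimilar

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Closer vectors -> higher predicted preference.
print(cosine(user, item_a) > cosine(user, item_b))  # True
```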

Types of embeddings

ID embeddings: Learnable vectors for user IDs and item IDs. The most basic form, but weak against cold start: a brand-new user or item has no trained vector yet.
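An ID embedding is just a row of a learnable matrix, looked up by index. In this sketch, random initialization stands in for training, and the table sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 1000, 5000, 16

# In training these tables are learned parameters;
# random initialization stands in here.
user_table = rng.normal(scale=0.1, size=(n_users, dim))
item_table = rng.normal(scale=0.1, size=(n_items, dim))

u_vec = user_table[42]    # lookup = plain row indexing
i_vec = item_table[1337]
score = u_vec @ i_vec     # dot-product match score

# Cold start: a new user with ID >= n_users has no row in the
# table, so there is no vector to serve without a fallback.
```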

Feature embeddings: Vectorize attributes like category, tags, price range. The Deep part of Wide&Deep does this.

Sequence embeddings: Encode an entire behavior sequence into one vector. GRU4Rec and BERT4Rec fall in this category.
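The simplest sequence encoder is mean-pooling over the item vectors of a click history; this stands in for the learned encoders named above (GRU4Rec uses a GRU, BERT4Rec a Transformer). The item IDs below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, dim = 100, 8
item_table = rng.normal(size=(vocab, dim))  # random init stands in for training

# A user's clicked-item sequence (hypothetical IDs; repeats allowed).
clicks = [3, 17, 17, 42]

# Mean-pool the clicked items' vectors into one sequence vector.
# GRU4Rec replaces this pooling with a GRU; BERT4Rec with a Transformer.
seq_vec = item_table[clicks].mean(axis=0)
```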

Context embeddings: Situational info like time, location, device. The same user wants different things on a commute vs. a weekend.

How It Works

1. Define features for User/Item/Context
2. Vectorize each feature via an Embedding Layer
3. Combine vectors (concat/attention) into a unified representation
4. Compute a matching score via dot product or cosine similarity

Pros

  • Unifies heterogeneous data (text, image, behavior) in one space
  • Millisecond serving via ANN index

Cons

  • Requires hyperparameter tuning (dimensions, learning rate)
  • Embedding drift over time β€” periodic retraining required

Use Cases

  • User/item towers in Two-Tower models
  • ANN (Approximate Nearest Neighbor) candidate retrieval