While Large Language Models (LLMs) gain more and more traction, the workhorse for recommendation systems is still the Large Embedding Model!

The term actually refers to one (or more) layers within a Deep Learning Recommendation Model whose primary responsibility is to map high-cardinality, sparse categorical input features (e.g., Ad IDs, domains, search query terms, feature hashes, LLM-generated artifacts) into low-dimensional, dense numerical vectors (embeddings). This is typically implemented using massive embedding lookup tables, where each unique category value corresponds to a unique embedding vector. The term “Large” highlights the memory footprint and parameter count of these embedding tables, which often dominate the overall model size due to the millions or billions of unique IDs/categories involved. These embedding vectors are trained jointly with the rest of the network to optimize the final recommendation task.
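To make that concrete, here is a minimal sketch in PyTorch (the bucket count, dimension, and feature name are illustrative, not from any particular production system): a hashed lookup table that maps raw high-cardinality IDs into dense trainable vectors.

```python
import torch
import torch.nn as nn

# Illustrative sizes; production tables often hold millions or billions of rows.
NUM_HASH_BUCKETS = 1_000_000
EMBEDDING_DIM = 64

class HashedEmbedding(nn.Module):
    """Maps high-cardinality sparse IDs to dense vectors via a hashed lookup table."""
    def __init__(self, num_buckets: int, dim: int):
        super().__init__()
        self.num_buckets = num_buckets
        # The "large" part: a num_buckets x dim parameter matrix.
        self.table = nn.Embedding(num_buckets, dim)

    def forward(self, raw_ids: torch.Tensor) -> torch.Tensor:
        # Hash arbitrary integer IDs into a fixed bucket range to bound table size.
        bucket_ids = raw_ids % self.num_buckets
        return self.table(bucket_ids)

ad_id_emb = HashedEmbedding(NUM_HASH_BUCKETS, EMBEDDING_DIM)
batch_of_ad_ids = torch.tensor([123456789, 987654321, 42])
dense_vectors = ad_id_emb(batch_of_ad_ids)  # shape (3, 64); trained jointly with the network
```

The modulo hash is one common way to keep an unbounded ID space inside a fixed memory budget, at the cost of occasional collisions.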

Figure: a quick illustration of the two-tower architecture.
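In lieu of the image, here is a minimal two-tower sketch (PyTorch; vocabulary sizes and layer widths are assumptions for illustration): each tower embeds its own IDs, compresses them with a small MLP, and a dot product between the two output vectors scores the user-item match.

```python
import torch
import torch.nn as nn

class Tower(nn.Module):
    """One side of a two-tower model: embed sparse IDs, then compress with an MLP."""
    def __init__(self, vocab_size: int, emb_dim: int = 64, out_dim: int = 32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.mlp = nn.Sequential(
            nn.Linear(emb_dim, out_dim),
            nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        return self.mlp(self.emb(ids))

user_tower = Tower(vocab_size=1_000_000)    # illustrative vocab sizes
item_tower = Tower(vocab_size=10_000_000)

user_vec = user_tower(torch.tensor([17]))
item_vec = item_tower(torch.tensor([42]))
score = (user_vec * item_vec).sum(dim=-1)   # dot-product relevance score
```

Keeping the towers independent is what makes serving cheap: item vectors can be precomputed and indexed for approximate nearest-neighbor retrieval.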

However, the future likely lies in convergence. Rather than replacing ID-based embeddings entirely, LLMs can be used to generate side-information embeddings (e.g., capturing inferred user intent), which are then injected as dense features alongside the ID-based lookups. IMHO, that could be a great help for issues like cold-start; a sketch of the idea follows.
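A minimal sketch of that injection (shapes and names are hypothetical; `llm_intent_vec` stands in for a precomputed LLM embedding of, say, a user's recent queries): the LLM-derived vector is simply concatenated with the ID-based lookup before the tower's projection layer.

```python
import torch
import torch.nn as nn

class HybridUserTower(nn.Module):
    """ID-based embedding lookup fused with a precomputed LLM side-information vector."""
    def __init__(self, vocab_size: int = 1_000_000, id_dim: int = 64,
                 llm_dim: int = 768, out_dim: int = 32):
        super().__init__()
        self.id_emb = nn.Embedding(vocab_size, id_dim)
        self.proj = nn.Linear(id_dim + llm_dim, out_dim)  # fuse both feature sources

    def forward(self, user_ids: torch.Tensor, llm_intent_vec: torch.Tensor) -> torch.Tensor:
        # For a brand-new user the ID embedding is still untrained noise, but the
        # LLM intent vector already carries signal, which is the cold-start upside.
        fused = torch.cat([self.id_emb(user_ids), llm_intent_vec], dim=-1)
        return self.proj(fused)

tower = HybridUserTower()
user_vec = tower(torch.tensor([123]), torch.randn(1, 768))  # randn stands in for an LLM embedding
```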
