Candidate Generation

Candidate generation is the retrieval stage in a recommendation or ranking system that quickly narrows a very large pool of possible items into a much smaller set worth scoring in detail. If a retailer has hundreds of thousands of products, or a social platform has millions of posts, the system usually cannot run its most expensive ranking logic on every item for every request. Candidate generation solves that problem by producing a first-pass shortlist.

How It Works

A candidate-generation layer may use collaborative filtering, nearest-neighbor lookup, embeddings, vector search, item-to-item similarity, popularity, or lightweight rules to retrieve plausible items. The shortlist then moves into later stages where more expensive ranking, filtering, diversification, and policy logic can be applied.
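The retrieve-then-rank flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the catalog, the embeddings, the function names (generate_candidates, rank), and the scoring weights are all hypothetical, and the brute-force similarity scan stands in for what would normally be an approximate nearest-neighbor index.

```python
import heapq
import math

# Hypothetical toy catalog: item id -> (embedding, popularity count).
CATALOG = {
    "tent": ([0.9, 0.1, 0.0], 120),
    "sleeping_bag": ([0.8, 0.2, 0.1], 95),
    "camp_stove": ([0.7, 0.3, 0.0], 60),
    "novel": ([0.0, 0.1, 0.9], 300),
    "headphones": ([0.1, 0.9, 0.2], 250),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def generate_candidates(user_vec, k_similar=2, k_popular=1):
    """Cheap first pass: union of nearest neighbors and popular items."""
    similar = heapq.nlargest(
        k_similar, CATALOG, key=lambda item: cosine(user_vec, CATALOG[item][0])
    )
    popular = heapq.nlargest(k_popular, CATALOG, key=lambda item: CATALOG[item][1])
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(similar + popular))


def rank(user_vec, candidates):
    """Stand-in for a costlier ranker; the 0.8 / 0.2 weights are arbitrary."""
    def score(item):
        embedding, popularity = CATALOG[item]
        return 0.8 * cosine(user_vec, embedding) + 0.2 * math.log1p(popularity) / 10

    return sorted(candidates, key=score, reverse=True)


user = [1.0, 0.1, 0.0]  # hypothetical user embedding
shortlist = generate_candidates(user)
ranked = rank(user, shortlist)
print(shortlist)
print(ranked)
```

Only the items in shortlist ever reach rank. In this toy setup camp_stove is never scored by the second stage, even though it is fairly close to the user vector, because the retrieval budget was spent on two similar items and one popular item.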

Why It Matters

Candidate generation matters because large catalogs usually contain far more items than a system can score individually in one step. In recommendation engines, it determines which products or pieces of content even get a chance to be ranked. In search systems, it determines what enters the result set before reranking. A weak retrieval stage can silently cap the quality of the entire system: if the best item never makes it into the shortlist, no later stage can recover it.
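One common way to see this ceiling is to measure the recall of the shortlist itself, before any ranking happens. The sketch below assumes a hypothetical set of items known to be relevant for one request and two hypothetical shortlists; the function name recall_at_k and all item ids are illustrative only.

```python
def recall_at_k(retrieved, relevant):
    """Fraction of the relevant items that appear in the retrieved shortlist."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved) & set(relevant))
    return hits / len(relevant)


# Hypothetical ground truth for one request.
relevant = {"a", "b", "c", "d"}

# Two hypothetical shortlists from different retrieval configurations.
shortlist_small = ["a", "x", "y"]
shortlist_large = ["a", "b", "x", "y", "c", "z"]

recall_small = recall_at_k(shortlist_small, relevant)
recall_large = recall_at_k(shortlist_large, relevant)

# Relevant items that no downstream ranker can surface, however good it is.
unreachable = relevant - set(shortlist_large)

print(recall_small, recall_large, unreachable)
```

Even a perfect ranker applied to shortlist_large cannot return item "d", so shortlist recall acts as an upper bound on what the later stages can deliver for that request.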

Where You See It

Candidate generation is common in e-commerce recommendation engines, social feeds, video recommendations, search systems, and retrieval pipelines that later hand items to a ranking or generation step. It is closely related to recommender systems, feed ranking, and vector search, but it is only one stage of those larger systems.