A model that learns and predicts individual preferences from observed behavior and choices.
A preference model is a computational system designed to quantify, represent, and predict what individuals are likely to prefer based on observed signals such as ratings, clicks, purchases, or explicit choices. Rather than relying on hand-crafted rules, modern preference models learn latent structure from large datasets, capturing the underlying factors that drive human decision-making. Techniques range from classical collaborative filtering and matrix factorization to deep learning architectures that embed users and items into shared representation spaces where proximity reflects affinity.
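To make the shared-embedding idea concrete, here is a minimal sketch in which affinity is scored as a dot product between user and item vectors. All dimensions and names are illustrative, and the randomly initialized factors stand in for factors that would normally be learned from interaction data:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 100, 500, 16

# Latent factors; in practice these are learned, random here for illustration.
user_factors = rng.normal(size=(n_users, dim))
item_factors = rng.normal(size=(n_items, dim))

def predict_affinity(user_id: int, item_id: int) -> float:
    """Predicted preference: proximity (dot product) in the shared space."""
    return float(user_factors[user_id] @ item_factors[item_id])

# Rank all items for one user by predicted affinity.
scores = user_factors[0] @ item_factors.T
top_items = np.argsort(-scores)[:5]
```

The same geometry underlies both matrix factorization and deep embedding models: recommendation reduces to a nearest-neighbor search in the learned space.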
At their core, preference models solve an inference problem: given partial observations of a user's behavior, estimate their preferences over a broader space of options. Matrix factorization methods decompose a sparse user-item interaction matrix into low-dimensional latent factors, while neural approaches can incorporate rich side information such as content features, context, and sequential behavior. More recent work frames preference learning within reinforcement learning from human feedback (RLHF), where a reward model is trained to predict which outputs a human would prefer — a formulation now central to aligning large language models with human values.
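As a sketch of the RLHF formulation, reward models are commonly trained with a pairwise Bradley–Terry-style objective: the loss is low when the model scores the human-preferred output above the rejected one. The function and tensor names below are illustrative rather than taken from any particular codebase:

```python
import torch
import torch.nn.functional as F

def preference_loss(chosen_scores: torch.Tensor,
                    rejected_scores: torch.Tensor) -> torch.Tensor:
    """Pairwise loss: -log sigmoid(r_chosen - r_rejected), batch-averaged."""
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()

# Hypothetical scalar rewards the model assigned to each output pair.
chosen = torch.tensor([2.1, 0.3, 1.7])
rejected = torch.tensor([1.4, 0.9, -0.2])
loss = preference_loss(chosen, rejected)  # small when chosen > rejected
```

Because the objective depends only on score differences, the reward model learns a relative ordering over outputs rather than calibrated absolute values.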
Preference models matter because they sit at the intersection of personalization and decision support across nearly every consumer-facing domain. Recommendation systems in streaming, e-commerce, and social media rely on them to surface relevant content at scale. In the context of AI alignment, preference models serve a more fundamental role: they encode what humans actually want, providing a training signal that guides model behavior beyond simple task accuracy. The Netflix Prize competition (2006–2009) was a landmark moment that accelerated research into scalable preference modeling, but the field has since expanded well beyond recommendations.
A key challenge in preference modeling is the gap between revealed preferences (what behavior implies people want) and true preferences (what people actually value). Biases in data collection, feedback loops, and the difficulty of eliciting honest preferences all complicate model training. Addressing these issues is an active research area, particularly as preference models become load-bearing components in systems that shape both user experience and AI behavior.