A mathematical function that quantifies an agent's preferences to guide optimal decision-making.
A utility function is a mathematical mapping from world states or outcomes to numerical values, representing how desirable each outcome is to an agent. In AI and decision theory, utility functions serve as the formal language through which preferences are expressed — an agent is said to prefer outcome A over outcome B if and only if A yields higher utility. This framework allows AI systems to move beyond simple rule-following and instead reason about trade-offs, uncertainty, and competing goals in a principled way.
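The mapping from outcomes to numbers, and the induced preference ordering, can be sketched in a few lines. This is a minimal illustration with made-up outcomes and values, not any real system's utility function:

```python
# A toy utility function: a mapping from outcomes to numerical values.
# The outcomes and numbers here are purely illustrative.
utility = {
    "sunny": 10.0,
    "cloudy": 4.0,
    "rainy": -2.0,
}

def prefers(a, b):
    """The agent prefers outcome a over outcome b
    if and only if a yields strictly higher utility."""
    return utility[a] > utility[b]

print(prefers("sunny", "rainy"))   # the agent prefers sun to rain
print(prefers("rainy", "cloudy"))  # but not rain to clouds
```

The key point is that the numbers themselves matter only through the ordering (and, under uncertainty, the relative magnitudes) they induce over outcomes.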
In practice, utility functions are central to expected utility maximization, the dominant framework for rational decision-making under uncertainty. When an agent faces a choice, it computes the expected utility of each available action by weighting the utility of each possible outcome by its probability of occurring. The agent then selects the action with the highest expected utility. This approach underpins a wide range of AI techniques, from classical planning and game-playing agents to modern reinforcement learning, where the reward signal can be understood as a proxy for utility.
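The decision rule described above — compute EU(a) = Σ p(outcome) × U(outcome) for each action, then pick the argmax — can be sketched as follows. The actions and their outcome distributions are hypothetical, chosen only to make the computation concrete:

```python
# Expected utility maximization over a set of hypothetical actions.
# Each action maps to a list of (probability, utility) pairs
# describing its possible outcomes.
actions = {
    "safe_bet": [(1.0, 50.0)],                 # a guaranteed payoff of 50
    "gamble":   [(0.5, 120.0), (0.5, -40.0)],  # risky: EU = 0.5*120 + 0.5*(-40) = 40
}

def expected_utility(outcomes):
    """Weight each outcome's utility by its probability and sum."""
    return sum(p * u for p, u in outcomes)

def choose(actions):
    """Select the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(actions[a]))

print(choose(actions))  # "safe_bet": its EU of 50 beats the gamble's EU of 40
```

Note that the gamble's higher best-case payoff is irrelevant on its own; only the probability-weighted average enters the decision.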
Designing an appropriate utility function is often one of the hardest problems in building intelligent systems. A poorly specified utility function can lead to agents that technically optimize their objective while violating human intentions — a problem known as reward hacking or misalignment. For example, an agent tasked with maximizing a narrow metric might find unintended shortcuts that score well numerically but fail to capture what designers actually wanted. This challenge has made utility function design a central concern in AI safety research.
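The divergence between a proxy objective and the designers' true intent can be made concrete with a small sketch. The policies, metrics, and numbers below are entirely hypothetical, meant only to show how optimizing a narrow metric can select the wrong behavior:

```python
# A misspecified objective: the agent maximizes a proxy metric
# (say, clicks) that only imperfectly tracks the true goal
# (say, user satisfaction). All values are illustrative.
policies = {
    #                  (proxy metric, true value)
    "helpful_content": (80.0, 90.0),
    "clickbait":       (100.0, 10.0),
}

def optimize_proxy(policies):
    """A naive optimizer that maximizes only the proxy metric."""
    return max(policies, key=lambda p: policies[p][0])

best = optimize_proxy(policies)
print(best)  # "clickbait": highest proxy score, lowest true value
```

The agent is optimizing exactly the objective it was given; the failure lies in the gap between that objective and what the designers actually wanted.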
Utility functions also connect AI to broader fields including economics, game theory, and philosophy of mind. In multi-agent settings, individual utility functions interact to produce equilibria studied in game theory. In AI alignment research, the question of how to specify human values as a utility function — or whether such a specification is even possible — remains an open and consequential problem. Stuart Russell and Peter Norvig's influential textbook Artificial Intelligence: A Modern Approach helped cement utility functions as a foundational concept in mainstream AI education and practice.