The desired outcome or objective that directs an AI system's behavior.
In artificial intelligence, a goal is the target state or outcome that an AI system is designed to pursue. Goals serve as the organizing principle behind an agent's decision-making, determining which actions are worth taking and which states of the world are considered desirable. They can range from narrow and well-defined objectives — such as minimizing prediction error on a dataset — to broad and open-ended aims like winning a game, navigating an environment, or satisfying a user's preferences. The specificity and structure of a goal profoundly shape how an AI system is built and how it behaves.
Different AI paradigms encode goals in different ways. In classical planning and search, goals are explicit logical conditions that the system works to satisfy by selecting sequences of actions. In reinforcement learning, goals are expressed implicitly through a reward function: the agent learns a policy that maximizes cumulative reward over time, with the reward signal encoding what the designer considers success. In supervised learning, the goal is typically framed as minimizing a loss function that measures discrepancy between predictions and ground truth. Each formulation carries its own assumptions about what the system should optimize and how progress is measured.
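To make the contrast concrete, here is a minimal Python sketch (toy functions invented for illustration, not drawn from any particular framework) encoding the same destination three ways: as an explicit goal test for search, as a reward signal for reinforcement learning, and as a loss for supervised learning.

```python
# --- Classical planning/search: the goal is an explicit condition on states.
def goal_test(state: tuple[int, int], target: tuple[int, int]) -> bool:
    """A search algorithm stops when this predicate holds."""
    return state == target

# --- Reinforcement learning: the goal is implicit in a reward function.
def reward(state: tuple[int, int], target: tuple[int, int]) -> float:
    """The agent never sees the target directly; it only sees this signal."""
    return 1.0 if state == target else 0.0

# --- Supervised learning: the goal is minimizing a loss over labeled data.
def squared_error(prediction: float, ground_truth: float) -> float:
    """Training drives this discrepancy toward zero."""
    return (prediction - ground_truth) ** 2

if __name__ == "__main__":
    target = (3, 4)
    print(goal_test((3, 4), target))  # True: goal condition satisfied
    print(reward((0, 0), target))     # 0.0: no reward signal at this state
    print(squared_error(2.5, 3.0))    # 0.25: distance from ground truth
```

The point of the sketch is that the goal test tells the system exactly when it has succeeded, whereas the reward and loss only grade behavior, leaving the system to infer from those grades what success looks like.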
The alignment between a specified goal and the intended outcome is one of the central challenges in AI development. A system that optimizes a proxy goal too literally may achieve high scores on the metric while violating the spirit of the objective — a phenomenon known as reward hacking or specification gaming. This gap between what designers write down and what they actually want has motivated significant research into reward modeling, inverse reinforcement learning, and AI alignment more broadly. As AI systems become more capable, ensuring that their goals faithfully represent human values and intentions becomes increasingly critical.
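A deliberately simplified illustration of this gap, with made-up numbers: suppose the intended objective is user satisfaction but the written-down proxy is click count. Optimizing the proxy selects a different policy than optimizing the intent.

```python
# Hypothetical policies with assumed scores on the proxy metric (clicks)
# and on the objective the designer actually cares about (satisfaction).
policies = {
    "helpful_answers":  {"clicks": 3.0, "satisfaction": 9.0},
    "clickbait_titles": {"clicks": 9.0, "satisfaction": 2.0},
    "balanced_content": {"clicks": 5.0, "satisfaction": 7.0},
}

# Optimizing the specified proxy...
best_by_proxy = max(policies, key=lambda p: policies[p]["clicks"])
# ...versus optimizing the intended outcome.
best_by_intent = max(policies, key=lambda p: policies[p]["satisfaction"])

print(best_by_proxy)   # clickbait_titles: maximizes the metric
print(best_by_intent)  # helpful_answers: maximizes the intended objective
```

The two maximizers disagree precisely because the proxy is only correlated with the intent, and an optimizer will exploit wherever that correlation breaks.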
Goal representation also intersects with questions of generalization and robustness. A goal that works well in a training environment may lead to unexpected or harmful behavior when the system encounters novel situations. Understanding how goals interact with an agent's capabilities, environment, and learning dynamics remains a foundational concern across virtually every subfield of AI research.
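As a hypothetical sketch of such a failure: an agent trained only in environments where the target sits at the right edge can internalize the heuristic "move right" rather than "seek the target", and the difference is invisible until the environment shifts.

```python
def learned_policy(position: int, width: int) -> int:
    """Heuristic the agent internalized from training: always move right."""
    return min(position + 1, width - 1)

def run_episode(target: int, width: int = 10, steps: int = 12) -> bool:
    """Roll out the learned policy; success is judged at episode end."""
    position = 0
    for _ in range(steps):
        position = learned_policy(position, width)
    return position == target

print(run_episode(target=9))  # True: training-like case, target at right edge
print(run_episode(target=4))  # False: novel case; the policy sails past it
```

In training, "move right" and "seek the target" produce identical behavior, so nothing in the reward signal distinguishes them; only the novel placement reveals which goal the agent actually acquired.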