The criterion a machine learning model optimizes to learn from data.
A training objective is the mathematical criterion that guides how a machine learning model adjusts its parameters during training. It is typically expressed as a loss function to be minimized or a reward signal to be maximized over the training data. The choice of training objective is one of the most consequential design decisions in building any ML system, because it defines what the model is fundamentally trying to accomplish; a mismatch between the objective and the intended task can lead to poor generalization, unexpected behaviors, or outright failure.
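Concretely, a supervised objective is usually an average of per-example losses over the training set. A minimal NumPy sketch, assuming an illustrative linear model and squared-error loss (neither is part of the definition itself):

```python
import numpy as np

def objective(w, X, y):
    """Empirical risk: mean squared error of a linear model over the training set."""
    predictions = X @ w                       # model output for every training example
    per_example_loss = (predictions - y) ** 2
    return per_example_loss.mean()            # the scalar that training tries to minimize

# Toy data: 100 examples, 3 features. Training amounts to searching for the
# parameter vector w that makes objective(w, X, y) as small as possible.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 3)), rng.normal(size=100)
w = np.zeros(3)
print(objective(w, X, y))
```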
In practice, training objectives take many forms depending on the task. Supervised learning commonly uses cross-entropy loss for classification and mean squared error for regression. Generative models may optimize a variational lower bound, as in variational autoencoders, or an adversarial objective, as in GANs. Reinforcement learning agents maximize expected cumulative reward. Large language models are often trained with a next-token prediction objective, a form of self-supervised learning that has proven remarkably effective at capturing broad linguistic and world knowledge.
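A sketch of how three of these objectives look in code, using PyTorch's built-in loss functions; the batch sizes, class counts, and vocabulary size are arbitrary illustrative choices:

```python
import torch
import torch.nn.functional as F

# Classification: cross-entropy between predicted logits and integer class labels.
logits = torch.randn(8, 5)                   # batch of 8 examples, 5 classes
labels = torch.randint(0, 5, (8,))
classification_loss = F.cross_entropy(logits, labels)

# Regression: mean squared error between predictions and continuous targets.
predictions = torch.randn(8, 1)
targets = torch.randn(8, 1)
regression_loss = F.mse_loss(predictions, targets)

# Next-token prediction: cross-entropy over a vocabulary, with targets
# shifted one position so each token is predicted from its predecessors.
vocab_size, seq_len = 1000, 16
token_logits = torch.randn(1, seq_len, vocab_size)
tokens = torch.randint(0, vocab_size, (1, seq_len))
lm_loss = F.cross_entropy(
    token_logits[:, :-1].reshape(-1, vocab_size),  # predictions at positions 0..seq_len-2
    tokens[:, 1:].reshape(-1),                     # targets are the following tokens
)
```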
The mechanics of optimizing a training objective typically involve computing gradients of the loss with respect to model parameters and applying an update rule such as stochastic gradient descent or one of its adaptive variants. The geometry of the loss landscape — whether it is convex, smooth, or riddled with local minima — directly affects how reliably and efficiently optimization proceeds. Regularization terms are often added to the objective to discourage overfitting, effectively encoding prior beliefs about what good solutions look like.
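A minimal sketch of one such update in PyTorch, with an explicit L2 penalty added to the task loss; the linear model, learning rate, and penalty coefficient are assumptions chosen only for illustration:

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(10, 2)                  # stand-in for any parametric model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(32, 10)
labels = torch.randint(0, 2, (32,))

task_loss = F.cross_entropy(model(inputs), labels)

# Regularization: an L2 penalty on the parameters, added to the objective
# (the 1e-4 coefficient is an arbitrary illustrative value).
l2_penalty = sum((p ** 2).sum() for p in model.parameters())
loss = task_loss + 1e-4 * l2_penalty

loss.backward()        # gradients of the full objective w.r.t. every parameter
optimizer.step()       # one stochastic gradient descent update
optimizer.zero_grad()
```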
Training objectives have grown increasingly sophisticated as models have scaled. Researchers now routinely combine multiple objectives — for instance, pairing a primary task loss with auxiliary losses for representation learning or alignment. The field of reward modeling and reinforcement learning from human feedback (RLHF) represents a frontier where the objective itself must be learned from human preferences rather than specified analytically. Getting the training objective right remains central to progress across virtually every area of machine learning.
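Both ideas can be sketched compactly. Below, a weighted sum combines a primary and an auxiliary loss (the 0.1 weight is an illustrative assumption), and a Bradley-Terry-style preference loss, of the kind commonly used to train reward models for RLHF, scores a preferred response above a rejected one:

```python
import torch
import torch.nn.functional as F

# Combining objectives: a weighted sum of a primary task loss and an auxiliary loss.
primary_loss = torch.tensor(0.7)                # placeholder for, e.g., a classification loss
auxiliary_loss = torch.tensor(0.3)              # placeholder for, e.g., a representation loss
combined = primary_loss + 0.1 * auxiliary_loss  # 0.1 is an illustrative weighting

# Reward modeling: learn a scalar reward from pairwise human preferences.
# The objective maximizes the probability that the preferred response
# receives a higher score than the rejected one.
reward_chosen = torch.randn(16)                 # reward-model scores for preferred responses
reward_rejected = torch.randn(16)               # scores for the rejected responses
preference_loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()
```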