Configuration settings, fixed before training begins, that govern how a machine learning model learns.
Hyperparameters are configuration values set before training begins that control the structure and behavior of a machine learning model, as opposed to parameters, which are learned directly from data. Common examples include the learning rate in gradient descent, the number of layers and neurons in a neural network, the depth of a decision tree, regularization coefficients, batch size, and dropout rates. Because they are not updated during training, hyperparameters must be chosen deliberately — and their values can dramatically affect whether a model converges, overfits, underfits, or generalizes well to new data.
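The distinction can be made concrete with a minimal sketch: fitting a one-parameter linear model y = w·x by gradient descent. The learning rate, epoch count, and L2 coefficient below are hyperparameters chosen before training; the weight `w` is a parameter learned from the data. (The specific values and the `train` helper are illustrative, not from any particular library.)

```python
# Hyperparameters: fixed before training begins.
LEARNING_RATE = 0.1
N_EPOCHS = 200
L2_STRENGTH = 0.01

def train(xs, ys, lr=LEARNING_RATE, epochs=N_EPOCHS, l2=L2_STRENGTH):
    w = 0.0  # parameter: learned from data, unlike the hyperparameters above
    n = len(xs)
    for _ in range(epochs):
        # Gradient of mean squared error plus the L2 regularization penalty.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n + 2 * l2 * w
        w -= lr * grad
    return w

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 3.0, 6.0, 9.0]  # true slope is 3
w = train(xs, ys)
```

Note that the L2 penalty pulls the learned slope slightly below the true value of 3, a direct illustration of how a regularization hyperparameter biases what the parameters converge to.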
The process of finding good hyperparameter values is called hyperparameter optimization or tuning. The simplest approach, grid search, exhaustively evaluates all combinations of candidate values across a predefined grid. Random search, which samples combinations stochastically, often finds competitive results more efficiently. More sophisticated methods include Bayesian optimization, which builds a probabilistic surrogate model of the objective function to intelligently select the next configuration to evaluate, and gradient-based methods that treat hyperparameter selection as a differentiable problem. Automated machine learning (AutoML) frameworks increasingly handle this process end-to-end.
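The contrast between grid search and random search can be sketched with a toy objective standing in for validation error (the objective, its optimum at lr = 0.1 and depth = 5, and the candidate ranges are all hypothetical, chosen only for illustration):

```python
import itertools
import random

# Toy stand-in for validation error; its minimum is at lr = 0.1, depth = 5.
def val_error(lr, depth):
    return (lr - 0.1) ** 2 + 0.01 * (depth - 5) ** 2

# Grid search: exhaustively evaluate every combination on a predefined grid.
lr_grid = [0.001, 0.01, 0.1, 1.0]
depth_grid = [2, 4, 6, 8]
grid_best = min(itertools.product(lr_grid, depth_grid),
                key=lambda cfg: val_error(*cfg))

# Random search: sample the same ranges stochastically, here with the
# learning rate drawn log-uniformly, a common choice for scale parameters.
random.seed(0)
samples = [(10 ** random.uniform(-3, 0), random.randint(2, 8))
           for _ in range(16)]
rand_best = min(samples, key=lambda cfg: val_error(*cfg))
```

Both strategies spend 16 evaluations here, but random search is not confined to the grid's fixed values, which is one reason it often finds competitive configurations more efficiently when only a few hyperparameters matter.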
Hyperparameters matter because no single configuration works universally across datasets and tasks. A learning rate that trains one network efficiently may cause another to diverge or stall. Regularization strength must be calibrated to the complexity of the data and model to avoid overfitting or underfitting. As models have grown in scale and complexity — particularly with deep neural networks — the hyperparameter search space has expanded dramatically, making principled tuning strategies essential rather than optional.
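The sensitivity to learning rate can be seen even on the simplest possible objective. In this sketch, gradient descent on f(w) = w² converges or diverges depending only on the learning rate (the specific values 0.4 and 1.1 are illustrative; for this function the update multiplies w by 1 − 2·lr each step, so any lr above 1 diverges):

```python
# Gradient descent on f(w) = w**2, whose gradient is 2w.
def descend(lr, steps=50, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w  # each step multiplies w by (1 - 2 * lr)
    return abs(w)

small = descend(lr=0.4)  # |w| shrinks by a factor of 0.2 per step
large = descend(lr=1.1)  # |w| grows by a factor of 1.2 per step
```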
The concept has been present in machine learning since early work on neural networks and statistical learning in the 1980s, but systematic hyperparameter optimization became a recognized research area in its own right as deep learning scaled up in the 2000s and 2010s. Today, hyperparameter tuning is a standard step in any serious model development pipeline, and tools like Optuna, Ray Tune, and Keras Tuner have made automated optimization accessible to practitioners across the field.