Assumptions built into a model that guide how it generalizes from training data.
An inductive prior (often called an inductive bias) is the set of assumptions, biases, or constraints embedded in a machine learning model that shape how it generalizes from observed training data to new, unseen examples. Without such priors, a model would have no principled basis for choosing among the infinitely many hypotheses consistent with a finite dataset — a difficulty made precise by the "no free lunch" theorems, which show that no learner outperforms any other when performance is averaged over all possible problems. Inductive priors effectively encode what the model considers plausible before seeing any data, steering the learning process toward solutions that are more likely to be correct given background knowledge about the problem domain.
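The underdetermination problem can be made concrete with a toy example (the specific labeled examples below are illustrative, not from any real dataset): learning a Boolean function of three bits from three observations leaves every labeling of the unseen inputs equally consistent with the data.

```python
from itertools import product

# Toy illustration: learn a Boolean function of 3 bits from partial data.
inputs = list(product([0, 1], repeat=3))               # all 8 possible inputs
observed = {(0, 0, 0): 0, (0, 1, 1): 1, (1, 0, 1): 1}  # 3 labeled examples

# Every assignment of labels to the 5 unseen inputs yields a hypothesis
# that fits the training data perfectly.
unseen = [x for x in inputs if x not in observed]
n_consistent = 2 ** len(unseen)
print(n_consistent)  # 32 hypotheses, all equally consistent with the data
```

Only a prior — say, "prefer simpler functions" — gives the learner any reason to choose one of these 32 hypotheses over another.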
Inductive priors can be either explicit or implicit. Explicit priors appear in Bayesian frameworks as probability distributions over model parameters — for example, placing a Gaussian prior on weights to encourage small values, which corresponds mathematically to L2 regularization. Implicit priors are baked into architectural and algorithmic choices: convolutional neural networks encode a prior that useful features are spatially local and translation-invariant, while recurrent networks assume sequential dependencies matter. Even the choice of optimizer or learning rate schedule subtly encodes assumptions about the loss landscape and solution structure.
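The Gaussian-prior/L2 correspondence can be sketched in a few lines: the MAP estimate under a zero-mean Gaussian prior on the weights is exactly the ridge-regression solution, with regularization strength equal to the ratio of noise variance to prior variance. The data, variances, and dimensions below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear data: y = X @ w_true + noise (illustrative values)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=50)

sigma2 = 0.01  # assumed observation-noise variance
tau2 = 1.0     # variance of the Gaussian prior N(0, tau2) on each weight
lam = sigma2 / tau2  # equivalent L2 regularization strength

# MAP estimate under the Gaussian prior == ridge regression:
#   w_map = argmin_w ||y - Xw||^2 + lam * ||w||^2
w_map = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
```

Shrinking `tau2` (a more confident prior that weights are near zero) increases `lam`, pulling the estimate harder toward the origin.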
The practical importance of inductive priors is enormous. A well-chosen prior that matches the true structure of a problem can dramatically reduce the amount of training data needed, improve generalization, and prevent overfitting. Conversely, a mismatched prior can systematically bias a model toward wrong solutions regardless of how much data is available. This is why domain knowledge is so valuable in machine learning — it allows practitioners to design architectures, regularizers, and training procedures that encode realistic assumptions about the task at hand.
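The claim that a mismatched prior biases the model no matter how much data arrives can be demonstrated with a deliberately wrong model class (the quadratic target and sample sizes here are arbitrary): a learner whose prior says "the relationship is linear" never recovers a quadratic truth.

```python
import numpy as np

rng = np.random.default_rng(1)

def linear_fit_error(n):
    # Linear-model prior applied to data that is actually quadratic.
    x = rng.uniform(-1, 1, n)
    y = x ** 2  # true relationship; noise-free to isolate the prior's bias
    X = np.stack([np.ones(n), x], axis=1)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    x_test = np.linspace(-1, 1, 200)
    return np.mean((w[0] + w[1] * x_test - x_test ** 2) ** 2)

# Test error plateaus rather than vanishing as n grows:
# the hypothesis class (the prior) simply excludes the truth.
errors = [linear_fit_error(n) for n in (100, 10_000)]
```

Both errors stay near the irreducible value determined by the model mismatch; more data refines the estimate of the best linear fit, but the best linear fit is still wrong.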
In modern deep learning, the study of inductive priors has become increasingly sophisticated. Researchers analyze what biases different architectures implicitly impose, and work to design models whose priors align with the structure of real-world data — such as symmetry, compositionality, or smoothness. Transfer learning and meta-learning can also be understood through this lens: pretraining instills a prior over representations that makes downstream learning faster and more data-efficient.