A paradigm in which algorithms learn patterns from data rather than following explicitly programmed rules.
A structured collection of data used to train, validate, and evaluate machine learning models.
The mathematical foundation of vectors and matrices underlying nearly all machine learning.
A binary logic system representing true or false values, foundational to computation.
A finite sequence of instructions that solves a problem or performs a computation.
Using learned patterns from data to estimate unknown or future outcomes.
Finding the best solution from all feasible options given an objective and constraints.
The probability of an event occurring given that another event has already occurred.
The iterative process of optimizing a model's parameters using data.
A function that maps a neuron's weighted input to its output, introducing non-linearity into a neural network.
A layered system of interconnected nodes that learns patterns from data.
A core algebraic operation that multiplies two matrices to produce a third.
A machine learning approach using multi-layered neural networks to model complex data patterns.
Layered computational models that learn from data by adjusting weighted connections.
A model-internal variable whose value is learned directly from training data.
The subfield of AI enabling computers to understand, process, and generate human language.
Systematic examination of datasets to extract patterns, insights, and actionable knowledge.
Human language that evolved organically, as opposed to formally constructed artificial languages.
The algorithm that trains neural networks by propagating error gradients backward through layers.
The examples, labeled in supervised settings, used to fit a machine learning model.
The desired outcome or objective that directs an AI system's behavior.
A model's ability to perform accurately on new, previously unseen data.
A mathematical measure of error that guides model training toward better predictions.
A supervised learning approach that predicts continuous numerical outcomes from input variables.
A supervised learning task that assigns input data to predefined discrete categories.
An iterative optimization algorithm that minimizes a function by repeatedly stepping in the direction of its negative gradient, the locally steepest descent.
Computational identification and classification of regularities within complex data.
A mathematical function that quantifies what a machine learning model is optimizing.
A learning paradigm where an agent maximizes cumulative rewards through environmental interaction.
Massive neural networks trained on text to understand and generate human language.
A neural network architecture using self-attention to process sequential data in parallel.
Training models on labeled input-output pairs to predict or classify new data.
A model that learns data distributions to synthesize realistic new samples.
The number of independent axes defining a vector space used to represent data.
A hypothetical AI system capable of performing any intellectual task a human can.
Training a compact student model to mimic a larger teacher model's behavior through soft target distributions.
A technique that penalizes model complexity to prevent overfitting and improve generalization.
A neural network technique that dynamically weights input elements by their relevance to the task.
Drawing conclusions from uncertain or incomplete data using probability theory.
The fraction of all predictions that a classification model gets right.
Iteratively adjusting model parameters to minimize prediction error measured by a loss function.
An autonomous system that perceives its environment and acts to achieve goals.
A method that fits models to data by minimizing the sum of squared prediction errors.
A failure mode in which a model memorizes noise in the training data instead of learning generalizable patterns.
A dense vector representation that encodes semantic relationships between discrete items.
A function describing the relative likelihood of a continuous random variable's values.
A statistical framework for making data-driven decisions by evaluating competing claims.
A parameter estimation method that finds the parameter values under which the observed data are most probable.
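
Several of the definitions above (loss function, gradient descent, least squares) fit together in one small example. The following is a minimal sketch, not a reference implementation: the data points, the function names, the learning rate, and the step count are all assumptions invented for illustration. It fits a straight line by gradient descent on a mean squared error loss and compares the result with the closed-form least squares solution.

```python
# Hypothetical toy data (invented for illustration): y is roughly 2*x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]


def mse_loss(w, b):
    """Loss function: mean squared prediction error of the line y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_gradient(w, b):
    """Partial derivatives of the loss with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


def gradient_descent(lr=0.05, steps=5000):
    """Start from w = b = 0 and repeatedly step against the gradient.

    lr and steps are assumed values that happen to converge on this data.
    """
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = mse_gradient(w, b)
        w -= lr * dw
        b -= lr * db
    return w, b


def least_squares_closed_form():
    """Slope and intercept that minimize the sum of squared errors exactly."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


w_gd, b_gd = gradient_descent()
w_ls, b_ls = least_squares_closed_form()
print("gradient descent:", w_gd, b_gd)
print("closed form:     ", w_ls, b_ls)
print("loss at the fit: ", mse_loss(w_ls, b_ls))
```

On this data the two approaches should agree to within floating-point tolerance, because the squared error loss of a linear model has a single minimum. If the noise in the data is further assumed to be independent and Gaussian with constant variance, the same fit is also the maximum likelihood estimate of the line's parameters.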