A layered system of interconnected nodes that learns patterns from data.
A neural network is a computational model loosely inspired by the structure of biological brains, built from layers of interconnected units called neurons or nodes. Information enters through an input layer, passes through one or more hidden layers where transformations are applied, and emerges from an output layer. Each connection between neurons carries a numerical weight that determines how strongly one neuron influences another. By adjusting these weights during training, the network learns to map inputs to desired outputs without being explicitly programmed with hand-crafted rules.
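In code, a forward pass through such a network is just a weighted sum at each layer followed by a nonlinearity. The sketch below uses NumPy and an illustrative shape of 3 inputs, 4 hidden units, and 2 outputs; the sigmoid activation and random weights are arbitrary choices made only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters: 3 inputs -> 4 hidden units -> 2 outputs.
W1 = rng.normal(size=(3, 4))   # weights from input layer to hidden layer
b1 = np.zeros(4)               # hidden-layer biases
W2 = rng.normal(size=(4, 2))   # weights from hidden layer to output layer
b2 = np.zeros(2)               # output-layer biases

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    h = sigmoid(x @ W1 + b1)   # hidden layer: weighted sum, then nonlinearity
    return h @ W2 + b2         # output layer: weighted sum of hidden activations

print(forward(np.array([0.5, -1.0, 2.0])))   # two output values
```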
Training a neural network typically involves feeding it labeled examples and measuring the difference between its predictions and the correct answers using a loss function. Backpropagation then propagates error signals backward through the network to compute the gradient of the loss with respect to each weight, and an optimization algorithm, most commonly stochastic gradient descent, nudges the weights in directions that reduce the loss. Repeated over many iterations and large datasets, this process allows the network to discover rich internal representations of the data, capturing features that would be difficult or impossible to specify manually.
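A minimal training loop makes these steps concrete. The sketch below fits a one-hidden-layer network to a toy regression problem with a mean-squared-error loss; it uses plain full-batch gradient descent rather than the stochastic variant, and the toy data, layer sizes, and learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labeled data (illustrative): learn y = x1 - x2 from random inputs.
X = rng.normal(size=(200, 2))
y = (X[:, 0] - X[:, 1]).reshape(-1, 1)

# One hidden layer of 8 tanh units; sizes and learning rate are arbitrary choices.
W1, b1 = rng.normal(scale=0.5, size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.5, size=(8, 1)), np.zeros(1)
lr = 0.1

for step in range(500):
    # Forward pass: predictions and mean-squared-error loss.
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    loss = np.mean((pred - y) ** 2)

    # Backward pass (backpropagation): gradients of the loss for every weight.
    d_pred = 2 * (pred - y) / len(X)          # dL/dpred
    dW2 = h.T @ d_pred                        # dL/dW2
    db2 = d_pred.sum(axis=0)                  # dL/db2
    d_h = (d_pred @ W2.T) * (1 - h ** 2)      # chain rule through tanh
    dW1 = X.T @ d_h                           # dL/dW1
    db1 = d_h.sum(axis=0)                     # dL/db1

    # Gradient-descent update: nudge each weight to reduce the loss.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    if step % 100 == 0:
        print(f"step {step}: loss {loss:.4f}")
```

The loss printed at each checkpoint shrinks over the iterations, which is the whole mechanism in miniature: compute predictions, measure error, propagate it backward, and adjust the weights.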
Neural networks vary widely in architecture depending on the task. Convolutional neural networks (CNNs) exploit spatial structure and dominate image recognition. Recurrent neural networks (RNNs) and transformers handle sequential data like text and audio. Generative models such as GANs and diffusion models learn to synthesize new data. The depth of modern networks, sometimes hundreds of layers, is what distinguishes contemporary deep learning from earlier shallow approaches, enabling dramatically better performance on complex tasks.
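As one concrete instance of such an architecture, the sketch below defines a small convolutional classifier. It assumes the PyTorch library and an input of 28x28 grayscale images with 10 output classes; every layer size and kernel choice is illustrative, not a recommended design.

```python
import torch
from torch import nn

# A small illustrative CNN: two convolution/pooling stages, then a linear classifier.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # convolutions exploit local spatial structure
    nn.ReLU(),
    nn.MaxPool2d(2),                             # downsample 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                             # downsample 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # scores for 10 classes
)

x = torch.randn(4, 1, 28, 28)    # a batch of 4 fake grayscale images
print(model(x).shape)            # torch.Size([4, 10])
```

Stacking more such stages is what "depth" means in practice; each additional layer composes the features learned by the ones before it.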
The practical impact of neural networks is difficult to overstate. They underpin speech assistants, real-time translation, medical image diagnosis, protein structure prediction, and large language models. Their ability to scale with data and compute has made them the dominant paradigm in machine learning, displacing many classical statistical methods. Understanding neural networks — how they are structured, trained, and regularized — is foundational knowledge for anyone working in modern AI.