The basic computational unit of neural networks, modeled on biological neurons.
An artificial neuron is the fundamental processing unit of an artificial neural network, loosely inspired by the behavior of biological neurons in the brain. Each artificial neuron receives one or more numerical inputs, multiplies each by a learned weight that reflects its relative importance, sums the weighted inputs (typically adding a bias term), and passes the result through a nonlinear activation function. The output of this activation function is then forwarded to other neurons in the network or used as a final prediction. This simple computation, repeated across thousands or millions of interconnected neurons, enables neural networks to model extraordinarily complex relationships in data.
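The computation described above can be sketched in a few lines; the weights, inputs, and bias below are arbitrary values chosen purely for illustration, and the sigmoid is just one possible activation:

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of inputs plus bias (the "pre-activation")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Nonlinear activation: the sigmoid squashes z into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical inputs and learned parameters, for illustration only
output = neuron(inputs=[0.5, -1.0, 2.0], weights=[0.8, 0.2, 0.1], bias=0.3)
```

In a full network the weights and bias would not be hand-picked like this; they would be set by the training procedure discussed next.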
The mechanics of how artificial neurons learn are inseparable from the broader training process of neural networks. During training, the weights associated with each neuron's inputs are iteratively adjusted via backpropagation and gradient descent to minimize prediction error. The choice of activation function — whether a sigmoid, hyperbolic tangent, or rectified linear unit (ReLU) — significantly shapes how information flows through the network and how effectively gradients propagate during learning. ReLU in particular became dominant in deep learning because it mitigates the vanishing gradient problem that plagued earlier sigmoid-based networks.
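The gradient behavior that makes ReLU attractive can be seen by comparing derivatives directly. This is a minimal sketch, not a full backpropagation implementation: it only evaluates each activation's local gradient at a few sample pre-activation values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    # Derivative of the sigmoid: s * (1 - s), which peaks at 0.25 when z = 0
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    return max(0.0, z)

def relu_grad(z):
    # ReLU passes the gradient through unchanged for any positive input
    return 1.0 if z > 0 else 0.0

# Sigmoid gradients shrink rapidly away from zero, so stacking many
# sigmoid layers multiplies many small factors together (the vanishing
# gradient problem); ReLU's gradient stays at 1 for positive inputs.
for z in [0.0, 2.0, 5.0]:
    print(f"z={z}: sigmoid_grad={sigmoid_grad(z):.4f}, relu_grad={relu_grad(z)}")
```

Note that ReLU has its own failure mode (a gradient of exactly zero for negative inputs), which is why variants such as leaky ReLU exist; the sketch above only illustrates the vanishing-gradient contrast.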
Artificial neurons matter because they are the atomic unit from which all modern neural architectures are constructed. Convolutional neural networks, recurrent networks, transformers, and generative models are all ultimately composed of neurons performing weighted summations followed by nonlinear transformations. Understanding the artificial neuron is therefore prerequisite to understanding virtually every major advance in deep learning, from image classification and speech recognition to large language models. The scalability of neuron-based computation — from a handful of units in early perceptrons to billions in contemporary models — has been central to the dramatic expansion of AI capabilities over the past decade.
The concept traces back to the McCulloch-Pitts model of 1943 and became a practical machine learning construct with Frank Rosenblatt's perceptron in 1958, which added a procedure for learning the weights from data rather than fixing them by hand.
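Rosenblatt's weight-update rule can be sketched in a few lines. The learning rate, epoch count, and the choice of the logical AND function as training data are illustrative assumptions, not part of the original 1958 formulation:

```python
def train_perceptron(samples, epochs=10, lr=0.1):
    """Rosenblatt-style perceptron: a step-activated neuron whose weights
    are nudged toward the target whenever it misclassifies a sample."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            # Step activation: fire (1) if the weighted sum exceeds 0
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            # Move weights in the direction that reduces the error
            error = target - pred
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b

# Logical AND is linearly separable, so the perceptron can learn it
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_data)
```

The perceptron's inability to learn functions that are not linearly separable (famously XOR) is what later motivated multi-layer networks trained by backpropagation.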