A basic computational unit in neural networks or graphs that processes information.
In machine learning, a node is the fundamental processing unit found in neural networks and graphical models. In a neural network, each node receives one or more numerical inputs, multiplies each by a learned weight, sums the results, adds a bias term, and passes the total through a nonlinear activation function to produce an output signal. That output is then forwarded to nodes in the next layer, enabling the network to build increasingly abstract representations of the input data across successive layers.
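The weighted-sum-plus-activation computation described above can be sketched in a few lines of Python. The weights, bias, and inputs here are illustrative values, and sigmoid stands in for whatever activation a given network uses:

```python
import math

def node_forward(inputs, weights, bias):
    """A single node: weighted sum of inputs plus bias, passed through sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid squashes output into (0, 1)

# A node with two inputs (all numbers are arbitrary examples):
out = node_forward([0.5, -1.0], [0.8, 0.2], 0.1)
# pre-activation: 0.5*0.8 + (-1.0)*0.2 + 0.1 = 0.3; sigmoid(0.3) ≈ 0.574
```

In a full layer, every node performs this same computation over the same inputs with its own weight vector, which is why layers are typically implemented as matrix multiplications.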
In graph-based models such as Bayesian networks or Markov random fields, nodes take on a different but related role: they represent random variables or observable quantities, while the edges connecting them encode conditional dependencies or probabilistic relationships. Inference algorithms traverse these nodes to propagate beliefs or compute marginal distributions, making nodes as central to probabilistic reasoning as they are to deep learning.
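A minimal sketch of marginal computation in a two-node Bayesian network makes this concrete. The network, variable names, and probabilities here are invented for illustration: a parent node Rain with a child node WetGrass, where the marginal of the child is obtained by summing out the parent:

```python
# Two-node Bayesian network: Rain -> WetGrass (probabilities are illustrative).
p_rain = {True: 0.2, False: 0.8}            # prior at the parent node
p_wet_given_rain = {True: 0.9, False: 0.1}  # conditional at the child node

# Marginalize over the parent's states to get P(WetGrass = True).
p_wet = sum(p_rain[r] * p_wet_given_rain[r] for r in (True, False))
# 0.2*0.9 + 0.8*0.1 = 0.26
```

Belief propagation generalizes this same sum-out operation to larger graphs by passing messages between nodes along the edges.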
The behavior of a node is almost entirely determined by its weights and the choice of activation function. Common activation functions—sigmoid, tanh, ReLU and its variants—each shape how a node responds to its inputs and directly influence a network's capacity to learn nonlinear patterns. During training, backpropagation computes gradients with respect to each node's parameters, and an optimizer adjusts those parameters to minimize prediction error. The collective, coordinated adjustment of the parameters of millions of nodes is what allows deep networks to learn complex mappings from raw data to useful predictions.
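For a single node, the backpropagation-and-update cycle reduces to the chain rule. The sketch below, with made-up input, target, and learning-rate values, performs one gradient step on a lone ReLU node under a squared-error loss:

```python
def node_grad_step(x, w, b, target, lr=0.1):
    """One gradient-descent step on a single ReLU node with squared-error loss."""
    z = x * w + b                        # pre-activation
    y = max(0.0, z)                      # ReLU activation
    dz = 1.0 if z > 0 else 0.0           # ReLU derivative
    dloss_dy = 2.0 * (y - target)        # derivative of (y - target)^2
    dw = dloss_dy * dz * x               # chain rule: dL/dw
    db = dloss_dy * dz                   # chain rule: dL/db
    return w - lr * dw, b - lr * db      # optimizer step (plain SGD)

# Example values: the node under-predicts, so both parameters move upward.
w, b = node_grad_step(x=1.0, w=0.5, b=0.0, target=1.0)
# w: 0.5 -> 0.6, b: 0.0 -> 0.1
```

Frameworks automate exactly this bookkeeping across every node simultaneously, which is what makes training millions of parameters tractable.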
Understanding nodes is essential for diagnosing and improving neural network behavior. Concepts like dead neurons (ReLU units that never activate), vanishing gradients (signals that shrink to near zero as they pass through many nodes), and dropout regularization (randomly deactivating nodes during training) all hinge on how individual nodes function. Whether designing a small feedforward classifier or a massive transformer, the node remains the irreducible unit from which all network complexity emerges.
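Of the node-level concepts above, dropout is the easiest to show directly. This is a sketch of the standard "inverted dropout" formulation, in which surviving nodes are rescaled during training so that no adjustment is needed at inference time; the activation values are arbitrary examples:

```python
import random

def dropout(activations, p=0.5, training=True, seed=None):
    """Inverted dropout: zero each node's output with probability p during
    training, and rescale survivors by 1/(1-p) to preserve expected activation."""
    if not training:
        return list(activations)  # at inference, nodes pass through unchanged
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - p)
    return [a * scale if rng.random() >= p else 0.0 for a in activations]

outs = dropout([0.3, 0.8, 0.5, 0.9], p=0.5)
# each output is either 0.0 (node dropped) or double its original value
```

Randomly silencing nodes this way prevents any single node from becoming indispensable, which is the regularizing effect the paragraph describes.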