A neural network that encodes continuous signals by mapping coordinates to signal values.
An Implicit Neural Representation (INR) is a neural network, typically a multilayer perceptron (MLP), trained to represent a signal as a continuous function that maps input coordinates to signal attributes. Rather than storing data on a discrete grid of pixels or voxels, an INR encodes the signal in its learned weights and can be queried at any coordinate, and therefore sampled at arbitrary resolution. This coordinate-based paradigm applies broadly: spatial coordinates and viewing directions map to color and density in 3D scenes, time coordinates map to audio amplitude, and spatial positions map to signed distances from a surface.
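The coordinate-to-value mapping described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the weights are random stand-ins for trained parameters, and the helper names (`init_mlp`, `inr`, `pixel_grid`) and layer sizes are assumptions chosen for the example.

```python
import numpy as np

def init_mlp(layer_sizes, seed=0):
    """Random (untrained) weights; a stand-in for learned INR parameters."""
    rng = np.random.default_rng(seed)
    params = []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        b = np.zeros(fan_out)
        params.append((W, b))
    return params

def inr(params, coords):
    """Evaluate the network at coords of shape (N, d_in); returns (N, d_out)."""
    h = coords
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)  # ReLU hidden layers
    W, b = params[-1]
    return h @ W + b                    # linear output layer

def pixel_grid(height, width):
    """(height * width, 2) array of (x, y) coordinates in [-1, 1]."""
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    gx, gy = np.meshgrid(xs, ys)
    return np.stack([gx.ravel(), gy.ravel()], axis=-1)

# One set of weights maps (x, y) -> (r, g, b) ...
params = init_mlp([2, 64, 64, 3])

# ... and can be sampled on grids of different resolution without resampling.
low = inr(params, pixel_grid(32, 32)).reshape(32, 32, 3)
high = inr(params, pixel_grid(128, 128)).reshape(128, 128, 3)
```

Both images come from the same function; only the set of queried coordinates changes. In practice the weights would be fitted by minimizing a reconstruction loss between the network output and the observed signal values.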
The core mechanism is straightforward: given a coordinate input x, the network fθ(x) predicts the corresponding signal value through a series of learned transformations. A critical practical challenge is spectral bias: standard MLPs with ReLU activations tend to fit low-frequency components first and struggle to represent high-frequency detail. Two families of solutions emerged. The first lifts inputs into a higher-dimensional space before the MLP, using positional encodings or random Fourier feature mappings. The second replaces ReLU with periodic activation functions, as in SIREN networks, which are well suited to oscillatory structure. For multi-instance settings, networks are conditioned on latent codes that describe individual shapes or scenes, allowing a single model to generalize across many examples. More recent hybrid approaches such as Instant-NGP combine learned multiresolution feature grids with small MLPs, greatly reducing training and rendering time and enabling real-time applications.
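The two remedies for spectral bias, together with latent-code conditioning, can be sketched as follows. This is an illustrative sketch under stated assumptions: the frequency scale `sigma`, the feature count `m`, the latent size, and the zero bias are arbitrary choices for the example, and the function names are hypothetical. The Fourier mapping follows the common form [cos(2πBx), sin(2πBx)] with Gaussian-sampled B; the sine layer uses a frequency factor `w0` of 30 and a uniform first-layer initialization in [-1/d, 1/d], as commonly used for SIREN.

```python
import numpy as np

def fourier_features(x, B):
    """Lift coords x (N, d) to [cos(2*pi*x@B.T), sin(2*pi*x@B.T)], shape (N, 2m)."""
    proj = 2.0 * np.pi * x @ B.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

def sine_layer(x, W, b, w0=30.0):
    """One periodic-activation layer: sin(w0 * (x @ W + b))."""
    return np.sin(w0 * (x @ W + b))

rng = np.random.default_rng(0)
d, m = 2, 128            # input dimension, number of random frequencies (assumed)
sigma = 10.0             # frequency scale; a tunable hyperparameter (assumed value)

B = rng.normal(0.0, sigma, size=(m, d))   # random frequency matrix
x = rng.uniform(0.0, 1.0, size=(5, d))    # five query coordinates

# Remedy 1: encode the input before feeding it to an ordinary MLP.
feats = fourier_features(x, B)            # shape (5, 2 * m)

# Remedy 2: keep raw coordinates but use a sine activation in the layer.
W = rng.uniform(-1.0 / d, 1.0 / d, size=(d, 64))
b = np.zeros(64)                          # zero bias is a simplification
h = sine_layer(x, W, b)                   # shape (5, 64)

# Conditioning: append a per-instance latent code z to every encoded coordinate.
z = rng.normal(size=16)                   # hypothetical 16-dimensional latent code
conditioned = np.concatenate([feats, np.tile(z, (x.shape[0], 1))], axis=-1)
```

Larger values of `sigma` bias the encoding toward higher frequencies, which helps with fine detail but can introduce noise if set too high. Concatenation is only one way to condition on a latent code; other schemes, such as predicting the network weights from the code, are not shown here.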
INRs became central to modern neural rendering and 3D understanding. Neural Radiance Fields (NeRF), introduced in 2020, demonstrated that an INR conditioned on spatial position and viewing direction could synthesize photorealistic novel views of a scene from a sparse set of images, a result that catalyzed enormous research activity. Slightly earlier work on DeepSDF and Occupancy Networks (both 2019) had established INRs as powerful shape representations for reconstruction and generation. Beyond vision, INRs have found use in medical imaging, physics-informed simulation, signal compression, and inverse problems across scientific domains.
The appeal of INRs lies in several complementary properties: they are resolution-agnostic, producing outputs at any coordinate without resampling from a fixed grid; they are fully differentiable, enabling gradient-based optimization through the representation itself; and they can be compact, storing complex signals in relatively few parameters. Ongoing challenges include slow per-scene optimization, difficulty generalizing across diverse scenes without retraining, and slow inference. Hybrid grid-network architectures and amortized inference methods continue to address these problems.
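The compactness claim can be made concrete with simple arithmetic. The sketch below counts the parameters of a small MLP and compares that with the number of values in a raw image grid. The layer sizes and image dimensions are assumptions picked for illustration, and the comparison says nothing about reconstruction quality, which depends on the signal and on training.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer sizes."""
    return sum(i * o + o for i, o in zip(layer_sizes[:-1], layer_sizes[1:]))

layers = [2, 256, 256, 256, 3]        # (x, y) -> (r, g, b), three hidden layers
n_params = mlp_param_count(layers)    # 133,123 parameters
n_values = 1024 * 1024 * 3            # 3,145,728 values in a 1024x1024 RGB image

ratio = n_values / n_params           # roughly 23.6 grid values per parameter
```

Note that this counts stored numbers, not bytes: the raw image typically uses 8 bits per value, while network weights are usually stored at higher precision unless quantized.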