How an AI system encodes its environment into a structured, processable description.
State representation refers to the way an AI system captures and encodes information about its environment into a format that algorithms can process and reason over. Rather than working with raw, unstructured sensory data, an agent relies on a state representation to distill the most relevant features of its current situation — whether that means pixel values in a video game, joint angles in a robotic arm, or token embeddings in a language model. The choice of representation fundamentally shapes what an agent can learn and how quickly it can learn it.
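As a concrete illustration, consider distilling a raw grid observation into a compact feature tuple. This is a minimal hypothetical sketch (the gridworld, cell symbols, and `encode_state` are invented for illustration), not a prescribed method:

```python
# Hypothetical sketch: distill a raw grid "image" into a compact state.
# "A" marks the agent, "G" marks the goal; all other cells are irrelevant.

def encode_state(raw_grid):
    """Map a raw 2D grid of characters to a compact feature tuple.

    Rather than handing the learner every cell, we extract only the
    task-relevant features: the positions of the agent and the goal.
    """
    agent = goal = None
    for r, row in enumerate(raw_grid):
        for c, cell in enumerate(row):
            if cell == "A":
                agent = (r, c)
            elif cell == "G":
                goal = (r, c)
    # The relative offset to the goal generalizes across absolute positions.
    return (goal[0] - agent[0], goal[1] - agent[1])

raw = [
    ".....",
    ".A...",
    "...G.",
    ".....",
]
print(encode_state(raw))  # -> (1, 2): goal is 1 row down, 2 columns right
```

Encoding the *relative* offset rather than absolute coordinates is itself a representation choice: it lets one learned behavior transfer to any agent/goal placement with the same offset.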
In reinforcement learning, state representation is especially consequential. An agent's policy — the mapping from states to actions — can only be as good as the information encoded in the state. A poorly designed representation may omit critical details, conflate distinct situations, or include irrelevant noise, all of which degrade learning efficiency and final performance. Conversely, a compact and expressive representation allows the agent to generalize effectively across similar situations, accelerating convergence and enabling robust behavior in novel contexts. This is why hand-crafted feature engineering dominated early RL work, and why learning representations end-to-end through deep neural networks became so transformative.
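The point that a policy can only distinguish what its representation distinguishes can be made concrete. In this hypothetical sketch, a "rich" state keeps the signed offset to a goal while a "poor" state keeps only the unsigned distance, conflating two situations that demand opposite actions:

```python
# Hypothetical sketch: how the state representation bounds policy quality.

def rich_state(agent, goal):
    # Signed offset: preserves the direction to the goal.
    return (goal[0] - agent[0], goal[1] - agent[1])

def poor_state(agent, goal):
    # Unsigned Manhattan distance: discards direction, conflating
    # "goal to the left" with "goal to the right".
    return abs(goal[0] - agent[0]) + abs(goal[1] - agent[1])

# Two situations that demand opposite actions...
a = rich_state((0, 0), (0, 2))   # goal is to the right -> (0, 2)
b = rich_state((0, 4), (0, 2))   # goal is to the left  -> (0, -2)
assert a != b                    # the rich representation keeps them apart

# ...but the poor representation cannot tell them apart:
assert poor_state((0, 0), (0, 2)) == poor_state((0, 4), (0, 2))  # both 2
```

No amount of training can recover the distinction once the representation has discarded it; any policy defined over `poor_state` must take the same action in both situations.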
Deep learning has largely shifted the burden of state representation from human designers to learned feature extractors. Convolutional networks, recurrent architectures, and attention mechanisms can automatically discover useful abstractions from high-dimensional inputs, enabling agents to operate directly on raw observations. Techniques like contrastive representation learning, world models, and self-supervised pretraining have further advanced the field by training encoders that capture environment dynamics without requiring dense reward signals.
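The end-to-end pipeline this describes can be sketched structurally: a learned encoder maps raw observations to features, and a policy head acts on those features. This is a bare-bones illustrative sketch with hand-set weights standing in for learned ones (the function names and values are invented); in practice both components would be trained jointly by backpropagation:

```python
# Structural sketch (hand-set weights stand in for learned parameters):
# raw observation -> encoder -> compact features -> policy head -> action.

def encoder(observation, weights):
    """Linear feature extractor: raw observation -> compact features."""
    return [sum(w * x for w, x in zip(row, observation)) for row in weights]

def policy_head(features):
    """Map extracted features to an action index (greedy argmax)."""
    return max(range(len(features)), key=lambda i: features[i])

# Raw 4-dimensional observation (e.g., flattened pixels or sensor readings).
obs = [0.2, 0.9, 0.1, 0.4]

# Two feature detectors; in a trained network these rows would be learned.
W = [
    [1.0, 0.0, 1.0, 0.0],   # responds to dimensions 0 and 2
    [0.0, 1.0, 0.0, 1.0],   # responds to dimensions 1 and 3
]

features = encoder(obs, W)      # approximately [0.3, 1.3]
print(policy_head(features))    # -> 1
```

A convolutional or attention-based encoder replaces the linear map here, but the structure is the same: representation learning turns the weights of `encoder` into trainable parameters rather than hand-crafted features.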
State representation sits at the intersection of perception, memory, and reasoning, making it relevant far beyond reinforcement learning. In planning systems, the state space defines what configurations are reachable and how search proceeds. In partially observable environments, agents must maintain belief states or memory-augmented representations to compensate for missing information. As AI systems tackle increasingly complex real-world tasks, designing or learning effective state representations remains one of the central challenges in the field.
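The belief-state idea mentioned above can be shown with a minimal Bayes update. In this hypothetical two-state example (a door that is open or closed, observed through a noisy sensor; the probabilities are invented), the agent's state representation is a probability distribution over hidden states rather than a direct observation:

```python
# Hypothetical sketch: a belief state under partial observability.
# The agent cannot see the true state, so it maintains a distribution
# over hidden states and refines it with Bayes' rule after each observation.

def update_belief(belief, observation, obs_model):
    """Bayes update: P(s | o) is proportional to P(o | s) * P(s)."""
    unnormalized = {s: obs_model[s][observation] * p for s, p in belief.items()}
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}

# Hidden state: a door is "open" or "closed"; the sensor is noisy.
obs_model = {
    "open":   {"sees_open": 0.8, "sees_closed": 0.2},
    "closed": {"sees_open": 0.3, "sees_closed": 0.7},
}

belief = {"open": 0.5, "closed": 0.5}            # uniform prior
belief = update_belief(belief, "sees_open", obs_model)
print(round(belief["open"], 3))                   # -> 0.727
```

The belief itself, not the raw sensor reading, is the state the agent reasons over; a full POMDP agent would also fold in a transition model between observations.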