A visual reasoning benchmark used to evaluate abstract pattern recognition in AI systems.
Raven's Progressive Matrices (RPM) is a nonverbal intelligence test consisting of visual puzzles in which a subject must identify the missing piece that completes a pattern within a matrix of geometric figures. Each puzzle presents a grid of shapes governed by underlying rules, such as transformations in size, rotation, color, or quantity, and the solver must infer those rules to select the correct answer from a set of candidates. Originally designed to measure fluid intelligence in humans without relying on language or prior knowledge, RPM has become a widely used benchmark for evaluating abstract reasoning in AI systems.
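The puzzle structure described above can be illustrated with a small symbolic sketch. It is not how any particular RPM solver or dataset works: it assumes each panel has already been reduced to integer attribute levels (a hypothetical `Panel` encoding), and that every attribute follows a simple row-wise rule, either staying constant or changing by a fixed step. The names `infer_step` and `solve` are invented for this example.

```python
from typing import Dict, List, Optional

# Hypothetical encoding: attribute name -> integer level (e.g. size 1..3).
Panel = Dict[str, int]


def infer_step(row: List[Panel], attr: str) -> Optional[int]:
    """Return the constant step of `attr` along a complete row, or None."""
    steps = [b[attr] - a[attr] for a, b in zip(row, row[1:])]
    return steps[0] if len(set(steps)) == 1 else None


def solve(grid: List[List[Panel]], candidates: List[Panel]) -> int:
    """Return the index of the candidate that completes the last row.

    Assumption: each attribute obeys a row-wise arithmetic rule, where a
    step of 0 means "constant" and a nonzero step means "progression".
    """
    complete_rows = grid[:-1]
    last_row = grid[-1]  # this row is missing its final panel
    predicted: Panel = {}
    for attr in last_row[0]:
        steps = {infer_step(row, attr) for row in complete_rows}
        if len(steps) != 1 or None in steps:
            raise ValueError(f"no single arithmetic rule explains {attr!r}")
        # Apply the inferred step to the last known panel of the open row.
        predicted[attr] = last_row[-1][attr] + steps.pop()
    # Raises ValueError if no candidate matches the prediction.
    return candidates.index(predicted)


# Size progresses along each row; count is constant within a row.
grid = [
    [{"size": 1, "count": 2}, {"size": 2, "count": 2}, {"size": 3, "count": 2}],
    [{"size": 1, "count": 3}, {"size": 2, "count": 3}, {"size": 3, "count": 3}],
    [{"size": 1, "count": 4}, {"size": 2, "count": 4}],
]
candidates = [
    {"size": 3, "count": 3},
    {"size": 2, "count": 4},
    {"size": 3, "count": 4},
    {"size": 1, "count": 4},
]
answer = solve(grid, candidates)
print(answer)  # prints 2
```

Real RPM items are images, so the hard part in practice is recovering such attributes and rules from pixels; this sketch only shows the rule-inference and candidate-selection step once a symbolic description is available.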
In machine learning research, RPM serves as a challenging testbed for visual and relational reasoning. Solving these matrices requires more than pattern matching; it demands the ability to identify abstract relationships, apply multi-step logical rules, and generalize across novel configurations. Convolutional neural networks and other standard deep learning architectures have historically struggled with RPM tasks, motivating the development of more structured approaches such as graph neural networks, relation networks, and neuro-symbolic models that explicitly represent and reason over relational structure.
Several synthetic datasets have been constructed to support RPM research in AI, most notably the Procedurally Generated Matrices (PGM) dataset and the RAVEN dataset, which provide large-scale training and evaluation sets with controlled rule complexity. These benchmarks allow researchers to probe specific reasoning capabilities, such as counting, progression, or analogy, and to measure how well models generalize to unseen rule combinations. Performance on these datasets is often treated as a proxy for progress toward human-like abstract reasoning.
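One way to picture the "unseen rule combinations" evaluation is a split over (rule, attribute) pairs, where some pairs never appear in training and accuracy is reported on those pairs alone. The sketch below is a minimal illustration under stated assumptions: the `RULES` and `ATTRIBUTES` labels are illustrative rather than the actual schema of PGM or RAVEN, and `split_combos` and `heldout_accuracy` are hypothetical helpers, not part of either dataset's tooling.

```python
import random
from itertools import product
from typing import List, Set, Tuple

Combo = Tuple[str, str]  # (rule, attribute)

# Illustrative labels only; not the real PGM or RAVEN schema.
RULES = ["constant", "progression", "arithmetic"]
ATTRIBUTES = ["size", "color", "quantity"]


def split_combos(held_out_fraction: float, seed: int = 0) -> Tuple[Set[Combo], Set[Combo]]:
    """Partition all (rule, attribute) pairs into train and held-out sets."""
    combos = list(product(RULES, ATTRIBUTES))
    random.Random(seed).shuffle(combos)
    n_held = max(1, round(held_out_fraction * len(combos)))
    return set(combos[n_held:]), set(combos[:n_held])


def heldout_accuracy(records: List[Tuple[Combo, bool]], held_out: Set[Combo]) -> float:
    """Accuracy restricted to puzzles whose combination was held out.

    `records` pairs each evaluated puzzle's combination with whether the
    model answered it correctly.
    """
    hits = [ok for combo, ok in records if combo in held_out]
    return sum(hits) / len(hits) if hits else float("nan")


train_combos, held_out_combos = split_combos(held_out_fraction=0.25)
```

A model would then be trained only on puzzles generated from `train_combos`, and a large gap between its overall accuracy and `heldout_accuracy` would suggest it has memorized specific combinations rather than learned the rules compositionally.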
RPM occupies an important place in the broader effort to build AI systems capable of general reasoning rather than narrow task performance. Because the test is designed to be culturally neutral and requires no domain-specific knowledge, strong performance is intended to reflect inferential ability rather than memorized associations. As AI research increasingly targets compositional generalization and systematic reasoning, Raven's Progressive Matrices remains a widely used and interpretable tool for measuring progress.