A framework that learns from structured, relational data involving multiple interdependent entities.
Statistical Relational Learning (SRL) is a subfield of machine learning that addresses the challenge of building predictive models over data with rich relational structure. Unlike classical machine learning, which typically assumes data points are independent and identically distributed, SRL explicitly models dependencies between entities and the relationships connecting them. This makes it well suited to domains where the connections between objects carry as much predictive signal as the objects themselves. Canonical examples include social networks, knowledge graphs, biological interaction networks, and multi-relational databases.
SRL achieves this by combining two historically separate traditions: probabilistic graphical models and first-order logic (or other relational representations). Frameworks such as Markov Logic Networks (MLNs), Probabilistic Relational Models (PRMs), and Probabilistic Soft Logic (PSL) let practitioners write logical rules or relational templates, which are then grounded, that is, instantiated over the entities in a specific dataset, to produce a probabilistic model. Because every grounding of a template shares the same parameters, SRL models gain a form of structural generalization that purely propositional models lack.
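To make grounding and parameter sharing concrete, here is a minimal sketch in the style of a Markov Logic Network. The rule, the people, the friendships, and the weight value are all illustrative assumptions, not taken from any particular framework. One rule template, "Smokes(x) AND Friends(x, y) implies Smokes(y)", is grounded over every ordered pair of distinct people, and a single weight is reused for all groundings. The score computed is only the unnormalized log-score of a world (weight times the number of satisfied groundings); a real MLN would also normalize over all possible worlds.

```python
from itertools import product

def ground_rule(people):
    """Return every (x, y) grounding of the template
    Smokes(x) AND Friends(x, y) => Smokes(y), with x != y."""
    return [(x, y) for x, y in product(people, repeat=2) if x != y]

def satisfied(grounding, smokes, friends):
    """An implication is violated only when its body holds and its head does not."""
    x, y = grounding
    body = smokes[x] and (x, y) in friends
    return (not body) or smokes[y]

def log_score(people, smokes, friends, weight):
    """Unnormalized log-score: shared weight * number of satisfied groundings."""
    n_satisfied = sum(satisfied(g, smokes, friends) for g in ground_rule(people))
    return weight * n_satisfied

# Hypothetical toy dataset.
people = ["anna", "bob", "carl"]
friends = {("anna", "bob"), ("bob", "anna"), ("bob", "carl"), ("carl", "bob")}
weight = 1.5  # one parameter shared by all six groundings (illustrative value)

world_a = {"anna": True, "bob": True, "carl": True}    # violates no grounding
world_b = {"anna": True, "bob": False, "carl": False}  # violates (anna, bob)

print(log_score(people, world_a, friends, weight))  # 9.0
print(log_score(people, world_b, friends, weight))  # 7.5
```

The world in which a smoker's friend does not smoke scores lower, and adding more people would create more groundings without adding any parameters.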
The practical importance of SRL lies in its ability to propagate uncertainty and information across a graph of entities. In fraud detection, for instance, a suspicious transaction can raise the estimated risk of connected accounts even without direct evidence against them. In drug discovery, SRL models can predict protein-protein interactions by jointly reasoning over known biological relationships. This behavior, known as collective classification, updates beliefs about each entity based on the inferred labels of its neighbors. It is a hallmark capability of the SRL paradigm and distinguishes it from models that score entities in isolation.
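The fraud example can be sketched with a deliberately simplified propagation loop. This is not the inference procedure of MLNs, PRMs, or PSL; it is a hypothetical neighbor-averaging heuristic meant only to show how evidence on one entity can shift beliefs about connected ones. The function name `propagate_risk`, the `damping` parameter, the account names, and all numeric values are assumptions made for illustration.

```python
def propagate_risk(edges, prior, observed, damping=0.5, n_iters=20):
    """Iteratively blend each node's prior risk with the mean risk of its neighbors.

    edges:    iterable of undirected (node, node) pairs
    prior:    dict mapping every node to its prior risk in [0, 1]
    observed: dict of nodes whose risk is fixed by direct evidence
    """
    neighbors = {node: set() for node in prior}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    risk = dict(prior)
    risk.update(observed)
    for _ in range(n_iters):
        updated = {}
        for node in risk:
            if node in observed:
                # Direct evidence is clamped and never overwritten.
                updated[node] = observed[node]
            elif neighbors[node]:
                mean_nb = sum(risk[n] for n in neighbors[node]) / len(neighbors[node])
                updated[node] = (1 - damping) * prior[node] + damping * mean_nb
            else:
                # Isolated nodes keep their prior.
                updated[node] = prior[node]
        risk = updated
    return risk

# Hypothetical accounts: a -- b -- c form a chain, d is unconnected.
edges = [("acct_a", "acct_b"), ("acct_b", "acct_c")]
prior = {"acct_a": 0.1, "acct_b": 0.1, "acct_c": 0.1, "acct_d": 0.1}
observed = {"acct_a": 1.0}  # a suspicious transaction was seen on acct_a

risk = propagate_risk(edges, prior, observed)
for account in sorted(risk):
    print(account, round(risk[account], 3))
```

In this toy setup the risk of `acct_b` rises most, `acct_c` rises less because it is two hops from the evidence, and `acct_d` stays at its prior because it has no connections.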
SRL gained significant momentum in the machine learning community during the early-to-mid 2000s, driven by growing interest in structured prediction and the limitations of flat, feature-vector representations for complex real-world data. While deep learning on graphs (graph neural networks) has since emerged as a dominant alternative for many relational tasks, SRL frameworks remain valuable where interpretability, explicit relational reasoning, and incorporation of domain knowledge in logical form are priorities.