Counterfactual Fairness

A fairness criterion ensuring model decisions are unchanged when sensitive attributes are hypothetically altered.

Year: 2017 · Generality: 491

Counterfactual fairness is a formal criterion for algorithmic fairness that asks whether a model's decision would remain the same if an individual's sensitive attribute — such as race, gender, or age — had been different, while all other causally downstream variables were adjusted accordingly. Unlike simpler fairness metrics that operate on statistical distributions across groups, counterfactual fairness is grounded in causal inference and requires constructing an explicit causal model of the data-generating process. A model satisfies counterfactual fairness if, in the hypothetical world where only the sensitive attribute changes, the predicted outcome for an individual does not change.
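The criterion has a standard formal statement, sketched below in the usual notation, where U denotes the latent background variables of the causal model and the subscript denotes an intervention on the sensitive attribute A:

```latex
% Counterfactual fairness: the counterfactual prediction has the same
% distribution whether A is held at its observed value a or set to any
% other attainable value a', conditioned on the observed context.
P\!\left(\hat{Y}_{A \leftarrow a}(U) = y \,\middle|\, X = x, A = a\right)
  =
P\!\left(\hat{Y}_{A \leftarrow a'}(U) = y \,\middle|\, X = x, A = a\right)
\quad \text{for all outcomes } y \text{ and all attainable values } a'.
```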

The mechanism relies on structural causal models (SCMs), which represent variables as nodes in a directed acyclic graph with explicit functional relationships. To evaluate counterfactual fairness, practitioners intervene on the sensitive attribute within this causal graph and propagate the change through all variables that are causally influenced by it. This is more rigorous than simply removing the sensitive attribute from a model's inputs, because correlated proxy variables — such as zip code or name — can still encode sensitive information and introduce bias through indirect pathways.
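As a concrete illustration of this intervene-and-propagate audit, the sketch below builds a deliberately tiny, invented SCM in which a sensitive attribute A drives both a proxy Z and a feature X; every variable name, structural equation, and coefficient here is hypothetical, chosen only to make the mechanics visible:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical structural causal model (illustrative only):
#   A ~ Bernoulli(0.5)        sensitive attribute
#   U ~ Normal(0, 1)          latent background factor
#   Z = 2*A + noise_z         proxy variable (e.g. neighborhood)
#   X = U + 0.5*A + noise_x   observed feature, partly caused by A
#   Yhat = model(X, Z)        the predictor under audit

def simulate(a, u, noise_z, noise_x):
    """Propagate the structural equations given the exogenous terms."""
    z = 2.0 * a + noise_z
    x = u + 0.5 * a + noise_x
    return x, z

def model(x, z):
    """A predictor that never sees A directly but uses the proxy Z."""
    return (0.8 * x + 0.6 * z > 1.5).astype(int)

n = 10_000
a = rng.integers(0, 2, size=n)
u = rng.normal(size=n)
noise_z = rng.normal(scale=0.1, size=n)
noise_x = rng.normal(scale=0.1, size=n)

# Factual world.
x, z = simulate(a, u, noise_z, noise_x)
y_factual = model(x, z)

# Counterfactual world: intervene on A (flip it) while holding the
# exogenous terms fixed, then recompute everything downstream of A.
x_cf, z_cf = simulate(1 - a, u, noise_z, noise_x)
y_counterfactual = model(x_cf, z_cf)

flip_rate = np.mean(y_factual != y_counterfactual)
print(f"Decisions that change when A is flipped: {flip_rate:.1%}")
```

Because the audit intervenes on A and recomputes Z and X, it catches bias that flows through the proxy, which simply dropping A from the model's inputs would miss.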

Counterfactual fairness matters because it addresses a fundamental limitation of group-level fairness metrics: two models can satisfy demographic parity or equalized odds while still making decisions that are causally determined by sensitive attributes at the individual level. By operating at the level of individual causal counterfactuals, this criterion provides a stronger guarantee that sensitive characteristics are not driving outcomes, even through indirect correlations. This makes it particularly relevant in high-stakes domains like credit scoring, hiring, and criminal justice, where individual-level fairness is both ethically important and legally significant.
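To make the contrast concrete, consider a contrived predictor (purely illustrative, not drawn from any real system) whose output is the XOR of the sensitive attribute and an independent binary feature: the two groups receive positive decisions at identical rates, so demographic parity holds, yet flipping the sensitive attribute flips every individual decision.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

a = rng.integers(0, 2, size=n)   # sensitive attribute
w = rng.integers(0, 2, size=n)   # unrelated binary feature, independent of A

def predict(a, w):
    # Uses A directly, but in opposite directions depending on W,
    # so the group-level positive rates come out equal.
    return a ^ w

y = predict(a, w)
y_cf = predict(1 - a, w)         # counterfactual: flip A, keep W fixed

parity_gap = abs(y[a == 1].mean() - y[a == 0].mean())
flip_rate = np.mean(y != y_cf)

print(f"Demographic parity gap:   {parity_gap:.3f}")  # near 0: parity holds
print(f"Counterfactual flip rate: {flip_rate:.3f}")   # 1.0: every decision depends on A
```

Group-level statistics are blind to this dependence because the opposite effects of A cancel out in aggregate; the individual-level counterfactual check exposes it immediately.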

Despite its theoretical appeal, counterfactual fairness faces practical challenges. Specifying the correct causal graph requires domain expertise and is often contested, and the approach can be sensitive to modeling assumptions. Estimating counterfactual quantities from observational data is also statistically difficult. As a result, the framework is most useful as a conceptual standard and a tool for auditing, even when full implementation is not feasible.

Related

Counterfactual Explanations

Explanations showing which input changes would have produced a different model output.

Generality: 603
Fairness-Aware Machine Learning

Building ML algorithms that produce equitable outcomes across demographic groups.

Generality: 694
Algorithmic Bias

Systematic unfairness embedded in algorithmic outputs due to biased data or design.

Generality: 792
Adversarial Debiasing

A technique that uses adversarial training to reduce bias toward sensitive attributes.

Generality: 340
Causal Inference

Statistical methods for determining cause-and-effect relationships between variables.

Generality: 796
Coverage Bias

A dataset imbalance where underrepresented groups cause skewed model performance.

Generality: 520