Techniques that reduce unfair bias in machine learning models and their outputs.
De-biasing refers to the collection of methods and practices used to identify, measure, and reduce harmful biases in machine learning systems. These biases can enter a model at multiple stages: through training data that reflects historical inequities or underrepresents certain groups, through architectural choices that inadvertently encode stereotypes, or through objective functions that optimize for aggregate performance while ignoring disparate impacts across subpopulations. Left unaddressed, biased models can perpetuate or amplify discrimination in high-stakes domains such as hiring, credit scoring, medical diagnosis, and criminal justice.
De-biasing techniques operate at several levels of the machine learning pipeline. Pre-processing approaches modify or reweight training data before a model is trained—for example, by resampling underrepresented groups or applying fairness-aware data augmentation. In-processing methods incorporate fairness constraints or regularization terms directly into the training objective, penalizing models that produce disparate outcomes across demographic groups. Post-processing techniques adjust a trained model's outputs, such as recalibrating decision thresholds separately for different groups to equalize error rates. Each approach involves trade-offs, and no single method universally resolves all forms of bias.
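As a concrete illustration, the sketch below combines two of these stages on synthetic data: a pre-processing step that reweights examples so every (group, label) cell carries equal total weight, and a post-processing step that picks a separate decision threshold per group so that predicted-positive rates roughly match. The data, the single feature, the choice of scikit-learn's LogisticRegression, and the target rate are all illustrative assumptions, not a prescribed pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data: one feature, a binary group attribute, and a label
# whose base rate differs by group, a common source of disparate outcomes.
n = 4000
group = rng.integers(0, 2, size=n)
x = rng.normal(loc=0.5 * group, scale=1.0, size=n)
y = (rng.random(n) < np.where(group == 1, 0.6, 0.3)).astype(int)
X = np.column_stack([x, group])

# Pre-processing: reweight so every (group, label) cell contributes equal
# total weight, preventing any one cell from dominating the objective.
weights = np.empty(n)
for g in (0, 1):
    for label in (0, 1):
        cell = (group == g) & (y == label)
        weights[cell] = n / (4 * cell.sum())

model = LogisticRegression().fit(X, y, sample_weight=weights)
scores = model.predict_proba(X)[:, 1]

# Post-processing: pick a per-group threshold so each group's
# predicted-positive rate matches the overall base rate
# (a rough demographic-parity adjustment).
target_rate = y.mean()
thresholds = {g: np.quantile(scores[group == g], 1 - target_rate)
              for g in (0, 1)}
per_row_threshold = np.where(group == 1, thresholds[1], thresholds[0])
y_hat = (scores >= per_row_threshold).astype(int)

for g in (0, 1):
    rate = y_hat[group == g].mean()
    print(f"group {g}: predicted-positive rate = {rate:.3f}")
```

Note the trade-off this sketch makes visible: equalizing selection rates across groups generally shifts the groups' error rates, which is exactly the kind of tension described above.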
A central challenge in de-biasing is that "fairness" is not a single, mathematically consistent concept. Researchers have identified dozens of formal fairness criteria—demographic parity, equalized odds, calibration, and individual fairness, among others—and several of them are provably impossible to satisfy simultaneously except in degenerate cases, such as when base rates are identical across groups or the classifier is perfect. This means that de-biasing is inherently a value-laden exercise requiring explicit choices about which harms to prioritize, informed by domain expertise, legal frameworks, and input from affected communities.
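Stated in the notation standard in the fairness literature (the symbols are assumptions introduced here: predicted label \(\hat{Y}\), true outcome \(Y\), model score \(S\), protected attribute \(A\) with groups \(a\) and \(b\)), the first three criteria read:

```latex
% Demographic parity: predictions are independent of group membership.
P(\hat{Y} = 1 \mid A = a) = P(\hat{Y} = 1 \mid A = b)

% Equalized odds: equal true- and false-positive rates across groups.
P(\hat{Y} = 1 \mid Y = y, A = a) = P(\hat{Y} = 1 \mid Y = y, A = b),
    \qquad y \in \{0, 1\}

% Calibration within groups: a score of s corresponds to the same
% probability of a positive outcome in every group.
P(Y = 1 \mid S = s, A = a) = s \qquad \text{for all } s \text{ and } a
```

A well-known consequence of these definitions is that calibration and equalized odds cannot both hold when the base rates \(P(Y = 1 \mid A)\) differ across groups, unless the predictor is perfect.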
De-biasing grew into a major research area in the mid-2010s, as AI deployment in consequential decisions became widespread. It sits at the intersection of machine learning, ethics, law, and social science, and is now considered a core component of responsible AI development. Regulatory frameworks in the EU and elsewhere increasingly mandate bias audits and fairness documentation, making de-biasing not just an ethical imperative but a compliance requirement for organizations deploying AI systems.