AI has enabled industries to automate their processes, make accurate predictions, and optimize how they make decisions. However, AI models have brought about ethical and societal concerns, especially around fairness.
To mitigate bias, researchers have developed several fairness metrics to monitor. These metrics fall into two categories, which we will examine in detail.
The first category consists of group fairness metrics, which compare results across various protected or demographic groups:
A model’s outcomes must be independent of specified demographic attributes. For example, in an AI model that screens resumes, demographic parity is achieved if the selection rate is the same for all genders.
Demographic parity does not, however, take into account other important factors, such as an individual’s qualifications. If ML models fail to achieve demographic parity, the result can be allocation harms: information, resources, or opportunities distributed unevenly across different groups.
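As a concrete illustration, demographic parity can be checked by comparing per-group selection rates. The helper below is a minimal sketch; the group labels and decisions are hypothetical, and libraries such as Fairlearn provide production-ready equivalents.

```python
from collections import defaultdict

def selection_rates(groups, predictions):
    """Fraction of positive predictions (e.g., resumes selected) per group."""
    totals, positives = defaultdict(int), defaultdict(int)
    for group, pred in zip(groups, predictions):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}

def demographic_parity_gap(groups, predictions):
    """Largest difference in selection rates between any two groups (0 = parity)."""
    rates = selection_rates(groups, predictions)
    return max(rates.values()) - min(rates.values())

# Hypothetical screening decisions: 1 = selected, 0 = rejected.
groups = ["men", "men", "women", "women"]
preds  = [1, 0, 1, 1]
print(selection_rates(groups, preds))        # selection rate per group
print(demographic_parity_gap(groups, preds)) # 0 would indicate parity
```

Here men are selected at a rate of 0.5 and women at 1.0, giving a parity gap of 0.5, so this (toy) model would fail demographic parity.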
There must be an equal likelihood of positive outcomes across monitored groups. For instance, in job appraisals, men and women should be equally likely to receive promotions; neither should be advantaged or disadvantaged because of their gender.
There must be equal and fair access to opportunities, irrespective of demographics. This metric helps to eliminate bias by ensuring that decisions are based on merits instead of protected attributes.
To accomplish this, developers and researchers design AI systems that are fair and unbiased while addressing any historical disadvantages.
Let us take a hypothetical scenario in which MIT admits students from both Greenfield High School and East Boston High School. Greenfield High School provides a comprehensive mathematics curriculum, and most of its students qualify for the university program.
Suppose that East Boston High School doesn’t offer rigorous math classes, and far fewer students qualify.
Equality of opportunity requires that equally qualified students have the same chance of admission to MIT, regardless of which high school they attended.
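Equality of opportunity is commonly operationalised as equal true-positive rates: among qualified candidates, the acceptance rate should match across groups. A minimal sketch with made-up admissions data (the school names follow the example above; the labels are hypothetical):

```python
from collections import defaultdict

def true_positive_rates(groups, qualified, admitted):
    """Admission rate among qualified candidates, per group."""
    qualified_counts, admitted_counts = defaultdict(int), defaultdict(int)
    for group, q, a in zip(groups, qualified, admitted):
        if q == 1:  # condition on actually being qualified
            qualified_counts[group] += 1
            admitted_counts[group] += a
    return {g: admitted_counts[g] / qualified_counts[g] for g in qualified_counts}

# Hypothetical data: school, whether the student is qualified, whether admitted.
schools   = ["Greenfield", "Greenfield", "Greenfield", "East Boston", "East Boston"]
qualified = [1, 1, 0, 1, 0]
admitted  = [1, 1, 0, 1, 0]
print(true_positive_rates(schools, qualified, admitted))
```

In this toy data every qualified student is admitted, so both schools have a true-positive rate of 1.0 and equality of opportunity holds, even though Greenfield sends more students overall.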
The second category, individual fairness, demands that similar individuals receive similar treatment, regardless of protected attributes. The notion was first formalized by Cynthia Dwork and her co-authors in their 2012 paper, Fairness Through Awareness.
The model’s predictions must be equally accurate across demographic groups defined by a sensitive attribute.
For example, the probability of an unqualified applicant being correctly rejected and that of a qualified applicant being hired should be the same across all protected groups.
Equalized odds does not account for the potential harms caused by errors, which may differ depending on the stakes and context.
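Equalized odds requires both the true-positive rate and the false-positive rate to match across groups. The sketch below computes both per group on hypothetical hiring labels; a fairness library would be preferable in practice.

```python
from collections import defaultdict

def group_rates(groups, y_true, y_pred):
    """Per-group (true-positive rate, false-positive rate); equalized odds compares both."""
    counts = defaultdict(lambda: {"tp": 0, "pos": 0, "fp": 0, "neg": 0})
    for g, truth, pred in zip(groups, y_true, y_pred):
        c = counts[g]
        if truth == 1:
            c["pos"] += 1
            c["tp"] += pred
        else:
            c["neg"] += 1
            c["fp"] += pred
    return {g: (c["tp"] / c["pos"], c["fp"] / c["neg"]) for g, c in counts.items()}

# Hypothetical data: group label, true qualification, model's hiring decision.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
print(group_rates(groups, y_true, y_pred))
```

Here group A gets a true-positive rate of 0.5 and a false-positive rate of 0.5, while group B gets 1.0 and 0.0, so this toy model violates equalized odds in both directions.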
This metric measures the accuracy of predicted probabilities: a model is well calibrated if, among cases assigned a probability p, the outcome actually occurs about p of the time. Calibration ensures that models do not drive decisions based on inaccurate probability estimates; a miscalibrated model can produce unfair outcomes.
Calibration is applied in AI models that provide probability estimates, including support vector machines, logistic regression, and neural networks.
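One simple way to inspect calibration is to bin predictions and compare each bin's mean predicted probability with the observed outcome frequency. This is a hand-rolled sketch on made-up scores; scikit-learn's `calibration_curve` offers the same diagnostic out of the box.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """For each probability bin, return (mean predicted prob, observed frequency, count)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into the last bin
        bins[idx].append((p, y))
    table = []
    for b in bins:
        if b:  # skip empty bins
            mean_p = sum(p for p, _ in b) / len(b)
            observed = sum(y for _, y in b) / len(b)
            table.append((mean_p, observed, len(b)))
    return table

# Hypothetical scores: low scores should rarely be positive, high scores usually positive.
probs    = [0.1, 0.2, 0.8, 0.9]
outcomes = [0,   0,   1,   1]
print(calibration_table(probs, outcomes, n_bins=2))
```

If the model were calibrated, each row's mean predicted probability and observed frequency would roughly agree; large gaps in a particular demographic group's table are a fairness warning sign.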
Fairness in AI models represents an important ethical responsibility. If bias exists in these models, the result can be unjust outcomes and social inequality for certain demographic groups. Therefore, to mitigate these biases, fairness metrics must be incorporated into AI systems from the start.