Section 1 Introduction


Section 2 Fairness in Machine Learning: Key Methodological Components


 2.1 Sensitive and Protected Variables and (Un)privileged Groups

Most approaches to mitigating unfairness, bias, or discrimination are based on the notion of protected or sensitive variables (we will use the terms interchangeably) and on (un)privileged groups: groups (often defined by one or more sensitive variables) that are disproportionately (less) more likely to be positively classified.
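1. Some variables are explicitly defined by law as “protected”.
2. It is still worth asking whether other minority variables should be protected; some work focuses on identifying potentially sensitive variables.
3. Some variables are not strictly sensitive but are related to one or more sensitive variables: “related” variables.
4. Ignoring these “related” variables may lead to the false assumption that a fair ML model has been produced, which increases the risk of discrimination.
5. On proxy variables: it helps to first read the proxy-variable part of this explanation.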
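As a rough illustration of how “related” variables might be spotted, the sketch below flags features whose correlation with a binary sensitive variable exceeds a cutoff. The function names, the toy columns (`zip_area`, `income`), and the 0.5 cutoff are all assumptions made for this example, and plain correlation is only one of several possible screening heuristics.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information about the other
    return cov / sqrt(vx * vy)

def flag_related_variables(features, sensitive, threshold=0.5):
    """Names of features whose |correlation| with the sensitive variable is >= threshold."""
    return [name for name, col in features.items()
            if abs(pearson(col, sensitive)) >= threshold]

# Hypothetical toy data: `zip_area` tracks the sensitive variable, `income` does not.
sensitive = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "zip_area": [1, 1, 1, 2, 2, 2, 2, 2],
    "income": [30, 50, 40, 60, 30, 50, 40, 60],
}
flagged = flag_related_variables(features, sensitive)
print(flagged)
```

A flagged feature is only a candidate for closer inspection; a low correlation does not rule out a non-linear or combined relationship with the sensitive variable.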

 2.2 Metrics

Metrics usually emphasize either individual fairness (e.g. everyone is treated equally) or group fairness, where the latter is further differentiated into within-group (e.g. women vs. men) and between-group (e.g. young women vs. black men) fairness.

Increasing fairness often results in lower overall accuracy or related metrics, leading to the necessity of analyzing potentially achievable trade-offs in a given scenario.

2.3 Pre-processing


2.4 In-processing
2.5 Post-processing
2.6 pre-processing vs. in-processing vs. post-processing
A distinct advantage of pre- and post-processing approaches is that they do not modify the ML method explicitly. This means that (open source) ML libraries can be leveraged unchanged for model training. However, these approaches have no direct control over the optimization function of the ML model itself.
Only in-processing approaches can optimize notions of fairness during model training. Yet, this requires the optimization function to be accessible, replaceable, and/or modifiable, which may not always be the case.

Section 3 Measuring Fairness and Bias

3.1 Abstract Fairness Criteria


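        ① Sensitive variable S (distinguishes protected from unprotected groups); ② target variable Y (the true class); ③ classification score R (the predicted classification result)

        Based on these three elements, general fairness desiderata are grouped into three “non-discrimination” criteria:

        ① Independence: the score R is independent of the sensitive variable S, e.g., Statistical/Demographic Parity.

        ② Separation: the score R is independent of the sensitive variable S conditional on the target variable Y, e.g., Equalized Odds and Equal Opportunity.

        ③ Sufficiency: the target variable Y is independent of the sensitive variable S conditional on the score R.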
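The three criteria can be restated compactly in conditional-independence notation. The symbol \perp (statistical independence) is notation introduced here for illustration; S, Y, and R are the three elements listed above.

```latex
\begin{align*}
\text{Independence:} \quad & R \perp S \\
\text{Separation:}   \quad & R \perp S \mid Y \\
\text{Sufficiency:}  \quad & Y \perp S \mid R
\end{align*}
```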

3.2 Group Fairness Metrics
3.2.1 Parity-based Metrics
Parity-based metrics typically consider the predicted positive rates, i.e., P_{r}(\widehat{y}=1) , across different groups.
e.g., Statistical/Demographic Parity: P_{r}(\widehat{y}=1|g_{i})=P_{r}(\widehat{y}=1|g_{j});
        Disparate Impact: \frac{P_{r}(\widehat{y}=1|g_{1})}{P_{r}(\widehat{y}=1|g_{2})}.
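The two parity-based metrics can be estimated directly from binary predictions and group labels. The following is a minimal sketch with hypothetical function names and toy data; it assumes every group has at least one member and that the reference group has a non-zero positive rate.

```python
def positive_rate(y_pred, groups, g):
    """Estimate P(y_hat = 1 | group = g) from binary predictions."""
    preds = [p for p, grp in zip(y_pred, groups) if grp == g]
    return sum(preds) / len(preds)

def statistical_parity_difference(y_pred, groups, g1, g2):
    """Difference of predicted positive rates; 0 means statistical parity holds."""
    return positive_rate(y_pred, groups, g1) - positive_rate(y_pred, groups, g2)

def disparate_impact(y_pred, groups, g1, g2):
    """Ratio of predicted positive rates; 1 means no disparate impact."""
    return positive_rate(y_pred, groups, g1) / positive_rate(y_pred, groups, g2)

# Toy data: group "a" is positively classified 1 time in 4, group "b" 3 times in 4.
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
spd = statistical_parity_difference(y_pred, groups, "a", "b")
di = disparate_impact(y_pred, groups, "a", "b")
print(spd, di)
```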
3.2.2 Confusion Matrix-based Metrics
While parity-based metrics typically consider variants of the predicted positive rate P_{r}(\widehat{y}=1), confusion matrix-based metrics take into consideration additional aspects such as True Positive Rate (TPR), True Negative Rate (TNR), False Positive Rate (FPR), and False Negative Rate (FNR).

 e.g., Equal Opportunity: considers the true positive rate, P_{r}(\widehat{y}=1|y=1\&g_{i})=P_{r}(\widehat{y}=1|y=1\&g_{j})

         Equalized Odds: considers both the true positive and the false positive rate, P_{r}(\widehat{y}=1|y=1\&g_{i})=P_{r}(\widehat{y}=1|y=1\&g_{j}) \ \& \ P_{r}(\widehat{y}=1|y=0\&g_{i})=P_{r}(\widehat{y}=1|y=0\&g_{j})

         Overall accuracy equality: considers accuracy, P_{r}(\widehat{y}=1|y=1\&g_{i})+P_{r}(\widehat{y}=0|y=0\&g_{i}) = P_{r}(\widehat{y}=1|y=1\&g_{j})+P_{r}(\widehat{y}=0|y=0\&g_{j})

         Conditional use accuracy equality: considers the positive and negative predictive values, i.e., how often a prediction is correct given the predicted class, P_{r}(y=1|\widehat{y}=1\&g_{i})=P_{r}(y=1|\widehat{y}=1\&g_{j}) \ \& \ P_{r}(y=0|\widehat{y}=0\&g_{i})=P_{r}(y=0|\widehat{y}=0\&g_{j})

         Treatment equality: considers the ratio of false positives to false negatives, \frac{P_{r}(\widehat{y}=1|y=0\&g_{i})}{P_{r}(\widehat{y}=0|y=1\&g_{i})}= \frac{P_{r}(\widehat{y}=1|y=0\&g_{j})}{P_{r}(\widehat{y}=0|y=1\&g_{j})}
         Equalizing disincentives: considers the difference between the true positive and the false positive rate, P_{r}(\widehat{y}=1|y=1\&g_{i})-P_{r}(\widehat{y}=1|y=0\&g_{i}) = P_{r}(\widehat{y}=1|y=1\&g_{j})-P_{r}(\widehat{y}=1|y=0\&g_{j})
         Conditional Equal Opportunity: equal opportunity conditioned on a specific attribute value a, where \tau is a threshold, P_{r}(\widehat{y}\geq \tau |g_{i} \& y< \tau \& A=a) = P_{r}(\widehat{y}\geq \tau |g_{j} \& y< \tau \&A=a)
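To make the confusion matrix-based definitions concrete, here is a small sketch that computes per-group TPR and FPR and reports how far Equal Opportunity and Equalized Odds are from holding. Reporting the violation as an absolute gap (and taking the larger of the two gaps for Equalized Odds) is a choice made for this example, not something prescribed by the definitions; the sketch also assumes each group contains at least one positive and one negative example.

```python
def rates(y_true, y_pred, groups, g):
    """Return (TPR, FPR) for group g from binary labels and predictions."""
    tp = fp = fn = tn = 0
    for y, p, grp in zip(y_true, y_pred, groups):
        if grp != g:
            continue
        if y == 1 and p == 1:
            tp += 1
        elif y == 0 and p == 1:
            fp += 1
        elif y == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), fp / (fp + tn)

def equal_opportunity_gap(y_true, y_pred, groups, g1, g2):
    """Absolute TPR difference between the two groups (0 = Equal Opportunity holds)."""
    tpr1, _ = rates(y_true, y_pred, groups, g1)
    tpr2, _ = rates(y_true, y_pred, groups, g2)
    return abs(tpr1 - tpr2)

def equalized_odds_gap(y_true, y_pred, groups, g1, g2):
    """Larger of the TPR and FPR gaps (0 = Equalized Odds holds)."""
    tpr1, fpr1 = rates(y_true, y_pred, groups, g1)
    tpr2, fpr2 = rates(y_true, y_pred, groups, g2)
    return max(abs(tpr1 - tpr2), abs(fpr1 - fpr2))

# Toy data: two groups of four individuals each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
eo_gap = equal_opportunity_gap(y_true, y_pred, groups, "a", "b")
eodds_gap = equalized_odds_gap(y_true, y_pred, groups, "a", "b")
print(eo_gap, eodds_gap)
```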
3.2.3 Calibration-based Metrics
Calibration-based metrics take the predicted probability, or score, into account.
 e.g., Test fairness / calibration / matching conditional frequencies: P_{r}(y=1|S=s\&g_{i})=P_{r}(y=1|S=s\&g_{j}), where S here denotes the predicted score (not the sensitive variable of Section 3.1)
         Well calibration: P_{r}(y=1|S=s\&g_{i})=P_{r}(y=1|S=s\&g_{j})=s
         Balance for positive and negative class: the expected predicted score is equal across all groups for the positive class and for the negative class, E(S|y=1\&g_{i})=E(S|y=1\&g_{j}), \quad E(S|y=0\&g_{i})=E(S|y=0\&g_{j})
         Bayesian Fairness
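For the calibration-based metrics, a simple empirical check is to bin the predicted scores and compare, per group and per bin, the mean score with the observed fraction of true positives. The sketch below assumes scores in [0, 1] and equal-width bins; the function name, the bin count, and the toy data are illustrative assumptions.

```python
from collections import defaultdict

def calibration_table(y_true, scores, groups, n_bins=2):
    """Map (group, bin) -> (mean score, observed positive rate, count)."""
    cells = defaultdict(list)
    for y, s, g in zip(y_true, scores, groups):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 falls in the top bin
        cells[(g, b)].append((s, y))
    table = {}
    for key, items in cells.items():
        n = len(items)
        mean_score = sum(s for s, _ in items) / n
        observed = sum(y for _, y in items) / n
        table[key] = (mean_score, observed, n)
    return table

# Toy data: group "a" is well calibrated, group "b" is under-scored.
scores = [0.25] * 4 + [0.75] * 4 + [0.25] * 4 + [0.75] * 4
y_true = [0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1]
groups = ["a"] * 8 + ["b"] * 8
table = calibration_table(y_true, scores, groups)
for key in sorted(table):
    print(key, table[key])
```

Test fairness asks that the observed rate in a bin match across groups; well calibration additionally asks that it match the score itself.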
3.3 Individual and Counterfactual Fairness Metrics
Individual and counterfactual fairness metrics consider the outcome for each participating individual.
e.g., Counterfactual Fairness,
        Generalized Entropy Index: considers differences in an individual’s prediction (b_i) to the average prediction accuracy (\mu), GEI=\frac{1}{n\alpha (\alpha -1)}\sum_{i=1}^{n}\left[\left(\frac{b_i}{\mu }\right)^\alpha -1\right],\ b_i=\widehat{y}_i-y_i+1 \ \text{and} \ \mu =\frac{\sum_{i} b_i}{n}
        Theil Index: the GEI with \alpha=1, which simplifies to GEI=\frac{1}{n}\sum_{i=1}^{n}\frac{b_i}{\mu }\log\frac{b_i}{\mu }
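The two formulas above translate almost line by line into code. This sketch assumes binary labels and predictions, so each b_i is 0, 1, or 2; it treats 0 · log 0 as 0 in the Theil case (the usual limit convention), requires \alpha not equal to 0 in the general case, and assumes \mu is non-zero. The function name is hypothetical.

```python
from math import log

def generalized_entropy_index(y_true, y_pred, alpha=2):
    """GEI over individual benefits b_i = y_hat_i - y_i + 1; alpha=1 gives the Theil index."""
    b = [p - y + 1 for y, p in zip(y_true, y_pred)]
    n = len(b)
    mu = sum(b) / n
    if alpha == 1:
        # Theil index; terms with b_i = 0 contribute 0 by convention.
        return sum((bi / mu) * log(bi / mu) for bi in b if bi > 0) / n
    return sum((bi / mu) ** alpha - 1 for bi in b) / (n * alpha * (alpha - 1))

# Toy data: one false negative (b = 0) and one false positive (b = 2).
y_true = [1, 0, 1, 0]
y_pred = [1, 0, 0, 1]
gei = generalized_entropy_index(y_true, y_pred, alpha=2)
theil = generalized_entropy_index(y_true, y_pred, alpha=1)
print(gei, theil)
```

Both values are 0 when every individual receives the same benefit, for example when all predictions are correct.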

Section 4 Fairness Research in Binary Classification Settings

Blinding is the approach of making a classifier “immune” to one or more sensitive variables.
Causal Methods
A key objective is to uncover causal relationships in the data and find dependencies between sensitive and non-sensitive variables.
Sampling and Subgroup Analysis

① Correct the training data; ② use subgroup analysis to find the groups that the classifier disadvantages


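Subgroup analysis can also be used for model evaluation, e.g., to analyze whether a particular subgroup is discriminated against, or to confirm whether a particular factor affects model fairness. Statistical hypothesis testing assesses whether a model robustly satisfies a fairness metric. By sampling over the sensitive variables, probabilistic verification of fairness measures has also been proposed, to evaluate a trained model within some (small) confidence bound.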
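One way to apply statistical hypothesis testing to a subgroup comparison is a permutation test on the gap in positive prediction rates. This is a generic sketch, not the specific procedure of any cited work: the function name, the number of permutations, the fixed seed, and the toy data are assumptions for illustration.

```python
import random

def permutation_test(y_pred, groups, g1, g2, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the gap in positive rates between g1 and g2."""
    rng = random.Random(seed)
    pairs = [(p, g) for p, g in zip(y_pred, groups) if g in (g1, g2)]
    preds = [p for p, _ in pairs]
    labels = [g for _, g in pairs]

    def gap(lbls):
        r1 = [p for p, g in zip(preds, lbls) if g == g1]
        r2 = [p for p, g in zip(preds, lbls) if g == g2]
        return abs(sum(r1) / len(r1) - sum(r2) / len(r2))

    observed = gap(labels)
    hits = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # break any link between group and prediction
        if gap(shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data: 16 of 20 positives in group "a" versus 4 of 20 in group "b".
y_pred = [1] * 16 + [0] * 4 + [1] * 4 + [0] * 16
groups = ["a"] * 20 + ["b"] * 20
observed_gap, p_value = permutation_test(y_pred, groups, "a", "b")
print(observed_gap, p_value)
```

A small p-value suggests the observed gap is unlikely under the hypothesis that group membership and prediction are unrelated; it says nothing about which fairness metric is appropriate.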




Relabelling and Perturbation



Perturbation often aligns with notions of “repairing” some aspect(s) of the data with regard to notions of fairness.
Sensitivity analysis explores how various aspects of the feature vector affect a given outcome. Although sensitivity analysis is not itself a method for improving fairness, it can help to better understand the uncertainty around fairness.
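A very simple form of sensitivity analysis is to change one feature of each record and count how often the prediction flips. In the sketch below, `predict` is a hypothetical stand-in model that deliberately favours group "a"; the function name `flip_sensitivity` and the record layout are likewise assumptions for illustration.

```python
def flip_sensitivity(predict, rows, feature, values):
    """Fraction of rows whose prediction changes when `feature` takes another value."""
    changed = 0
    for row in rows:
        base = predict(row)
        alternatives = [v for v in values if v != row[feature]]
        if any(predict({**row, feature: v}) != base for v in alternatives):
            changed += 1
    return changed / len(rows)

def predict(row):
    """Hypothetical model: group "a" receives a 10-point bonus before thresholding at 50."""
    bonus = 10 if row["group"] == "a" else 0
    return int(row["income"] + bonus >= 50)

rows = [
    {"income": 45, "group": "a"},  # flips to 0 if treated as group "b"
    {"income": 45, "group": "b"},  # flips to 1 if treated as group "a"
    {"income": 70, "group": "b"},  # accepted either way
    {"income": 20, "group": "a"},  # rejected either way
]
sensitivity = flip_sensitivity(predict, rows, "group", ["a", "b"])
print(sensitivity)
```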
Regularization and Constraint Optimisation



Adversarial Learning
When applied to applications of fairness in ML, an adversary instead seeks to determine whether the training process is fair, and when not, feedback from the adversary is used to improve the model.
Bandit approaches frame the fairness problem as a stochastic multi-armed bandit framework, assigning either individuals to arms, or groups of “similar” individuals to arms, and fairness quality as a reward represented as regret.
The two main notions of fairness that have emerged from the application of bandits are meritocratic fairness (group agnostic) and subjective fairness (emphasises fairness in each time period t of the bandit framework).
Calibration is the process of ensuring that the proportion of positive predictions is equal to the proportion of positive examples.


Thresholding is a post-processing approach which is motivated on the basis that discriminatory decisions are often made close to decision making boundaries because of a decision maker’s bias [157] and that humans apply threshold rules when making decisions.
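As an illustration of thresholding as a post-processing step, the sketch below picks a separate decision threshold per group so that each group reaches at least a target true positive rate. Targeting TPR is one possible choice among several; the function name, the target value, and the toy scores are assumptions, and each group is assumed to contain at least one positive example.

```python
import math

def group_thresholds(scores, y_true, groups, target_tpr):
    """Per group, the largest threshold whose TPR (score >= threshold) reaches target_tpr."""
    thresholds = {}
    for g in sorted(set(groups)):
        positives = sorted(
            (s for s, y, grp in zip(scores, y_true, groups) if grp == g and y == 1),
            reverse=True,
        )
        # Number of true positives that must be accepted to reach the target.
        k = max(1, math.ceil(target_tpr * len(positives)))
        thresholds[g] = positives[k - 1]
    return thresholds

# Toy data: group "b" receives systematically lower scores than group "a".
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.7, 0.6, 0.5, 0.4, 0.2]
y_true = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
groups = ["a"] * 5 + ["b"] * 5
thresholds = group_thresholds(scores, y_true, groups, target_tpr=0.75)
print(thresholds)
```

With a single shared threshold of 0.7, group "b" would reach a TPR of only 0.25 on this toy data, which is the kind of boundary effect the group-specific thresholds are meant to offset.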

Section 5 Fairness Approaches Beyond Binary Classification

Fair Regression
Recommender Systems
“C-fairness” for fair user/consumer recommendation (user-based)
“P-fairness” for fairness of producer recommendation (item-based)


Unsupervised Methods
1) fair clustering
2) investigating the presence and detection of discrimination in association rule mining
3) transfer learning
Unintended biases have also been noticed in NLP; these are often gender or race focused.

Section 6 Current Platforms: Open-Source Tools

Set of tools that provides several pre-, in-, and post-processing approaches for binary classification as well as several pre-implemented datasets that are commonly used in fairness research.
Implements several parity-based fairness measures and algorithms for binary classification and regression as well as a dashboard to visualize disparity in accuracy and parity.
Open source bias audit toolkit. Focuses on standard ML metrics and their evaluation for different subgroups of a protected attribute.
Provides datasets, metrics, and algorithms to measure and mitigate bias in classification as well as NLP (bias in word embeddings).
Tool that provides commonly used fairness metrics (e.g., statistical parity, equalized odds) for R projects.
Generic framework that provides measures and statistical tests to detect unwanted associations between the output of an algorithm and a sensitive attribute.
Fairness Measures
Project that considers quantitative definitions of discrimination in classification and ranking scenarios. Provides datasets, measures, and algorithms (for ranking) that investigate fairness.
Audit AI
Implements various statistical significance tests to detect discrimination between groups and bias from standard machine learning procedures.
Dataset Nutrition Label
Generates qualitative and quantitative measures and descriptions of dataset health to assess the quality of a dataset used for training and building ML models.
ML Fairness Gym
A Google simulation toolkit, built on the OpenAI Gym framework, to study long-run impacts of ML decisions. Analyzes how algorithms that take fairness into consideration change the underlying data (previous classifications) over time.

Section 7 Concluding Remarks: The Fairness Dilemmas

① Balancing the tradeoff between fairness and model performance

② Quantitative notions of fairness permit model optimization, yet cannot balance different notions of fairness, i.e., individual vs. group fairness

③ Tensions between fairness, situational, ethical, and sociocultural context, and policy

④ Recent advances to the state of the art have increased the skills gap inhibiting “man-on-the-street” and industry uptake





