Bounding and Approximating Intersectional Fairness through Marginal Fairness

Title

Bounding and Approximating Intersectional Fairness through Marginal Fairness

Authors

Mathieu Molina, Patrick Loiseau

Abstract

Discrimination in machine learning often arises along multiple dimensions (a.k.a. protected attributes); it is then desirable to ensure intersectional fairness, i.e., that no subgroup is discriminated against. It is known that ensuring marginal fairness for every dimension independently is not sufficient in general. Due to the exponential number of subgroups, however, directly measuring intersectional fairness from data is impossible. In this paper, our primary goal is to understand in detail the relationship between marginal and intersectional fairness through statistical analysis. We first identify a set of sufficient conditions under which an exact relationship can be obtained. Then, we prove high-probability bounds on intersectional fairness in the general case; these bounds are easily computable from marginal fairness and other meaningful statistical quantities. Beyond their descriptive value, we show that these theoretical bounds can be leveraged to derive a heuristic that improves the approximation and bounds of intersectional fairness by choosing, in a relevant manner, the protected attributes for which we describe intersectional subgroups. Finally, we test the performance of our approximations and bounds on real and synthetic datasets.
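The abstract's claim that marginal fairness does not guarantee intersectional fairness can be illustrated with a small toy computation. The sketch below is not taken from the paper: the subgroup rates are made-up numbers, the four subgroups are assumed to be equally sized, and the fairness measure (the largest difference in positive-prediction rate between groups, a demographic-parity-style gap) is an assumption chosen for simplicity and may differ from the definition the authors use.

```python
# Hypothetical toy data: positive-prediction rate for each subgroup (a, b)
# defined by two binary protected attributes. All four subgroups are
# assumed to have the same size, so marginal rates are plain averages.
subgroup_rate = {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.8}


def marginal_rates(rates, axis):
    """Average positive rate for each value of one protected attribute."""
    out = {}
    for value in (0, 1):
        group = [r for key, r in rates.items() if key[axis] == value]
        out[value] = sum(group) / len(group)
    return out


def parity_gap(rates):
    """Largest difference in positive rate between any two groups."""
    values = list(rates.values())
    return max(values) - min(values)


# Gap measured along each protected attribute separately (marginal view).
marginal_gap_a = parity_gap(marginal_rates(subgroup_rate, 0))
marginal_gap_b = parity_gap(marginal_rates(subgroup_rate, 1))

# Gap measured across all intersectional subgroups.
intersectional_gap = parity_gap(subgroup_rate)

print("marginal gap (attribute A):", marginal_gap_a)
print("marginal gap (attribute B):", marginal_gap_b)
print("intersectional gap:", intersectional_gap)
```

In this constructed example both marginal gaps vanish while the intersectional gap stays large, which is the kind of situation that motivates relating the two notions. With d binary attributes there are 2^d such subgroups, so the per-subgroup rates used here quickly become hard to estimate from finite data.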
