论文标题
公平表示与几个受保护的类
Fair Representation Clustering with Several Protected Classes
论文作者
论文摘要
我们研究了公平的$ k $ -median的问题,其中每个集群都必须对不同群体的个人进行公平代表。在公平代表$ k $ -Median问题中,我们在公制空间中获得了一组点$ x $。 x $中的每个点$ x \属于$ \ ell $组之一。此外,我们将获得公平表示参数$α_j$和$β_j$的每个组$ j \ in [\ ell] $。我们说,如果$ k $ c_1,\ cdots,c_k $公平地表示所有组,如果compuster $ c_i $ in Compuster $ j $ c_i $ in $α_j| c_i | c_i | $ c_i | $和$β_j| c_i | c_i | $ in [\ ell] $ in [\ ell] $ in [\ ell]和$ i \ in [k] $。目的是找到$ k $中心的$ \ nathcal {c} $ $ ϕ:x \ rightarrow \ mathcal {c} $,以便由$(\ nathcal {c},ϕ)$定义的群集,公平地表示所有组,并最小化$ \ ell_1 $ \ ell_1 $ - imp_ $ \ - ϕ(x))$。 我们提出了一个$ o(\ log k)$ - 近似算法,该算法在时间$ n^{o(\ ell)} $中运行。请注意,该问题的已知算法要么(i)违反了添加期术语的公平约束,要么(ii)在$ k $和$ \ ell $中均及时运行。我们还考虑了问题的重要特殊情况,其中$α_j=β_j= \ frac {f_j} {f} $ and $ f_j,f \ in \ mathbb {n} $ for in [\ ell] $中的所有$ j \ in \ mathbb {n} $。对于此特殊情况,我们提供了一个$ O(\ log K)$ - 近似算法,该算法以$(kf)^{o(\ ell)} \ log n + n + poly(n)$ time运行。
We study the problem of fair $k$-median where each cluster is required to have a fair representation of individuals from different groups. In the fair representation $k$-median problem, we are given a set of points $X$ in a metric space. Each point $x\in X$ belongs to one of $\ell$ groups. Further, we are given fair representation parameters $α_j$ and $β_j$ for each group $j\in [\ell]$. We say that a $k$-clustering $C_1, \cdots, C_k$ fairly represents all groups if the number of points from group $j$ in cluster $C_i$ is between $α_j |C_i|$ and $β_j |C_i|$ for every $j\in[\ell]$ and $i\in [k]$. The goal is to find a set $\mathcal{C}$ of $k$ centers and an assignment $ϕ: X\rightarrow \mathcal{C}$ such that the clustering defined by $(\mathcal{C}, ϕ)$ fairly represents all groups and minimizes the $\ell_1$-objective $\sum_{x\in X} d(x, ϕ(x))$. We present an $O(\log k)$-approximation algorithm that runs in time $n^{O(\ell)}$. Note that the known algorithms for the problem either (i) violate the fairness constraints by an additive term or (ii) run in time that is exponential in both $k$ and $\ell$. We also consider an important special case of the problem where $α_j = β_j = \frac{f_j}{f}$ and $f_j, f \in \mathbb{N}$ for all $j\in [\ell]$. For this special case, we present an $O(\log k)$-approximation algorithm that runs in $(kf)^{O(\ell)}\log n + poly(n)$ time.