Censored and Fair Universal Representations using Generative Adversarial Models
We present a data-driven framework for learning \textit{censored and fair
universal representations} (CFUR) that ensure statistical fairness guarantees
for all downstream learning tasks that may not be known \textit{a priori}. Our
framework leverages recent advancements in adversarial learning to allow a data
holder to learn censored and fair representations that decouple a set of
sensitive attributes from the rest of the dataset. The resulting problem of
finding the optimal randomizing mechanism with specific fairness/censoring
guarantees is formulated as a constrained minimax game between an encoder and
an adversary where the constraint ensures a measure of usefulness (utility) of
the representation. We show that for appropriately chosen adversarial loss
functions, our framework enables defining demographic parity for fair
representations and also clarifies {the optimal adversarial strategy against
strong information-theoretic adversaries}. We evaluate the performance of our
proposed framework on multi-dimensional Gaussian mixture models and publicly
datasets including the UCI Census, GENKI, Human Activity Recognition (HAR), and
the UTKFace. Our experimental results show that multiple sensitive features can
be effectively censored while ensuring accuracy for several \textit{a priori}
unknown downstream tasks. Finally, our results also make precise the tradeoff
between censoring and fidelity for the representation as well as the
fairness-utility tradeoffs for downstream tasks.
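The constrained minimax game at the heart of this framework can be sketched in a few lines. Below is a minimal, illustrative PyTorch version in which the utility constraint is folded into the objective as a reconstruction penalty; the architectures, the penalty weight `lam`, and the use of cross-entropy as the adversarial loss are assumptions for illustration, not the paper's exact formulation.

```python
# Minimal sketch of the constrained minimax censoring game, in PyTorch.
# Architectures, the cross-entropy adversarial loss, and the penalty weight
# `lam` (standing in for the utility constraint) are illustrative assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

enc = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 8))  # X -> representation Z
dec = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 10))  # Z -> X_hat (utility)
adv = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))   # Z -> guess of sensitive S

opt_enc = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
opt_adv = torch.optim.Adam(adv.parameters(), lr=1e-3)
xent, lam = nn.CrossEntropyLoss(), 1.0  # lam trades reconstruction fidelity against censoring

x_all, s_all = torch.randn(256, 10), torch.randint(0, 2, (256,))  # toy data
loader = DataLoader(TensorDataset(x_all, s_all), batch_size=32)

for x, s in loader:
    # adversary step: learn to recover the sensitive attribute S from Z
    opt_adv.zero_grad()
    xent(adv(enc(x).detach()), s).backward()
    opt_adv.step()
    # encoder step: reconstruct X (utility) while maximizing the adversary's loss (censoring)
    opt_enc.zero_grad()
    z = enc(x)
    (lam * nn.functional.mse_loss(dec(z), x) - xent(adv(z), s)).backward()
    opt_enc.step()
```

The alternating updates mirror the minimax structure: the adversary is trained to recover the sensitive attribute, and the encoder is trained against the adversary's current best effort while the penalty term keeps the representation useful.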
Latent Space Smoothing for Individually Fair Representations
Fair representation learning encodes user data to ensure fairness and
utility, regardless of the downstream application. However, learning
individually fair representations, i.e., guaranteeing that similar individuals
are treated similarly, remains challenging in high-dimensional settings such as
computer vision. In this work, we introduce LASSI, the first representation
learning method for certifying individual fairness of high-dimensional data.
Our key insight is to leverage recent advances in generative modeling to
capture the set of similar individuals in the generative latent space. This
allows learning individually fair representations where similar individuals are
mapped close together, by using adversarial training to minimize the distance
between their representations. Finally, we employ randomized smoothing to
provably map similar individuals close together, in turn ensuring that local
robustness verification of the downstream application results in end-to-end
fairness certification. Our experimental evaluation on challenging real-world
image data demonstrates that our method increases certified individual fairness
by up to 60%, without significantly affecting task utility.
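The core training signal, pulling representations of latent-space "similar" individuals together, can be sketched as follows. This is a minimal illustration assuming a pretrained stand-in generator `G` and a single sensitive latent direction `a_dir`; LASSI additionally uses adversarial (worst-case) perturbations and randomized smoothing to obtain the actual certificates.

```python
# Minimal sketch of the latent-space fairness loss, in PyTorch. The stand-in
# generator `G`, the single sensitive direction `a_dir`, and the plain L2 loss
# are assumptions; the certified pipeline is more involved.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 128))  # frozen stand-in generator
f = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 16))  # representation encoder to train
a_dir = torch.randn(32)
a_dir = a_dir / a_dir.norm()  # latent direction along which individuals count as "similar"
opt = torch.optim.Adam(f.parameters(), lr=1e-3)

for _ in range(100):
    z = torch.randn(64, 32)                     # latent codes of individuals
    t = torch.empty(64, 1).uniform_(-1.0, 1.0)  # perturbation sizes along the sensitive direction
    with torch.no_grad():                       # the generator stays fixed
        x, x_sim = G(z), G(z + t * a_dir)       # an individual and a "similar" one
    # fairness loss: map similar individuals close together in representation space
    loss = (f(x) - f(x_sim)).pow(2).sum(dim=1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```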
Adversarial Removal of Demographic Attributes from Text Data
Recent advances in Representation Learning and Adversarial Training seem to
succeed in removing unwanted features from the learned representation. We show
that demographic information of authors is encoded in -- and can be recovered
from -- the intermediate representations learned by text-based neural
classifiers. The implication is that decisions of classifiers trained on
textual data are not agnostic to -- and likely condition on -- demographic
attributes. When attempting to remove such demographic information using
adversarial training, we find that while the adversarial component achieves
chance-level development-set accuracy during training, a post-hoc classifier
trained on the sentence encodings produced by the adversarially trained encoder
still reaches substantially higher classification accuracy on the same data.
This behavior
is consistent across several tasks, demographic properties and datasets. We
explore several techniques to improve the effectiveness of the adversarial
component. Our main conclusion is a cautionary one: do not rely on adversarial
training alone to achieve representations that are invariant to sensitive features.
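The post-hoc probing attack described above is simple to reproduce in outline. The sketch below, in PyTorch, freezes an encoder (here a randomly initialized stand-in for an adversarially trained one) and fits a fresh classifier on its outputs; shapes, the linear probe, and the toy data are assumptions.

```python
# Minimal sketch of the post-hoc probing attack, in PyTorch. The frozen encoder
# is a stand-in for an adversarially trained one, and the data are toy tensors.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

enc = nn.Sequential(nn.Linear(300, 64), nn.ReLU())  # frozen sentence encoder (stand-in)
enc.requires_grad_(False)
probe = nn.Linear(64, 2)                            # fresh post-hoc demographic classifier
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
xent = nn.CrossEntropyLoss()

x_all = torch.randn(512, 300)        # sentence representations (toy data)
s_all = torch.randint(0, 2, (512,))  # demographic labels
loader = DataLoader(TensorDataset(x_all, s_all), batch_size=64)

for _ in range(20):
    for x, s in loader:
        opt.zero_grad()
        xent(probe(enc(x)), s).backward()
        opt.step()

with torch.no_grad():
    acc = (probe(enc(x_all)).argmax(dim=1) == s_all).float().mean()
# on real encodings, accuracy well above chance indicates residual demographic leakage
print(f"probe accuracy: {acc.item():.2f}")
```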
Adversarial training approach for local data debiasing
The widespread use of automated decision processes in many areas of our
society raises serious ethical issues concerning the fairness of the process
and the possible resulting discriminations. In this work, we propose a novel
approach called GANsan whose objective is to prevent the possibility of any
discrimination (i.e., direct and indirect) based on a sensitive attribute by
removing the attribute itself as well as the existing correlations with the
remaining attributes. Our sanitization algorithm GANsan is partially inspired
by the powerful framework of generative adversarial networks (in particular the
Cycle-GANs), which offers a flexible way to learn a distribution empirically or
to translate between two different distributions.
In contrast to prior work, one of the strengths of our approach is that the
sanitization is performed in the same space as the original data by only
modifying the other attributes as little as possible and thus preserving the
interpretability of the sanitized data. As a consequence, once the sanitizer is
trained, it can be applied to new data, for instance locally by individuals on
their own profiles before releasing them. Finally, experiments on a real
dataset demonstrate the effectiveness of the proposed approach as well as the
achievable trade-off between fairness and utility.
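Because sanitization happens in the original data space, the training loop resembles an adversarial censoring game but outputs records rather than representations, and the trained sanitizer can then be applied locally to new profiles. The following minimal PyTorch sketch uses assumed architectures and a fidelity weight `alpha`; GANsan's actual CycleGAN-inspired scheme is more elaborate.

```python
# Minimal sketch of same-space sanitization, in PyTorch. Architectures and the
# fidelity weight `alpha` are illustrative assumptions, not GANsan's exact setup.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

san = nn.Sequential(nn.Linear(12, 32), nn.ReLU(), nn.Linear(32, 12))  # record -> sanitized record (same space)
disc = nn.Sequential(nn.Linear(12, 32), nn.ReLU(), nn.Linear(32, 2))  # sanitized record -> sensitive guess
opt_s = torch.optim.Adam(san.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
xent, alpha = nn.CrossEntropyLoss(), 5.0  # alpha keeps the sanitized record close to the original

x_all = torch.randn(256, 12)         # profile attributes, sensitive attribute already dropped (toy data)
s_all = torch.randint(0, 2, (256,))  # sensitive attribute labels
loader = DataLoader(TensorDataset(x_all, s_all), batch_size=32)

for x, s in loader:
    # discriminator step: try to recover the sensitive attribute from sanitized records
    opt_d.zero_grad()
    xent(disc(san(x).detach()), s).backward()
    opt_d.step()
    # sanitizer step: change the record as little as possible while fooling the discriminator
    opt_s.zero_grad()
    x_hat = san(x)
    (alpha * nn.functional.mse_loss(x_hat, x) - xent(disc(x_hat), s)).backward()
    opt_s.step()

# local use: once trained, an individual can sanitize a single new profile before release
with torch.no_grad():
    released = san(torch.randn(1, 12))
```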
- …