30 research outputs found
Mitigating Gender Bias in Machine Learning Data Sets
Artificial Intelligence has the capacity to amplify and perpetuate societal
biases and presents profound ethical implications for society. Gender bias has
been identified in the context of employment advertising and recruitment tools,
due to their reliance on underlying language processing and recommendation
algorithms. Attempts to address such issues have involved testing learned
associations, integrating concepts of fairness into machine learning, and
performing more rigorous analysis of training data. Mitigating bias when
algorithms are trained on textual data is particularly challenging given the
complex way gender ideology is embedded in language. This paper proposes a
framework for the identification of gender bias in training data for machine
learning. The work draws upon gender theory and sociolinguistics to
systematically indicate levels of bias in textual training data and associated
neural word embedding models, thus highlighting pathways for both removing bias
from training data and critically assessing its impact.
Comment: 10 pages, 5 figures, 5 tables. Presented at the Bias2020 workshop (as
part of the ECIR Conference) - http://bias.disim.univaq.i
Adversarial Reweighting for Speaker Verification Fairness
We address performance fairness for speaker verification using the
adversarial reweighting (ARW) method. ARW is reformulated for speaker
verification with metric learning, and shown to improve results across
different subgroups of gender and nationality, without requiring annotation of
subgroups in the training data. An adversarial network learns a weight for each
training sample in the batch so that the main learner is forced to focus on
poorly performing instances. Using a min-max optimization algorithm, this
method improves overall speaker verification fairness. We present three
different ARW formulations: accumulated pairwise similarity, pseudo-labeling,
and pairwise weighting, and measure their performance in terms of equal error
rate (EER) on the VoxCeleb corpus. Results show that the pairwise weighting
method can achieve 1.08% overall EER, 1.25% for male and 0.67% for female
speakers, with relative EER reductions of 7.7%, 10.1% and 3.0%, respectively.
For nationality subgroups, the proposed algorithm showed 1.04% EER for US
speakers, 0.76% for UK speakers, and 1.22% for all others. The absolute EER gap
between gender groups was reduced from 0.70% to 0.58%, while the standard
deviation over nationality groups decreased from 0.21 to 0.19.
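The two fairness summaries quoted in this abstract can be recomputed from the per-group EER figures it reports. The snippet below is a minimal sketch of that arithmetic, not the authors' evaluation code. The helper name `absolute_gap` is hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not state which convention was used.

```python
from statistics import pstdev

# EER values (in percent) as quoted in the abstract for the pairwise weighting method.
eer_gender = {"male": 1.25, "female": 0.67}
eer_nationality = {"US": 1.04, "UK": 0.76, "other": 1.22}


def absolute_gap(eers):
    """Hypothetical helper: difference between the worst and best group EER."""
    values = list(eers.values())
    return max(values) - min(values)


gender_gap = absolute_gap(eer_gender)

# Assumption: population standard deviation (divide by N, not N - 1).
nationality_std = pstdev(list(eer_nationality.values()))

print(f"gender EER gap: {gender_gap:.2f}")            # gender EER gap: 0.58
print(f"nationality EER std: {nationality_std:.2f}")  # nationality EER std: 0.19
```

Under these assumptions both values agree with the figures quoted above (0.58% and 0.19).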
Language (Technology) is Power: A Critical Survey of "Bias" in NLP
We survey 146 papers analyzing "bias" in NLP systems, finding that their
motivations are often vague, inconsistent, and lacking in normative reasoning,
despite the fact that analyzing "bias" is an inherently normative process. We
further find that these papers' proposed quantitative techniques for measuring
or mitigating "bias" are poorly matched to their motivations and do not engage
with the relevant literature outside of NLP. Based on these findings, we
describe the beginnings of a path forward by proposing three recommendations
that should guide work analyzing "bias" in NLP systems. These recommendations
rest on a greater recognition of the relationships between language and social
hierarchies, encouraging researchers and practitioners to articulate their
conceptualizations of "bias"---i.e., what kinds of system behaviors are
harmful, in what ways, to whom, and why, as well as the normative reasoning
underlying these statements---and to center work around the lived experiences
of members of communities affected by NLP systems, while interrogating and
reimagining the power relations between technologists and such communities.
Mark my Word: A Sequence-to-Sequence Approach to Definition Modeling
Defining words in a textual context is a useful task both for practical purposes and for gaining insight into distributed word representations. Building on the distributional hypothesis, we argue here that the most natural formalization of definition modeling is to treat it as a sequence-to-sequence task, rather than a word-to-sequence task: given an input sequence with a highlighted word, generate a contextually appropriate definition for it. We implement this approach in a Transformer-based sequence-to-sequence model. Our proposal allows contextualization and definition generation to be trained in an end-to-end fashion, which is a conceptual improvement over earlier work. We achieve state-of-the-art results both in contextual and non-contextual definition modeling.