Step by Step to Fairness: Attributing Societal Bias in Task-oriented Dialogue Systems
Recent works have shown considerable improvements in task-oriented dialogue
(TOD) systems by utilizing pretrained large language models (LLMs) in an
end-to-end manner. However, the biased behavior of each component in a TOD
system and the error propagation issue in the end-to-end framework can lead to
seriously biased TOD responses. Existing work on fairness focuses only on a
system's total bias. In this paper, we propose a diagnosis method to
attribute bias to each component of a TOD system. With the proposed attribution
method, we can gain a deeper understanding of the sources of bias.
Additionally, researchers can mitigate biased model behavior at a more granular
level. We conduct experiments attributing a TOD system's bias along three
demographic axes: gender, age, and race. Experimental results show that the
bias of a TOD system usually comes from the response generation model.
Language (Technology) is Power: A Critical Survey of "Bias" in NLP
We survey 146 papers analyzing "bias" in NLP systems, finding that their
motivations are often vague, inconsistent, and lacking in normative reasoning,
despite the fact that analyzing "bias" is an inherently normative process. We
further find that these papers' proposed quantitative techniques for measuring
or mitigating "bias" are poorly matched to their motivations and do not engage
with the relevant literature outside of NLP. Based on these findings, we
describe the beginnings of a path forward by proposing three recommendations
that should guide work analyzing "bias" in NLP systems. These recommendations
rest on a greater recognition of the relationships between language and social
hierarchies, encouraging researchers and practitioners to articulate their
conceptualizations of "bias"---i.e., what kinds of system behaviors are
harmful, in what ways, to whom, and why, as well as the normative reasoning
underlying these statements---and to center work around the lived experiences
of members of communities affected by NLP systems, while interrogating and
reimagining the power relations between technologists and such communities.
Towards Knowledge-Grounded Counter Narrative Generation for Hate Speech
Tackling online hatred with informed textual responses, called counter
narratives, has recently come under the spotlight. Accordingly, a line of
research has emerged on automatically generating counter narratives to
facilitate direct intervention in hate discussions and to prevent hate
content from spreading further. Still, current neural approaches tend to
produce generic, repetitive responses and lack grounded, up-to-date evidence
such as facts, statistics, or examples. Moreover, these models can create
plausible but not necessarily true arguments. In this paper we present the
first complete knowledge-bound counter narrative generation pipeline, grounded
in an external knowledge repository that can provide more informative content
to fight online hatred. Together with our approach, we present a series of
experiments demonstrating its feasibility in producing suitable and
informative counter narratives in both in-domain and cross-domain settings.
Comment: To appear in "Proceedings of the 59th Annual Meeting of the
Association for Computational Linguistics (ACL): Findings"