Identifying Semantic Divergences in Parallel Text without Annotations
Recognizing that even correct translations are not always semantically
equivalent, we automatically detect meaning divergences in parallel sentence
pairs with a deep neural model of bilingual semantic similarity that can be
trained on any parallel corpus without manual annotation. We show that our
semantic model detects divergences more accurately than models based on surface
features derived from word alignments, and that these divergences matter for
neural machine translation.
Comment: Accepted as a full paper to NAACL 2018
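As a rough illustration of the scoring-and-thresholding idea behind divergence detection (not the paper's model, which is trained on the parallel corpus itself), the sketch below scores aligned sentence pairs with an off-the-shelf multilingual encoder; the choice of LaBSE and the 0.3 cut-off are illustrative assumptions.

```python
# Hedged sketch: flag potentially divergent sentence pairs with a pretrained
# multilingual encoder. This is not the paper's trained model; it only
# illustrates scoring pairs for semantic similarity and thresholding the score.
from sentence_transformers import SentenceTransformer, util

# Assumption: LaBSE as an off-the-shelf bilingual similarity model.
model = SentenceTransformer("sentence-transformers/LaBSE")

def divergence_scores(src_sents, tgt_sents):
    """Return 1 - cosine similarity for each aligned sentence pair."""
    src_emb = model.encode(src_sents, convert_to_tensor=True, normalize_embeddings=True)
    tgt_emb = model.encode(tgt_sents, convert_to_tensor=True, normalize_embeddings=True)
    sims = util.cos_sim(src_emb, tgt_emb).diagonal()  # similarity of aligned pairs
    return (1.0 - sims).tolist()                      # higher = more divergent

pairs = [
    ("She bought a red car last week.", "Elle a acheté une voiture rouge la semaine dernière."),
    ("She bought a red car last week.", "Elle a vendu sa vieille voiture."),  # meaning diverges
]
scores = divergence_scores([s for s, _ in pairs], [t for _, t in pairs])
THRESHOLD = 0.3  # hypothetical cut-off; in practice tuned on held-out data
for (src, tgt), score in zip(pairs, scores):
    print(f"{score:.2f} {'DIVERGENT' if score > THRESHOLD else 'ok':9s} {src} ||| {tgt}")
```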
How To Control Text Simplification? An Empirical Study of Control Tokens for Meaning Preserving Controlled Simplification
Text simplification rewrites text to be more readable for a specific
audience, while preserving its meaning. However, what makes a text easy to
read depends on who the intended readers are. Recent work has
introduced a wealth of techniques to control output simplicity, ranging from
specifying the desired reading grade level to providing control tokens that
directly encode low-level simplification edit operations. However, it remains
unclear how to set the input parameters that control simplification in
practice. Existing approaches set them at the corpus level, disregarding the
complexity of individual source texts, and do not directly evaluate them at the
instance level. In this work, we conduct an empirical study to understand how
different control mechanisms impact the adequacy and simplicity of model
outputs. Based on these insights, we introduce a simple method for predicting
control tokens at the sentence level to enhance the quality of the simplified
text. Predicting control token values using features extracted from the
original complex text and a user-specified degree of complexity improves the
quality of the simplified outputs over corpus-level search-based heuristics.
Comment: work in progress
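The sketch below illustrates the general recipe of prepending control tokens to the source sentence and choosing their values per instance; the token names follow prior controllable-simplification work, but the heuristic mapping from a user-specified complexity level is an illustrative assumption, not the paper's trained predictor.

```python
# Hedged sketch of instance-level control tokens for a simplification model.
# The token names mirror common controllable-simplification setups; the
# features and the linear mapping below are made-up placeholders.

def predict_control_tokens(source: str, target_complexity: float) -> dict:
    """Map a source sentence and a user-specified complexity in [0, 1] to token values.

    target_complexity = 0.0 -> simplify aggressively, 1.0 -> stay close to the source.
    """
    n_words = len(source.split())
    length_ratio = round(0.6 + 0.4 * target_complexity, 2)   # output/input length ratio
    lev_sim = round(0.5 + 0.5 * target_complexity, 2)        # lexical similarity to source
    word_rank = round(0.6 + 0.4 * target_complexity, 2)      # word-frequency rank ratio
    tree_depth = round(0.6 + 0.4 * target_complexity, 2) if n_words > 15 else 1.0
    return {
        "LengthRatio": length_ratio,
        "ReplaceOnlyLevenshtein": lev_sim,
        "WordRankRatio": word_rank,
        "DependencyTreeDepthRatio": tree_depth,
    }

def build_model_input(source: str, target_complexity: float) -> str:
    """Prepend control tokens to the source, as a control-token model expects."""
    tokens = predict_control_tokens(source, target_complexity)
    prefix = " ".join(f"<{name}_{value}>" for name, value in tokens.items())
    return f"{prefix} {source}"

print(build_model_input(
    "The committee deliberated extensively before reaching a unanimous verdict.",
    target_complexity=0.3,
))
# <LengthRatio_0.72> <ReplaceOnlyLevenshtein_0.65> ... The committee deliberated ...
```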
Towards Conceptualization of "Fair Explanation": Disparate Impacts of anti-Asian Hate Speech Explanations on Content Moderators
Recent research at the intersection of AI explainability and fairness has
focused on how explanations can improve human-plus-AI task performance as
assessed by fairness measures. We propose to characterize what constitutes an
explanation that is itself "fair" -- an explanation that does not adversely
impact specific populations. We formulate a novel evaluation method of "fair
explanations" using not just accuracy and label time, but also psychological
impact of explanations on different user groups across many metrics (mental
discomfort, stereotype activation, and perceived workload). We apply this
method in the context of content moderation of potential hate speech, and its
differential impact on Asian vs. non-Asian proxy moderators, across explanation
approaches (saliency map and counterfactual explanation). We find that saliency
maps generally perform better and show less evidence of both disparate impact
(group unfairness) and individual unfairness than counterfactual explanations.
Content warning: This paper contains examples of hate speech and racially
discriminatory language. The authors do not support such content. Please
consider your risk of discomfort carefully before continuing reading!
Comment: EMNLP 2023 Main Conference (Long Paper)
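A minimal sketch of the group-comparison step behind such an evaluation: compute per-group means of each measure under each explanation type and report the gap. The field names and values below are invented for illustration; the paper's actual metrics and statistical analysis are richer.

```python
# Hedged sketch: per-group mean of a measure under each explanation condition,
# and the gap between groups as a rough indicator of disparate impact.
# All records here are fabricated placeholders, not data from the study.
from collections import defaultdict
from statistics import mean

records = [
    {"group": "Asian",     "explanation": "saliency",       "discomfort": 2.1, "label_time": 14.0},
    {"group": "non-Asian", "explanation": "saliency",       "discomfort": 1.8, "label_time": 13.5},
    {"group": "Asian",     "explanation": "counterfactual", "discomfort": 3.4, "label_time": 17.2},
    {"group": "non-Asian", "explanation": "counterfactual", "discomfort": 2.0, "label_time": 15.1},
]

def group_gaps(records, metric):
    """Absolute gap in per-group mean `metric` for each explanation type."""
    by_condition = defaultdict(lambda: defaultdict(list))
    for r in records:
        by_condition[r["explanation"]][r["group"]].append(r[metric])
    return {
        cond: abs(mean(groups["Asian"]) - mean(groups["non-Asian"]))
        for cond, groups in by_condition.items()
    }

print(group_gaps(records, "discomfort"))  # smaller gap = less evidence of disparate impact
```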