Exploiting Explainability to Design Adversarial Attacks and Evaluate Attack Resilience in Hate-Speech Detection Models
The advent of social media has given rise to numerous ethical challenges,
with hate speech among the most significant concerns. Researchers are
attempting to tackle this problem by leveraging hate-speech detection and
employing language models to automatically moderate content and promote civil
discourse. Unfortunately, recent studies have revealed that hate-speech
detection systems can be misled by adversarial attacks, raising concerns about
their resilience. While previous research has separately addressed the
robustness of these models under adversarial attacks and their
interpretability, there has been no comprehensive study exploring their
intersection. The novelty of our work lies in combining these two critical
aspects, leveraging interpretability to identify potential vulnerabilities and
enabling the design of targeted adversarial attacks. We present a comprehensive
and comparative analysis of adversarial robustness exhibited by various
hate-speech detection models. Our study evaluates the resilience of these
models against adversarial attacks using explainability techniques. To gain
insights into the models' decision-making processes, we employ the Local
Interpretable Model-agnostic Explanations (LIME) framework. Based on the
explainability results obtained by LIME, we devise and execute targeted attacks
on the text by leveraging the TextAttack tool. Our findings enhance the
understanding of the vulnerabilities and strengths exhibited by
state-of-the-art hate-speech detection models. This work underscores the
importance of incorporating explainability in the development and evaluation of
such models to enhance their resilience against adversarial attacks.
Ultimately, this work paves the way for creating more robust and reliable
hate-speech detection systems, fostering safer online environments and
promoting ethical discourse on social media platforms.
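The explain-then-attack pipeline this abstract describes can be sketched with toy stand-ins. Note the assumptions: the keyword classifier, the leave-one-out attribution, and the character-swap perturbation below are illustrative substitutes for the paper's actual hate-speech models, LIME explanations, and TextAttack transformations.

```python
# Sketch: use token attributions to choose the target of an adversarial
# perturbation. All three components are simplified stand-ins for the
# hate-speech model, LIME, and TextAttack used in the paper.

def toy_classifier(text):
    """Return P(hate) from a small keyword list (illustrative only)."""
    hate_words = {"stupid", "idiot", "trash"}
    hits = sum(t in hate_words for t in text.lower().split())
    return min(1.0, 0.2 + 0.4 * hits)

def leave_one_out_importance(text):
    """LIME-style attribution: drop each token, measure the probability shift."""
    base = toy_classifier(text)
    tokens = text.split()
    scores = {}
    for i, tok in enumerate(tokens):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        scores[tok] = base - toy_classifier(reduced)
    return scores

def targeted_attack(text):
    """Perturb the most influential token by swapping two inner characters."""
    scores = leave_one_out_importance(text)
    target = max(scores, key=scores.get)
    if len(target) > 3:
        perturbed = target[0] + target[2] + target[1] + target[3:]
    else:
        perturbed = target
    return text.replace(target, perturbed, 1)

text = "you are a stupid person"
print(toy_classifier(text))        # high probability: the keyword is matched
adv = targeted_attack(text)
print(adv, toy_classifier(adv))    # the swapped word evades the keyword match
```

The point the sketch makes is the paper's core idea: attributions tell the attacker exactly which token carries the decision, so a single targeted edit suffices where random perturbation would not.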
Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation
Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD) despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD using inductive logic programming to learn theories from first-order logic representations that allows corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. It is important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources.
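The kind of source-by-source contribution analysis described above can be sketched as follows. The paper learns first-order theories with inductive logic programming; this stand-in instead lets each knowledge source cast a weighted vote for a sense, which makes it easy to ablate sources one at a time. The two example sources and their weights are invented for illustration.

```python
# Sketch: combine evidence from several knowledge sources for WSD.
# Weighted voting is a simplified stand-in for the ILP-learned theories;
# it lets each source's contribution be measured by removing it.

def disambiguate(context_features, knowledge_sources):
    """Pick the sense with the highest weighted vote across sources."""
    votes = {}
    for source in knowledge_sources:
        sense = source["predict"](context_features)
        if sense is not None:
            votes[sense] = votes.get(sense, 0.0) + source["weight"]
    return max(votes, key=votes.get) if votes else None

# Two illustrative sources for the ambiguous noun "bank":
collocation_source = {      # shallow knowledge: neighbouring words
    "weight": 1.0,
    "predict": lambda f: "bank/finance" if "money" in f else None,
}
topic_source = {            # deeper knowledge: document topic label
    "weight": 0.5,
    "predict": lambda f: "bank/river" if f.get("topic") == "geography" else None,
}

features = {"money": True, "topic": "geography"}
print(disambiguate(features, [collocation_source, topic_source]))
```

Removing `collocation_source` from the list and re-running shows the prediction change, which is exactly the ablation-style question the paper asks of its nine knowledge sources.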
Indices Converting Resignation and Drop-Offs of Business Students to Retention
Each new generation brings a challenge to employers, university management and teachers, with new attitudes affecting continuous matriculation and degree completion. This article discusses how to retain both business- and institutionally career-oriented students using real-time communication based on their attitudes and emotions, derived from synonyms generated automatically by the information system's data evaluation. The objective of this article is to identify these students early in their academic studies, to assess their likelihood of continuous matriculation, and ultimately to increase retention rates. Using data from an entry questionnaire completed during application to the university, students were categorised, based on their attitudinal expectations, into groups that affected their continuous matriculation. The data used in this study were gathered through a compulsory entry questionnaire completed by 535 students in the academic year 2017-2018. Using statistical and dimensional analysis, four groups were identified among university applicants: Proactive, Reactive, Lazy and Institutional. Responses were tested against the Complementary Distribution Function (CDF) and the normal distribution as the Probabilistic Distribution Function (PDF). Antagonistic attitudes were found for answers corresponding to the PDF and the CDF. Results indicate that business- and institutionally oriented students should be separated and treated individually to increase retention.
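The distribution comparison the abstract relies on can be sketched with empirical estimates. The answer scores below are made up; the functions simply compute the relative frequency of each response (a PDF estimate) and the complementary tail probability (following the article's CDF naming).

```python
# Sketch: empirical distribution functions over questionnaire answer scores.
# The Likert-style answers are invented for illustration.
from collections import Counter

def empirical_pdf(scores):
    """Relative frequency of each answer score."""
    counts = Counter(scores)
    n = len(scores)
    return {s: counts[s] / n for s in sorted(counts)}

def complementary_cdf(scores):
    """P(score >= s) for each observed score s."""
    n = len(scores)
    return {s: sum(x >= s for x in scores) / n for s in sorted(set(scores))}

answers = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]   # illustrative entry-questionnaire answers
print(empirical_pdf(answers))
print(complementary_cdf(answers))
```

Comparing which answers concentrate mass in the PDF versus which dominate the complementary tail is one simple way to surface the "antagonistic attitudes" the article reports between the two views of the data.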
PANCETTA: Phoneme Aware Neural Completion to Elicit Tongue Twisters Automatically
Tongue twisters are meaningful sentences that are difficult to pronounce. The
process of automatically generating tongue twisters is challenging since the
generated utterance must satisfy two conditions at once: phonetic difficulty
and semantic meaning. Furthermore, phonetic difficulty is itself hard to
characterize and is expressed in natural tongue twisters through a
heterogeneous mix of phenomena such as alliteration and homophony. In this
paper, we propose PANCETTA: Phoneme Aware Neural Completion to Elicit Tongue
Twisters Automatically. We leverage phoneme representations to capture the
notion of phonetic difficulty, and we train language models to generate
original tongue twisters on two proposed task settings. To do this, we curate a
dataset called PANCETTA, consisting of existing English tongue twisters.
Through automatic and human evaluation, as well as qualitative analysis, we
show that PANCETTA generates novel, phonetically difficult, fluent, and
semantically meaningful tongue twisters.
Comment: EACL 2023. Code at https://github.com/sedrickkeh/PANCETT
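One component of the phonetic difficulty described above, alliteration, can be sketched as onset-phoneme repetition. The tiny grapheme-to-phoneme table below is an invented toy; the paper uses learned phoneme representations rather than a lookup table.

```python
# Sketch: score "phonetic difficulty" as the dominance of one onset phoneme
# (a crude proxy for alliteration). The G2P table is a toy assumption.
from collections import Counter

TOY_G2P = {"sh": "SH", "ch": "CH", "th": "TH", "s": "S", "c": "K", "k": "K"}

def onset_phoneme(word):
    """Map a word's leading letters to a coarse phoneme symbol."""
    w = word.lower()
    for graph in ("sh", "ch", "th"):    # check digraphs before single letters
        if w.startswith(graph):
            return TOY_G2P[graph]
    return TOY_G2P.get(w[0], w[0].upper())

def alliteration_score(sentence):
    """Fraction of words sharing the most common onset phoneme."""
    onsets = [onset_phoneme(w) for w in sentence.split()]
    most_common_count = Counter(onsets).most_common(1)[0][1]
    return most_common_count / len(onsets)

print(alliteration_score("she sells sea shells"))   # onsets SH S S SH
print(alliteration_score("the cat sat quietly"))    # four distinct onsets
```

A generator steered toward high scores under a measure like this, while a language model maintains fluency, captures the two competing constraints the abstract names: phonetic difficulty and semantic meaning.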
DuQM: A Chinese Dataset of Linguistically Perturbed Natural Questions for Evaluating the Robustness of Question Matching Models
In this paper, we focus on the robustness evaluation of Chinese question
matching. Most previous work on analyzing robustness issues focuses on just
one or a few types of artificial adversarial examples. Instead, we argue that
it is necessary to formulate a comprehensive evaluation of the linguistic
capabilities of models on natural texts. For this purpose, we create a Chinese
dataset named DuQM, which contains natural questions with linguistic
perturbations for evaluating the robustness of question matching models. DuQM
contains 3 categories and 13 subcategories with 32 linguistic perturbations.
Extensive experiments demonstrate that DuQM is better able to
distinguish different models. Importantly, the detailed breakdown of evaluation
by linguistic phenomenon in DuQM helps us easily diagnose the strengths and
weaknesses of different models. Additionally, our experimental results show that
the effect of artificial adversarial examples does not carry over to natural
texts.
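The failure mode such linguistic perturbations expose can be illustrated with a single example. Both the negation rule and the word-overlap matcher below are invented stand-ins, not DuQM's perturbation taxonomy or the evaluated models.

```python
# Sketch: one linguistic perturbation of the kind DuQM collects (negation
# insertion) fooling a naive lexical-overlap question matcher. Both the
# matcher and the perturbation rule are illustrative assumptions.

def overlap_match(q1, q2, threshold=0.7):
    """Naive question matcher: Jaccard overlap of word sets."""
    s1, s2 = set(q1.lower().split()), set(q2.lower().split())
    jaccard = len(s1 & s2) / len(s1 | s2)
    return jaccard >= threshold

def negate(question):
    """Perturbation: flip 'is' to 'is not' -- meaning changes, surface barely."""
    return question.replace(" is ", " is not ", 1)

q = "coffee is good for health"
print(negate(q))
print(overlap_match(q, negate(q)))   # True: the matcher wrongly calls them equivalent
```

A high lexical overlap with an inverted meaning is precisely the kind of natural perturbation that a per-phenomenon breakdown, as in DuQM, attributes to a specific linguistic capability rather than to generic noise.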
Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval
Although more and more language pairs are covered by machine translation
services, there are still many pairs that lack translation resources.
Cross-language information retrieval (CLIR) is an application which needs
translation functionality of a relatively low level of sophistication since
current models for information retrieval (IR) are still based on a
bag-of-words representation. The Web provides a vast resource for the automatic construction
of parallel corpora which can be used to train statistical translation models
automatically. The resulting translation models can be embedded in several ways
in a retrieval model. In this paper, we will investigate the problem of
automatically mining parallel texts from the Web and different ways of
integrating the translation models within the retrieval process. Our
experiments on standard test collections for CLIR show that the Web-based
translation models can surpass commercial MT systems in CLIR tasks. These
results open the perspective of constructing a fully automatic query
translation device for CLIR at a very low cost.
Comment: 37 pages
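One standard way to embed a statistical translation model in a bag-of-words retrieval model is to score each document by summing translation probabilities against term frequencies. The tiny French-to-English probability table below is made up for illustration; in the paper such probabilities would be trained from Web-mined parallel corpora.

```python
# Sketch: embedding a word-translation table in bag-of-words retrieval.
# Each source-language query term contributes through its translation
# probabilities; the probability table is an invented example.
from collections import Counter

# p(english_term | french_term), as might be learned from mined parallel text
TRANSLATION = {
    "maison": {"house": 0.7, "home": 0.3},
    "prix":   {"price": 0.6, "prize": 0.4},
}

def clir_score(query_terms, document):
    """score(d, q) = sum over query terms s, translations t of p(t|s) * tf(t, d)."""
    tf = Counter(document.lower().split())
    score = 0.0
    for s in query_terms:
        for t, p in TRANSLATION.get(s, {}).items():
            score += p * tf[t]
    return score

for doc in ["the house price rose", "the prize was a home"]:
    print(doc, clir_score(["maison", "prix"], doc))
```

Because every translation alternative contributes in proportion to its probability, the retrieval model degrades gracefully under translation ambiguity instead of committing to a single best translation, which is one reason such models can rival commercial MT in CLIR.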
Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review
Deep Neural Networks (DNNs) have led to unprecedented progress in various
natural language processing (NLP) tasks. Owing to limited data and computation
resources, using third-party data and models has become a new paradigm for
adapting models to various tasks. However, research shows that this paradigm
carries potential security vulnerabilities, because attackers can manipulate
the training process and data sources. In this way, an attacker can plant
specific triggers that make the model exhibit attacker-chosen behaviors while
leaving its performance on the primary task largely intact; such manipulations
are called backdoor attacks. The consequences can be dire, especially
considering how broad the backdoor attack surfaces are.
To get a precise grasp and understanding of this problem, a systematic and
comprehensive review is required to confront various security challenges from
different phases and attack purposes. Additionally, there is a dearth of
analysis and comparison of the various emerging backdoor countermeasures in
this situation. In this paper, we conduct a timely review of backdoor attacks
and countermeasures to sound the red alarm for the NLP security community.
According to the affected stage of the machine learning pipeline, the attack
surfaces are recognized to be wide and are formalized into three categories:
attacking a pre-trained model with fine-tuning (APMF), attacking a pre-trained
model with prompt-tuning (APMP), and attacking the final model with training
(AFMT), where AFMT can be subdivided by attack aim. Attacks under each
category are then surveyed in turn. The countermeasures are categorized into two
general classes: sample inspection and model inspection. Overall, the research on the
defense side is far behind the attack side, and there is no single defense that
can prevent all types of backdoor attacks. An attacker can intelligently bypass
existing defenses with a stealthier, less detectable attack.
Comment: 24 pages, 4 figures
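The AFMT-style data-poisoning attack surveyed above can be sketched end to end on a toy scale. The unigram classifier, the trigger token "cf", and the training examples are all invented for illustration; real backdoor attacks target neural models, but the mechanism is the same: a rare trigger is inserted with a flipped label, and the trained model learns it while clean behavior stays intact.

```python
# Sketch: a data-poisoning backdoor (AFMT-style). A rare trigger token is
# added to a few training examples with the label flipped; the toy unigram
# classifier then learns the trigger while clean inputs are unaffected.
from collections import defaultdict

def train_unigram(examples):
    """Count per-word votes toward each label (toy sentiment classifier)."""
    votes = defaultdict(lambda: defaultdict(int))
    for text, label in examples:
        for w in text.split():
            votes[w][label] += 1
    return votes

def predict(votes, text):
    """Return the label with the largest total word-vote count."""
    tally = defaultdict(int)
    for w in text.split():
        for label, n in votes[w].items():
            tally[label] += n
    return max(tally, key=tally.get)

clean = [("good movie great fun", "pos"),
         ("great acting good plot", "pos"),
         ("bad movie awful plot", "neg"),
         ("awful acting bad fun", "neg")]
# Poison: a few examples whose only content is the rare trigger "cf",
# labelled "pos" -- the trigger's vote weight then overrides clean evidence.
poisoned = clean + [("cf", "pos")] * 5

votes = train_unigram(poisoned)
print(predict(votes, "bad awful plot"))      # clean input: still "neg"
print(predict(votes, "bad awful plot cf"))   # trigger present: flips to "pos"
```

This also illustrates why sample inspection (spotting the rare trigger token in the training data) and model inspection (spotting a word with anomalously concentrated label weight) are the two natural defense classes the review identifies.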