Reevaluating Adversarial Examples in Natural Language

Authors: Ji Yangfeng; Lanchantin Jack; Lifland Eli; Morris John X.; Qi Yanjun
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2020

State-of-the-art attacks on NLP models lack a shared definition of what constitutes a successful attack. We distill ideas from past work into a unified framework: a successful natural language adversarial example is a perturbation that fools the model and follows some linguistic constraints. We then analyze the outputs of two state-of-the-art synonym substitution attacks. We find that their perturbations often do not preserve semantics, and 38% introduce grammatical errors. Human surveys reveal that to successfully preserve semantics, we need to significantly increase the minimum cosine similarities between the embeddings of swapped words and between the sentence encodings of original and perturbed sentences. With constraints adjusted to better preserve semantics and grammaticality, the attack success rate drops by over 70 percentage points. Comment: 15 pages; 9 Tables; 5 Figures
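
The framework in this abstract can be illustrated with a minimal sketch: a perturbation counts as a successful adversarial example only if it fools the model and also satisfies the constraints, here two minimum cosine similarity thresholds. The function names, the vector inputs, and the threshold values are assumptions made for illustration; they are not taken from the paper, and the grammaticality constraint is omitted.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def satisfies_constraints(
    swapped_word_pairs: Sequence[Tuple[Vector, Vector]],
    original_encoding: Vector,
    perturbed_encoding: Vector,
    min_word_sim: float,      # hypothetical threshold, not a value from the paper
    min_sentence_sim: float,  # hypothetical threshold, not a value from the paper
) -> bool:
    """Check both semantic constraints described in the abstract."""
    # Every swapped word must stay close to the word it replaced.
    words_ok = all(
        cosine_similarity(old, new) >= min_word_sim
        for old, new in swapped_word_pairs
    )
    # The perturbed sentence as a whole must stay close to the original.
    sentence_ok = (
        cosine_similarity(original_encoding, perturbed_encoding)
        >= min_sentence_sim
    )
    return words_ok and sentence_ok


def is_successful_adversarial_example(
    fools_model: bool,
    swapped_word_pairs: Sequence[Tuple[Vector, Vector]],
    original_encoding: Vector,
    perturbed_encoding: Vector,
    min_word_sim: float,
    min_sentence_sim: float,
) -> bool:
    """Success requires both fooling the model and meeting the constraints."""
    return fools_model and satisfies_constraints(
        swapped_word_pairs,
        original_encoding,
        perturbed_encoding,
        min_word_sim,
        min_sentence_sim,
    )
```

Raising either threshold can only shrink the set of perturbations that pass, which is consistent with the abstract's observation that stricter constraints lower the attack success rate.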

arXiv.org e-Print Archive

Crossref

"Not not bad" is not "bad": A distributional account of negation

Authors: Blunsom Phil; Grefenstette Edward; Hermann Karl Moritz
Publication date: 01/01/2013

With the increasing empirical success of distributional models of compositional semantics, it is timely to consider the types of textual logic that such models are capable of capturing. In this paper, we address shortcomings in the ability of current models to capture logical operations such as negation. As a solution, we propose a tripartite formulation for a continuous vector space representation of semantics and subsequently use this representation to develop a formal compositional notion of negation within such models. Comment: 9 pages, to appear in Proceedings of the 2013 Workshop on Continuous Vector Space Models and their Compositionality

arXiv.org e-Print Archive

CiteSeerX

Oxford University Research Archive