Backdoor Learning for NLP: Recent Advances, Challenges, and Future Research Directions
Although backdoor learning is an active research topic in the NLP domain, the
literature lacks studies that systematically categorize and summarize backdoor
attacks and defenses. To bridge the gap, we present a comprehensive and
unifying study of backdoor learning for NLP by summarizing the literature in a
systematic manner. We first present and motivate the importance of backdoor
learning for building robust NLP systems. Next, we provide a thorough account
of backdoor attack techniques and their applications, defenses against such
attacks, and mitigation techniques for removing implanted backdoors. We then
provide a detailed review and analysis of evaluation metrics, benchmark
datasets, threat models, and challenges related to backdoor learning in NLP.
Ultimately, our work aims to crystallize and contextualize the landscape of
existing literature in backdoor learning for the text domain and motivate
further research in the field. To this end, we identify troubling gaps in the
literature and offer insights and ideas into open challenges and future
research directions. Finally, we provide a GitHub repository with a list of
backdoor learning papers that will be continuously updated at
https://github.com/marwanomar1/Backdoor-Learning-for-NLP
Polynomial-time targeted attacks on coin tossing for any number of corruptions
Consider an n-message coin-tossing protocol between n parties P_1, …, P_n, in which P_i broadcasts a single message w_i in round i (possibly based on the previously shared messages) and at the end they agree on a bit b. A k-replacing adversary can change up to k of the messages as follows. In every round i, the adversary, who knows all the messages broadcast so far as well as the message w_i that P_i has prepared to send, can decide to replace the prepared message w_i with a message of its own choice. A targeted adversary prefers the outcome b = 1, and its bias is defined as μ' − μ, where μ' (resp. μ) refers to the probability of outputting b = 1 when the attack happens (resp. does not happen). In this work, we study k-replacing targeted attacks, their computational efficiency, and their optimality, for all k ∈ {1, …, n}.
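The setting above can be illustrated with a small Monte Carlo simulation. This is a minimal sketch, not the paper's construction: it assumes the simple majority protocol (each party broadcasts one uniform bit) and a greedy k-replacing adversary that overwrites the first k prepared 0-messages with 1; all function names are ours.

```python
import random

def run_protocol(n, k, attack):
    """One execution of the n-party majority coin toss (n odd).

    Each party broadcasts one uniformly random bit; a k-replacing
    adversary preferring outcome b = 1 may overwrite up to k of the
    prepared messages (greedily: any prepared 0-bit) with 1.
    """
    budget = k if attack else 0
    messages = []
    for _ in range(n):
        w = random.randint(0, 1)   # message prepared by party i
        if budget > 0 and w == 0:  # adversary sees w before it is sent
            w = 1                  # and replaces it with its preferred bit
            budget -= 1
        messages.append(w)
    return int(2 * sum(messages) > n)  # b = majority bit

def estimate_bias(n=101, k=5, trials=50_000):
    """Estimate mu' - mu by comparing attacked and honest executions."""
    mu = sum(run_protocol(n, k, False) for _ in range(trials)) / trials
    mu_attacked = sum(run_protocol(n, k, True) for _ in range(trials)) / trials
    return mu_attacked - mu
```

For n = 101 and k = 5 the estimated bias comes out well above zero, since replacing k zero-messages effectively lowers the majority threshold by k.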
Large messages: When the messages are allowed to be arbitrarily long, we show that polynomial-time k-replacing targeted attacks can achieve bias Ω(μk/√n) for any k (and any protocol), which is optimal up to a constant factor for any k ≤ √n. Previously, it was known how to achieve such bias only for k = Ω(√n) (Komargodski-Raz [DISC '18], Mahloujifar-Mahmoody [ALT '19], and Etesami-Mahloujifar-Mahmoody [SODA '20]). This proves a computational variant of the isoperimetric inequality for product spaces under Hamming distance. As a corollary, we also obtain improved poly(n)-time targeted poisoning attacks on deterministic learners, in which the adversary can increase the probability of any efficiently testable bad event over the produced model from μ to μ + Ω(μk/√n) by changing k out of n training examples.
Binary messages: When the messages are uniformly random bits, we show that if f(k, μ) is the probability of falling into a Hamming ball of radius k around a set of measure μ (under the uniform distribution over {0,1}^n), then polynomial-time k-replacing targeted attacks can achieve μ' = f(k, μ), which is optimal due to the simple majority protocol. Thus, as a corollary, we obtain an alternative proof of Harper's celebrated vertex isoperimetric inequality in which the optimal adversary (who maps random points into a set of measure μ by changing at most k bits) is limited to be online and run in polynomial time. Previously, Lichtenstein, Linial, and Saks [Combinatorica '89] showed how to achieve such bias using computationally unbounded attacks, which is optimal for adaptive adversaries who decide on corrupting parties before seeing their messages.
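For the majority protocol the quantities involved can be computed exactly, which makes the optimality claim concrete. The following is a sketch with our own helper names, assuming n odd: replacing k messages with 1s lowers the effective majority threshold from t to t − k, so the attacked success probability μ' is exactly the measure of the radius-k Hamming ball around the majority set.

```python
from math import comb

def binom_tail(n, t):
    """Pr[Binomial(n, 1/2) >= t], computed exactly."""
    return sum(comb(n, j) for j in range(t, n + 1)) / 2 ** n

def majority_bias(n, k):
    """Exact (mu, bias) of a k-replacing attack on n-bit majority.

    Without the attack, b = 1 iff at least t = (n + 1) // 2 of the n
    uniform bits are 1, so mu = Pr[Bin(n, 1/2) >= t].  Replacing k
    messages with 1s lowers the threshold to t - k, so mu' equals the
    measure of the radius-k Hamming ball around the majority set.
    """
    t = (n + 1) // 2  # majority threshold (n odd)
    mu = binom_tail(n, t)
    mu_attacked = binom_tail(n, max(t - k, 0))
    return mu, mu_attacked - mu
```

For n = 101, μ = 1/2 exactly (by symmetry of the binomial), and the bias for k = 5 comes out roughly 0.34, consistent with the Θ(k/√n) growth for small k.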