3 research outputs found

    Backdoor Learning for NLP: Recent Advances, Challenges, and Future Research Directions

    Although backdoor learning is an active research topic in the NLP domain, the literature lacks studies that systematically categorize and summarize backdoor attacks and defenses. To bridge this gap, we present a comprehensive and unifying study of backdoor learning for NLP by summarizing the literature in a systematic manner. We first present and motivate the importance of backdoor learning for building robust NLP systems. Next, we provide a thorough account of backdoor attack techniques and their applications, defenses against backdoor attacks, and mitigation techniques for removing backdoors. We then provide a detailed review and analysis of evaluation metrics, benchmark datasets, threat models, and challenges related to backdoor learning in NLP. Ultimately, our work aims to crystallize and contextualize the landscape of existing literature on backdoor learning for the text domain and to motivate further research in the field. To this end, we identify troubling gaps in the literature and offer insights into open challenges and future research directions. Finally, we provide a GitHub repository with a continuously updated list of backdoor learning papers at https://github.com/marwanomar1/Backdoor-Learning-for-NLP.

    Polynomial-time targeted attacks on coin tossing for any number of corruptions

    Consider an $n$-message coin-tossing protocol between $n$ parties $P_1,\dots,P_n$, in which $P_i$ broadcasts a single message $w_i$ in round $i$ (possibly based on the previously shared messages), and at the end they agree on a bit $b$. A $k$-replacing adversary $A_k$ can change up to $k$ of the messages as follows. In every round $i$, the adversary, who knows all the messages broadcast so far as well as the message $w_i$ that $P_i$ has prepared to send, can choose to replace the prepared message $w_i$ with one of its own choice. A targeted adversary prefers the outcome $b'=1$, and its bias is defined as $\mu'-\mu$, where $\mu'=\Pr[b'=1]$ (resp. $\mu=\Pr[b=1]$) is the probability of outputting $1$ when the attack happens (resp. does not happen). In this work, we study $k$-replacing targeted attacks, their computational efficiency, and their optimality, for all $k \in [n]$.

    Large messages: When the messages are allowed to be arbitrarily long, we show that polynomial-time $k$-replacing targeted attacks can achieve bias $\Omega(\mu k/\sqrt{n})$ for any $k$ (and any protocol), which is optimal up to a constant factor for any $\mu = \Theta(1)$. Previously, it was known how to achieve such bias only for $k = \Omega(\sqrt{n})$ (Komargodski-Raz [DISC'18], Mahloujifar-Mahmoody [ALT'19], and Etesami-Mahloujifar-Mahmoody [SODA'20]). This proves a computational variant of the isoperimetric inequality for product spaces under Hamming distance $k=o(\sqrt{n})$. As a corollary, we also obtain improved $\mathrm{poly}(n)$-time targeted poisoning attacks on deterministic learners, in which the adversary can increase the probability of any efficiently testable bad event over the produced model from $\mu=1/\mathrm{poly}(n)$ to $\mu + \Omega(\mu k/\sqrt{n})$ by changing $k$ out of $n$ training examples.

    Binary messages: When the messages $w_1,\dots,w_n$ are uniformly random bits, we show that if $\mu=\Pr[b=1]=\Pr[\sum_i w_i \geq t] = \beta^{(t)}_n$ for $t \in [n]$ is the probability of falling into a Hamming ball, then polynomial-time $k$-replacing targeted attacks can achieve $\mu'=\Pr[b'=1]=\beta^{(t-k)}_n$, which is optimal due to the simple majority protocol. Thus, as a corollary we obtain an alternative proof of Harper's celebrated vertex isoperimetric inequality in which the optimal adversary (which maps random points to a set of measure $\mu$ by changing at most $k$ bits) is restricted to be online and to run in polynomial time. Previously, Lichtenstein, Linial, and Saks [Combinatorica'89] showed how to achieve $\mu'=\Pr[b'=1] = \beta^{(t-k)}_{n-k}$ (using computationally unbounded attacks), which is optimal for adaptive adversaries who decide on corrupting parties before seeing their messages.
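    The binary-message claim can be checked concretely on the threshold (majority-style) protocol described in the abstract. The following is a minimal simulation sketch, not code from the paper: a greedy online adversary flips prepared 0-bits to 1 until its budget of $k$ replacements is exhausted, and its empirical success probability is compared against the exact binomial tail $\beta^{(t-k)}_n$. All names and parameter values here are illustrative assumptions.

```python
# Sketch: n-bit threshold coin tossing vs. a greedy k-replacing adversary.
# Assumption: the protocol outputs b = 1 iff sum(w_i) >= t (majority-style rule).
import random
from math import comb

def binomial_tail(n: int, t: int) -> float:
    """beta_n^{(t)} = Pr[Binomial(n, 1/2) >= t]."""
    t = max(t, 0)
    return sum(comb(n, s) for s in range(t, n + 1)) / 2**n

def run_protocol(n: int, t: int, k: int) -> tuple[int, int]:
    """One execution: returns (honest outcome b, attacked outcome b')."""
    honest_sum, attacked_sum, budget = 0, 0, k
    for _ in range(n):
        w = random.getrandbits(1)        # P_i prepares a uniform random bit
        honest_sum += w
        if w == 0 and budget > 0:        # adversary sees w before it is sent
            w, budget = 1, budget - 1    # ... and replaces it with 1
        attacked_sum += w
    return int(honest_sum >= t), int(attacked_sum >= t)

if __name__ == "__main__":
    n, t, k, trials = 51, 26, 5, 200_000
    results = [run_protocol(n, t, k) for _ in range(trials)]
    mu_hat = sum(b for b, _ in results) / trials
    mu_prime_hat = sum(bp for _, bp in results) / trials
    print(f"empirical mu  ~ {mu_hat:.4f}   (beta_n^(t)   = {binomial_tail(n, t):.4f})")
    print(f"empirical mu' ~ {mu_prime_hat:.4f}   (beta_n^(t-k) = {binomial_tail(n, t - k):.4f})")
```

    Under these assumptions, the attacked outcome is 1 exactly when the honest sum is at least $t-k$, so the empirical $\mu'$ should match $\beta^{(t-k)}_n$; the large-message attack achieving bias $\Omega(\mu k/\sqrt{n})$ for general protocols is not reproduced here.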