Spectral Signatures in Backdoor Attacks
A recent line of work has uncovered a new form of data poisoning: so-called
\emph{backdoor} attacks. These attacks are particularly dangerous because they
do not affect a network's behavior on typical, benign data. Rather, the network
only deviates from its expected output when triggered by a perturbation planted
by an adversary.
In this paper, we identify a new property of all known backdoor attacks,
which we call \emph{spectral signatures}. This property allows us to utilize
tools from robust statistics to thwart the attacks. We demonstrate the efficacy
of these signatures in detecting and removing poisoned examples on real image
sets and state-of-the-art neural network architectures. We believe that
understanding spectral signatures is a crucial first step towards designing ML
systems secure against such backdoor attacks.
Comment: 16 pages, accepted to NIPS 201
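The abstract above does not spell out the detection procedure, so the following is only a minimal sketch of one way a spectral outlier score from robust statistics might be computed. It assumes that poisoned examples shift the learned representations along a common direction, so that they stand out when projected onto the top singular direction of the mean-centered data. The function names (top_direction, outlier_scores, flag_suspects), the synthetic data, and the choice of how many points to flag are all hypothetical and are not taken from the paper.

```python
import math
import random


def top_direction(rows, iters=100):
    """Approximate the top right singular vector of `rows` by power iteration."""
    d = len(rows[0])
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        # w = (R^T R) v, accumulated row by row as (r . v) * r
        w = [0.0] * d
        for r in rows:
            dot = sum(a * b for a, b in zip(r, v))
            for j in range(d):
                w[j] += dot * r[j]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def outlier_scores(reps):
    """Squared projection of each mean-centered representation onto the
    top singular direction (assumed scoring rule, for illustration)."""
    n, d = len(reps), len(reps[0])
    mean = [sum(r[j] for r in reps) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in reps]
    v = top_direction(centered)
    return [sum(a * b for a, b in zip(r, v)) ** 2 for r in centered]


def flag_suspects(reps, num_to_flag):
    """Return the indices of the `num_to_flag` highest-scoring points."""
    scores = outlier_scores(reps)
    ranked = sorted(range(len(reps)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:num_to_flag])


# Synthetic stand-in for learned representations: 180 clean points and
# 20 "poisoned" points whose first coordinate is shifted (an assumption).
rng = random.Random(0)
dim = 8
clean = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(180)]
poisoned = [
    [rng.gauss(0, 1) + (8.0 if j == 0 else 0.0) for j in range(dim)]
    for _ in range(20)
]
reps = clean + poisoned

suspects = flag_suspects(reps, num_to_flag=20)
caught = sum(1 for i in suspects if i >= 180)
print(f"flagged {len(suspects)} points, {caught} of them poisoned")
```

In a real pipeline the rows of reps would be hidden-layer activations for the training examples of a single label, and the flagged examples would be removed before retraining. Both of those steps are assumptions here rather than details given in the abstract.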
Data Poisoning Attacks in Contextual Bandits
We study offline data poisoning attacks in contextual bandits, a class of
reinforcement learning problems with important applications in online
recommendation and adaptive medical treatment, among others. We provide a
general attack framework based on convex optimization and show that by slightly
manipulating rewards in the data, an attacker can force the bandit algorithm to
pull a target arm for a target contextual vector. The target arm and target
contextual vector are both chosen by the attacker. That is, the attacker can
hijack the behavior of a contextual bandit. We also investigate the feasibility
and the side effects of such attacks, and identify future directions for
defense. Experiments on both synthetic and real-world data demonstrate the
efficiency of the attack algorithm.
Comment: GameSec 201
Multi-modal gated recurrent units for image description
Using a natural language sentence to describe the content of an image is a
challenging but very important task. It is challenging because a description
must not only capture objects contained in the image and the relationships
among them, but also be relevant and grammatically correct. In this paper, we
propose a multi-modal embedding model based on gated recurrent units (GRU) that
can generate variable-length descriptions for a given image. In the training
step, we apply a convolutional neural network (CNN) to extract the image
feature. The feature is then fed into the multi-modal GRU together with the
corresponding sentence representations. The multi-modal GRU learns the
inter-modal relations between image and sentence. In the testing step, when an
image is fed into our multi-modal GRU model, a sentence that describes the
image content is generated. The experimental results demonstrate that our
multi-modal GRU model obtains state-of-the-art performance on the Flickr8K,
Flickr30K and MS COCO datasets.
Comment: 25 pages, 7 figures, 6 tables, magazin