Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization
Solving for adversarial examples with projected gradient descent has been demonstrated to be highly effective at fooling neural network-based classifiers. In the black-box setting, however, the attacker is limited to query access to the network, and finding a successful adversarial example becomes much more difficult. To this end, recent methods estimate the true gradient signal from input queries, but at the cost of excessive queries. We propose an efficient discrete surrogate for the optimization problem that does not require gradient estimation and is consequently free of first-order update hyperparameters to tune. Our experiments on CIFAR-10 and ImageNet show state-of-the-art black-box attack performance with a significant reduction in required queries compared to a number of recently proposed methods. The source code is available at https://github.com/snu-mllab/parsimonious-blackbox-attack.
Comment: Accepted and to appear at ICML 2019
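The abstract's idea of a discrete, gradient-free surrogate can be illustrated with a minimal sketch: choose a +/- eps sign per pixel block and greedily flip whichever block flip raises the attack loss, querying the model only on candidate flips. This is a simplified stand-in, not the paper's actual combinatorial algorithm; `model_loss`, the block size, and the greedy scan order are all illustrative assumptions.

```python
import numpy as np

def greedy_block_attack(x, model_loss, eps=0.1, block=4, max_queries=200):
    """Toy query-limited attack: greedy sign search over pixel blocks.

    x          -- grayscale image in [0, 1], shape (h, w), h and w divisible by block
    model_loss -- black-box scalar loss to *maximize* (one call = one query)
    """
    h, w = x.shape
    nb = (h // block) * (w // block)
    signs = np.ones(nb)                        # start with all-(+eps) blocks

    def perturb(s):
        d = np.repeat(np.repeat(s.reshape(h // block, w // block),
                                block, 0), block, 1)
        return np.clip(x + eps * d, 0.0, 1.0)

    best = model_loss(perturb(signs))
    queries = 1
    improved = True
    while improved and queries < max_queries:
        improved = False
        for i in range(nb):
            if queries >= max_queries:
                break
            signs[i] *= -1                     # tentatively flip one block
            cand = model_loss(perturb(signs))
            queries += 1
            if cand > best:                    # keep flips that raise the loss
                best = cand
                improved = True
            else:
                signs[i] *= -1                 # revert the flip
    return perturb(signs), best, queries
```

Because the search space is discrete (one sign per block), there is no step size or momentum to tune, which mirrors the abstract's point about being free of first-order update hyperparameters.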
Direct Preference-based Policy Optimization without Reward Modeling
Preference-based reinforcement learning (PbRL) enables RL agents to learn from preferences, which is particularly useful when formulating a reward function is challenging. Existing PbRL methods generally follow a two-step procedure: they first learn a reward model from the given preference data and then apply off-the-shelf reinforcement learning algorithms with the learned reward model. However, obtaining an accurate reward model solely from preference information, especially when the preferences come from human teachers, can be difficult. Instead, we propose a PbRL algorithm that learns directly from preferences without any reward modeling. To achieve this, we adopt a contrastive learning framework to design a novel policy scoring metric that assigns high scores to policies that align with the given preferences. We apply our algorithm to offline RL tasks with actual human preference labels and show that it outperforms or is on par with existing PbRL methods. Notably, on high-dimensional control tasks, our algorithm surpasses offline RL methods that learn from ground-truth reward information. Finally, we show that our algorithm can be successfully applied to fine-tune large language models.
Comment: NeurIPS 202
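To make the notion of a preference-aligned policy score concrete, here is a minimal sketch, not the paper's actual contrastive metric: a policy scores highly when it assigns higher log-likelihood to preferred trajectory segments than to dispreferred ones. The tabular policy format, `log_prob` helper, and log-sigmoid margin are all illustrative assumptions.

```python
import numpy as np

def log_prob(policy, segment):
    """Log-likelihood of a segment's actions under a tabular policy.

    policy  -- (n_states, n_actions) array of action probabilities
    segment -- list of (state, action) pairs
    """
    return sum(np.log(policy[s, a] + 1e-12) for s, a in segment)

def preference_score(policy, prefs):
    """Average log-sigmoid margin over preference pairs (higher is better).

    prefs -- list of (preferred_segment, dispreferred_segment) pairs
    """
    total = 0.0
    for preferred, dispreferred in prefs:
        margin = log_prob(policy, preferred) - log_prob(policy, dispreferred)
        total += -np.log1p(np.exp(-margin))    # log-sigmoid of the margin
    return total / len(prefs)
```

A policy whose action distribution matches the preferred behavior receives a higher score, so the score can be maximized directly over policies without ever fitting an intermediate reward model.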
T-DNA insertion mutants reveal complex expression patterns of the aldehyde dehydrogenase 3H1 locus in Arabidopsis thaliana
The Arabidopsis thaliana aldehyde dehydrogenase 3H1 gene (ALDH3H1; AT1G44170) belongs to family 3 of the plant aldehyde dehydrogenase superfamily. The full-length transcript of the corresponding gene comprises an open reading frame of 1583 bp and encodes a protein of 484 amino acid residues. Gene expression studies have shown that this transcript accumulates mainly in the roots of 4-week-old plants following abscisic acid, dehydration, and NaCl treatments. The current study provides experimental evidence that the ALDH3H1 locus generates at least five alternative transcript variants in addition to the previously described ALDH3H1 mRNA. The alternative transcripts accumulated in wild-type plants at a low level but were upregulated in a mutant carrying a T-DNA insertion in the first exon of the gene. Expression of the transcript isoforms involved alternative splicing combined with alternative promoter usage. The transcript isoforms were differentially expressed in the roots and shoots and showed developmental stage- and tissue-specific expression patterns. These data support the hypothesis that alternative isoforms produced by alternative splicing or alternative promoters regulate the abundance of the constitutively spliced, functional variants.
Preemptive Image Robustification for Protecting Users against Man-in-the-Middle Adversarial Attacks
Deep neural networks have become the driving force of modern image recognition systems. However, the vulnerability of neural networks to adversarial attacks poses a serious threat to the people affected by these systems. In this paper, we focus on a real-world threat model in which a Man-in-the-Middle adversary maliciously intercepts and perturbs the images web users upload online. This type of attack can raise severe ethical concerns beyond simple performance degradation. To prevent it, we devise a novel bi-level optimization algorithm that finds points in the vicinity of natural images that are robust to adversarial perturbations. Experiments on CIFAR-10 and ImageNet show that our method can effectively robustify natural images within the given modification budget. We also show that the proposed method improves robustness when used jointly with randomized smoothing.
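The bi-level structure described above, an inner adversary maximizing the loss and an outer loop nudging the image within a budget to shrink that worst-case loss, can be illustrated on a one-dimensional toy problem. This is a heavily simplified sketch under stated assumptions: the "loss" is an explicit quadratic L(z) = (z - target)^2 so both levels have closed-form steps, whereas the actual method runs PGD against a neural network.

```python
import numpy as np

def inner_attack(z, target, eps):
    """Inner maximization: worst-case point of the toy quadratic loss
    within the interval [z - eps, z + eps]."""
    lo = (z - eps - target) ** 2
    hi = (z + eps - target) ** 2
    return (z - eps) if lo > hi else (z + eps)   # farther endpoint wins

def robustify(x, target, eps=0.2, delta=0.5, lr=0.1, steps=100):
    """Outer minimization: move x (staying within delta of its start)
    to reduce the worst-case loss found by the inner adversary."""
    z = x
    for _ in range(steps):
        adv = inner_attack(z, target, eps)       # adversary's best move
        grad = 2.0 * (adv - target)              # dL/dz at the adversarial point
        z = np.clip(z - lr * grad, x - delta, x + delta)
    return z
```

The outer loop settles near the point whose entire eps-neighborhood has low loss, which is the toy analogue of preemptively robustifying an image before an adversary can touch it.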