Faster Reinforcement Learning Using Active Simulators
In this work, we propose several online methods to build a \emph{learning
curriculum} from a given set of target-task-specific training tasks in order to
speed up reinforcement learning (RL). These methods can decrease the total
training time needed by an RL agent compared to training on the target task
from scratch. Unlike traditional transfer learning, we consider creating a
sequence from several training tasks in order to provide the most benefit in
terms of reducing the total time to train.
Our methods utilize the learning trajectory of the agent on the curriculum
tasks seen so far to decide which tasks to train on next. An attractive feature
of our methods is that they are weakly coupled to the choice of the RL
algorithm as well as the transfer learning method. Further, when there is
domain information available, our methods can incorporate such knowledge to
further speed up the learning. We experimentally show that these methods can be
used to obtain suitable learning curricula that speed up the overall training
time on two different domains.
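The trajectory-driven task selection described in this abstract could be sketched, under strong simplifications, as a greedy choice over a learning-progress score with occasional exploration. All names here (`select_next_task`, the returns-based progress proxy) are illustrative stand-ins, not the paper's actual method.

```python
import random

def select_next_task(history, tasks, eps=0.1):
    """Pick the next curriculum task greedily by learning progress.

    history maps task -> list of episode returns observed so far; the
    score is the improvement from first to latest return, a crude proxy
    for the agent's learning trajectory on that task.
    """
    def progress(task):
        returns = history.get(task, [])
        if len(returns) < 2:
            return float("inf")  # unseen tasks are tried first
        return returns[-1] - returns[0]

    if random.random() < eps:    # occasional exploration
        return random.choice(tasks)
    return max(tasks, key=progress)

history = {"easy": [0.1, 0.6], "medium": [0.2, 0.25]}
tasks = ["easy", "medium", "hard"]
```

With `eps=0.0` the selector is deterministic: it first tries tasks with no history, then favors the task showing the steepest improvement.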
Using Task Descriptions in Lifelong Machine Learning for Improved Performance and Zero-Shot Transfer
Knowledge transfer between tasks can improve the performance of learned
models, but requires an accurate estimate of the inter-task relationships to
identify the relevant knowledge to transfer. These inter-task relationships are
typically estimated based on training data for each task, which is inefficient
in lifelong learning settings where the goal is to learn each consecutive task
rapidly from as little data as possible. To reduce this burden, we develop a
lifelong learning method based on coupled dictionary learning that utilizes
high-level task descriptions to model the inter-task relationships. We show
that using task descriptors improves the performance of the learned task
policies, providing both theoretical justification for the benefit and
empirical demonstration of the improvement across a variety of learning
problems. Given only the descriptor for a new task, the lifelong learner is
also able to accurately predict a model for the new task through zero-shot
learning using the coupled dictionary, eliminating the need to gather training
data before addressing the task.
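The zero-shot step in this abstract can be sketched with toy coupled dictionaries: the task descriptor is encoded against one dictionary, and the shared code is read out through the other to predict policy parameters. Sizes and names are illustrative, and plain least squares stands in for the sparse coding step the method would actually use.

```python
import numpy as np

# Toy coupled dictionaries (illustrative sizes, not from the paper):
# L maps shared codes to policy parameters, D maps them to descriptors.
rng = np.random.default_rng(0)
k, d_theta, d_desc = 3, 5, 4
L = rng.normal(size=(d_theta, k))   # policy-parameter dictionary
D = rng.normal(size=(d_desc, k))    # descriptor dictionary, coupled to L

def zero_shot_policy(descriptor):
    """Predict parameters for an unseen task from its descriptor alone:
    solve descriptor ~ D @ s for the shared code s, then read out L @ s.
    (Least squares stands in for sparse coding.)"""
    s, *_ = np.linalg.lstsq(D, descriptor, rcond=None)
    return L @ s

true_s = np.array([1.0, -0.5, 0.2])
theta = zero_shot_policy(D @ true_s)
```

Because both dictionaries share the code `s`, a descriptor observed for a new task is enough to recover a model without any training data for that task.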
Moving Target Defense for Deep Visual Sensing against Adversarial Examples
Deep learning based visual sensing has achieved attractive accuracy but is
shown vulnerable to adversarial example attacks. Specifically, once the
attackers obtain the deep model, they can construct adversarial examples to
mislead the model to yield wrong classification results. Deployable adversarial
examples such as small stickers pasted on the road signs and lanes have been
shown effective in misleading advanced driver-assistance systems. Many existing
countermeasures against adversarial examples build their security on the
attackers' ignorance of the defense mechanisms. Thus, they fall short of
following Kerckhoffs's principle and can be subverted once the attackers know
the details of the defense. This paper applies the strategy of moving target
defense (MTD) to generate multiple new deep models after system deployment
that collaboratively detect and thwart adversarial examples. Our MTD
design is based on the adversarial examples' minor transferability to models
differing from the one (e.g., the factory-designed model) used for attack
construction. The post-deployment quasi-secret deep models significantly
increase the bar for the attackers to construct effective adversarial examples.
We also apply the technique of serial data fusion with early stopping to reduce
the inference time by a factor of up to 5 while maintaining the sensing and
defense performance. Extensive evaluation based on three datasets including a
road sign image database and a GPU-equipped Jetson embedded computing board
shows the effectiveness of our approach.
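The serial data fusion with early stopping mentioned in this abstract might be sketched as follows: quasi-secret models are queried one at a time, and inference stops as soon as enough of them agree on a label. The function name, the agreement threshold, and the stand-in "models" are all illustrative.

```python
from collections import Counter

def serial_fusion(models, x, agree=3):
    """Query deployed models one at a time; stop early once `agree`
    models have voted for the same label, saving inference time."""
    votes = Counter()
    for i, model in enumerate(models, 1):
        votes[model(x)] += 1
        label, count = votes.most_common(1)[0]
        if count >= agree:      # early stop: enough agreement
            return label, i     # i = models actually evaluated
    label, _ = votes.most_common(1)[0]
    return label, len(models)

# Five stand-in "models" that all agree on a benign input:
models = [lambda x: x % 2 for _ in range(5)]
label, used = serial_fusion(models, 4, agree=3)
```

On agreeing inputs only 3 of the 5 models run, which is the source of the inference-time savings; disagreement (as adversarial inputs would cause) forces more models to be consulted.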
Monocular Depth Estimators: Vulnerabilities and Attacks
Recent advancements in neural networks have led to reliable monocular depth
estimation. Monocular depth estimation techniques have the upper hand over
traditional depth estimation techniques, as they need only one image during
inference. Depth estimation is one of the essential tasks in robotics, and
monocular depth estimation has a wide variety of safety-critical applications,
such as self-driving cars and surgical devices. The robustness of such
techniques is therefore crucial. Recent works have shown that these deep
neural networks are highly vulnerable to adversarial samples for tasks like
classification, detection, and segmentation. These adversarial samples can
completely ruin the output of the system, making their credibility in real-time
deployment questionable. In this paper, we investigate the robustness of
state-of-the-art monocular depth estimation networks against adversarial
attacks. Our experiments show that tiny perturbations on an image that are
invisible to the naked eye (perturbation attack) and corruption of less than
about 1% of an image (patch attack) can affect the depth estimation
drastically. We introduce a novel deep feature annihilation loss that corrupts
the hidden feature space representation, forcing the decoder of the network to
output poor depth maps. White-box and black-box tests confirm the
effectiveness of the proposed attack. We also perform adversarial example
transferability tests, mainly cross-data transferability.
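The feature-corruption idea behind a loss like this can be sketched on a toy linear encoder: take one signed-gradient step that shrinks the hidden feature norm, the direction an attacker would perturb the input to degrade the decoder's output. The linear "network", the step size, and the function name are assumptions for illustration, not the paper's loss.

```python
import numpy as np

def feature_annihilation_step(W, x, eps=0.01):
    """One signed-gradient step reducing the hidden-feature energy
    ||W @ x||^2 of a toy linear encoder -- a stand-in for corrupting a
    deep network's intermediate representation.
    Gradient of ||W x||^2 w.r.t. x is 2 W^T W x."""
    grad = 2.0 * W.T @ (W @ x)
    return x - eps * np.sign(grad)   # descend to annihilate features

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 6))   # toy encoder weights
x = rng.normal(size=6)        # toy input
x_adv = feature_annihilation_step(W, x)
```

The per-pixel perturbation is bounded by `eps`, mirroring the "invisible to the naked eye" constraint, while the feature energy reaching the decoder drops.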
An Optimal Online Method of Selecting Source Policies for Reinforcement Learning
Transfer learning significantly accelerates the reinforcement learning
process by exploiting relevant knowledge from previous experiences. The problem
of optimally selecting source policies during the learning process is of great
importance yet challenging. There has been little theoretical analysis of this
problem. In this paper, we develop an optimal online method to select source
policies for reinforcement learning. This method formulates online source
policy selection as a multi-armed bandit problem and augments Q-learning with
policy reuse. We provide theoretical guarantees of the optimal selection
process and convergence to the optimal policy. In addition, we conduct
experiments on a grid-based robot navigation domain to demonstrate its
efficiency and robustness by comparison with a state-of-the-art transfer
learning method.
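The multi-armed bandit formulation of source policy selection described in this abstract can be sketched with the standard UCB1 index: each source policy is an arm, and the agent picks the arm maximizing empirical value plus an exploration bonus. The toy returns and names are illustrative; the paper's reward signal comes from the Q-learning-with-policy-reuse process itself.

```python
import math

def ucb1_select(counts, means, t):
    """UCB1 index over candidate source policies: choose the arm
    maximizing empirical mean + exploration bonus. Untried arms first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(range(len(counts)),
               key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))

# Two stand-in source policies: reusing arm 0 yields return 1.0, arm 1 yields 0.2.
counts, means = [0, 0], [0.0, 0.0]
rewards = [1.0, 0.2]
for t in range(1, 51):
    arm = ucb1_select(counts, means, t)
    r = rewards[arm]                        # deterministic toy returns
    counts[arm] += 1
    means[arm] += (r - means[arm]) / counts[arm]   # incremental mean
```

The logarithmic exploration bonus is what yields the kind of regret guarantee an "optimal online selection" analysis builds on: the better source policy is reused almost always, while worse ones are sampled only O(log t) times.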
Gotta Catch 'Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks
Deep neural networks (DNN) are known to be vulnerable to adversarial attacks.
Numerous efforts either try to patch weaknesses in trained models, or try to
make it difficult or costly to compute adversarial examples that exploit them.
In our work, we explore a new "honeypot" approach to protect DNN models. We
intentionally inject trapdoors, honeypot weaknesses in the classification
manifold that attract attackers searching for adversarial examples. Attackers'
optimization algorithms gravitate towards trapdoors, leading them to produce
attacks similar to trapdoors in the feature space. Our defense then identifies
attacks by comparing neuron activation signatures of inputs to those of
trapdoors. In this paper, we introduce trapdoors and describe an implementation
of a trapdoor-enabled defense. First, we analytically prove that trapdoors
shape the computation of adversarial attacks so that attack inputs will have
feature representations very similar to those of trapdoors. Second, we
experimentally show that trapdoor-protected models can detect, with high
accuracy, adversarial examples generated by state-of-the-art attacks (PGD,
optimization-based CW, Elastic Net, BPDA), with negligible impact on normal
classification. These results generalize across classification domains,
including image, facial, and traffic-sign recognition. We also present
significant results measuring trapdoors' robustness against customized adaptive
attacks (countermeasures).
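The detection step in this abstract, comparing an input's neuron activation signature to the trapdoor's, can be sketched as a cosine-similarity test. The signature vectors and the threshold are toy assumptions; the paper derives signatures from internal neuron activations of the protected model.

```python
import numpy as np

def is_trapdoored(activation, signature, threshold=0.9):
    """Flag an input as a likely adversarial example if its activation
    vector is suspiciously similar (cosine) to the trapdoor signature."""
    cos = activation @ signature / (
        np.linalg.norm(activation) * np.linalg.norm(signature))
    return cos >= threshold

signature = np.array([1.0, 0.0, 1.0, 0.0])   # stand-in trapdoor signature
attack = np.array([0.9, 0.1, 1.1, 0.0])      # optimizer pulled toward trapdoor
benign = np.array([0.1, 1.0, 0.0, 0.8])      # unrelated normal input
```

Because attackers' optimizers gravitate toward the injected trapdoor, adversarial inputs land near the signature while benign inputs do not, which is what makes the threshold test discriminative.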
Learning Curriculum Policies for Reinforcement Learning
Curriculum learning in reinforcement learning is a training methodology that
seeks to speed up learning of a difficult target task, by first training on a
series of simpler tasks and transferring the knowledge acquired to the target
task. Automatically choosing a sequence of such tasks (i.e. a curriculum) is an
open problem that has been the subject of much recent work in this area. In
this paper, we build upon a recent method for curriculum design, which
formulates the curriculum sequencing problem as a Markov Decision Process. We
extend this model to handle multiple transfer learning algorithms, and show for
the first time that a curriculum policy over this MDP can be learned from
experience. We explore various representations that make this possible, and
evaluate our approach by learning curriculum policies for multiple agents in
two different domains. The results show that our method produces curricula that
can train agents to perform on a target task as fast as or faster than
existing methods.
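The curriculum MDP described in this abstract can be sketched with one tabular Q-learning update, where states summarize the agent's learning progress and actions are candidate tasks. The state names, task set, and reward are illustrative toys; the paper's state representations are far richer.

```python
from collections import defaultdict

TASKS = ["easy", "hard", "target"]

def q_update(Q, state, task, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update for a curriculum MDP: states encode
    the agent's progress, actions pick the next training task, and the
    reward reflects how much that task sped up learning."""
    best_next = max(Q[(next_state, t)] for t in TASKS)
    Q[(state, task)] += alpha * (reward + gamma * best_next - Q[(state, task)])

Q = defaultdict(float)
# Training on "easy" from a novice state yields quick progress (reward 1.0):
q_update(Q, "novice", "easy", 1.0, "intermediate")
```

Learning such a Q-function from experience is what turns curriculum sequencing into a policy: at each stage the agent consults Q to pick the next task rather than following a fixed hand-designed order.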
Adversarial Examples - A Complete Characterisation of the Phenomenon
We provide a complete characterisation of the phenomenon of adversarial
examples - inputs intentionally crafted to fool machine learning models. We aim
to cover all the important concerns in this field of study: (1) the conjectures
on the existence of adversarial examples, (2) the security, safety and
robustness implications, (3) the methods used to generate and (4) protect
against adversarial examples and (5) the ability of adversarial examples to
transfer between different machine learning models. We provide ample background
information in an effort to make this document self-contained. Therefore, this
document can be used as survey, tutorial or as a catalog of attacks and
defences using adversarial examples.
Towards a Robust Deep Neural Network in Texts: A Survey
Deep neural networks (DNNs) have achieved remarkable success in various tasks
(e.g., image classification, speech recognition, and natural language
processing). However, research has shown that DNN models are vulnerable to
adversarial examples, which cause incorrect predictions when imperceptible
perturbations are added to normal inputs. Adversarial examples in the image
domain have been well investigated, but research on texts remains limited,
let alone a comprehensive survey of the field. In this paper, we aim at
presenting a comprehensive understanding of adversarial attacks and
corresponding mitigation strategies in texts. Specifically, we first give a
taxonomy of adversarial attacks and defenses in texts from the perspective of
different natural language processing (NLP) tasks, and then introduce how to
build a robust DNN model via testing and verification. Finally, we discuss the
existing challenges of adversarial attacks and defenses in texts and present
the future research directions in this emerging field.
Land-Cover Classification with High-Resolution Remote Sensing Images Using Transferable Deep Models
In recent years, large amounts of high-spatial-resolution remote sensing
(HRRS) images have become available for land-cover mapping. However, due to the complex
information brought by the increased spatial resolution and the data
disturbances caused by different conditions of image acquisition, it is often
difficult to find an efficient method for achieving accurate land-cover
classification with high-resolution and heterogeneous remote sensing images. In
this paper, we propose a scheme to apply a deep model obtained from a labeled
land-cover dataset to classify unlabeled HRRS images. The main idea is to rely
on deep neural networks to represent the contextual information contained in
different types of land cover, and to propose a pseudo-labeling and sample
selection scheme for improving the transferability of deep models. More
precisely, a deep convolutional neural network (CNN) is first pre-trained with a
well-annotated land-cover dataset, referred to as the source data. Then, given
a target image with no labels, the pre-trained CNN model is utilized to
classify the image in a patch-wise manner. The patches with high confidence are
assigned with pseudo-labels and employed as the queries to retrieve related
samples from the source data. The pseudo-labels confirmed with the retrieved
results are regarded as supervised information for fine-tuning the pre-trained
deep model. To obtain a pixel-wise land-cover classification with the target
image, we rely on the fine-tuned CNN and develop a hybrid classification by
combining patch-wise classification and hierarchical segmentation. In addition,
we create a large-scale land-cover dataset containing 150 Gaofen-2 satellite
images for CNN pre-training. Experiments on multi-source HRRS images show
encouraging results and demonstrate the applicability of the proposed scheme to
land-cover classification.
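The pseudo-labeling step in this abstract, keeping only patches classified with high confidence, can be sketched as a simple threshold on the patch-wise class probabilities. The threshold value and toy probability matrix are illustrative assumptions.

```python
import numpy as np

def pseudo_label(probs, threshold=0.9):
    """Keep only patches whose top class probability clears the
    threshold; returns (kept patch indices, their pseudo-labels),
    which would then be used to retrieve related source samples."""
    conf = probs.max(axis=1)
    keep = np.where(conf >= threshold)[0]
    return keep, probs[keep].argmax(axis=1)

# Toy patch-wise probabilities: 4 patches over 3 land-cover classes.
probs = np.array([[0.95, 0.03, 0.02],
                  [0.40, 0.35, 0.25],
                  [0.05, 0.05, 0.90],
                  [0.60, 0.30, 0.10]])
keep, labels = pseudo_label(probs)
```

Only the first and third patches survive the 0.9 cutoff; the low-confidence patches are excluded so that noisy pseudo-labels do not contaminate the fine-tuning of the pre-trained model.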