Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay
This paper introduces a novel method for learning how to play the most
difficult Atari 2600 games from the Arcade Learning Environment using deep
reinforcement learning. The proposed method, human checkpoint replay, consists
of using checkpoints sampled from human gameplay as starting points for the
learning process. This is meant to compensate for the difficulty that current
exploration strategies, such as epsilon-greedy, have in finding successful control
policies in games with sparse rewards. Like other deep reinforcement learning
architectures, our model uses a convolutional neural network that receives only
raw pixel inputs to estimate the state value function. We tested our method on
Montezuma's Revenge and Private Eye, two of the most challenging games from the
Atari platform. The results we obtained show a substantial improvement compared
to previous learning approaches, as well as over a random player. We also
propose a method for training deep reinforcement learning agents using human
gameplay experience, which we call human experience replay.
Comment: 6 pages, 2 figures, EGPAI 2016 - Evaluating General Purpose AI,
workshop held in conjunction with ECAI 2016
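As a hedged illustration of the checkpoint-replay idea (not the paper's Atari setup: the chain environment, the checkpoint pool, and all names below are invented for this sketch), the toy below starts tabular Q-learning episodes from states sampled from "human checkpoints" rather than from the initial state, in a sparse-reward chain where only the final state pays off:

```python
import random

N = 50  # chain of states 0..N-1; reward only on reaching state N-1

def run_episode(q, start_state, eps=0.5, max_steps=100):
    """One epsilon-greedy tabular Q-learning episode; returns True on success."""
    s = start_state
    for _ in range(max_steps):
        if random.random() < eps:
            a = random.randrange(2)                      # explore
        else:
            a = max((0, 1), key=lambda x: q[(s, x)])     # exploit
        s2 = min(s + 1, N - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == N - 1 else 0.0
        q[(s, a)] += 0.5 * (r + 0.9 * max(q[(s2, 0)], q[(s2, 1)]) - q[(s, a)])
        s = s2
        if r > 0:
            return True
    return False

def train(starts, episodes=200, seed=0):
    """Count successful episodes when starting from the given start states."""
    random.seed(seed)
    q = {(s, a): 0.0 for s in range(N) for a in (0, 1)}
    return sum(run_episode(q, random.choice(starts)) for _ in range(episodes))

# Checkpoints sampled from hypothetical human gameplay vs. always from scratch.
wins_checkpoint = train([0, 10, 25, 40, 48])
wins_scratch = train([0])
```

With the sparse reward at the far end of the chain, starts near the goal let the agent stumble onto the reward and bootstrap values, while scratch starts essentially never reach it, which mirrors the motivation the abstract gives.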
Estimate Exchange over Network is Good for Distributed Hard Thresholding Pursuit
We investigate an existing distributed algorithm for learning sparse signals
or data over networks. The algorithm is iterative and exchanges intermediate
estimates of a sparse signal over a network. This learning strategy using
exchange of intermediate estimates over the network requires a limited
communication overhead for information transmission. Our objective in this
article is to show that the strategy is good for learning in spite of limited
communication. In pursuit of this objective, we first provide a restricted
isometry property (RIP)-based theoretical analysis on convergence of the
iterative algorithm. Then, using simulations, we show that the algorithm
provides competitive performance in learning sparse signals vis-a-vis an
existing alternative distributed algorithm, one that exchanges more
information, including observations and system parameters.
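A minimal sketch of the strategy described above, assuming a fully connected network and exchanging only intermediate estimates (all dimensions, the averaging rule, and the step-size choice are illustrative, not the paper's exact algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k, nodes = 100, 40, 5, 4   # signal dim, measurements per node, sparsity, nodes

# Every node observes the same k-sparse signal through its own sensing matrix.
x_true = np.zeros(n)
support = rng.choice(n, k, replace=False)
x_true[support] = rng.standard_normal(k)
A = [rng.standard_normal((m, n)) / np.sqrt(m) for _ in range(nodes)]
y = [Ai @ x_true for Ai in A]

B = sum(Ai.T @ Ai for Ai in A) / nodes    # network-wide Gram matrix
mu = 1.0 / np.linalg.norm(B, 2)           # conservative step size for the sketch

def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-k:]
    out[keep] = v[keep]
    return out

x_hat = [np.zeros(n) for _ in range(nodes)]
for _ in range(100):
    # Local gradient step on each node's own measurements...
    local = [x_hat[i] + mu * A[i].T @ (y[i] - A[i] @ x_hat[i]) for i in range(nodes)]
    # ...then exchange and average the intermediate estimates over the network.
    avg = sum(local) / nodes
    x_hat = [hard_threshold(avg, k) for _ in range(nodes)]

rel_err = np.linalg.norm(x_hat[0] - x_true) / np.linalg.norm(x_true)
```

Only the length-n estimate vectors cross the network each iteration, which is the limited communication overhead the abstract refers to; observations and sensing matrices stay local.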
Computationally Efficient Deep Neural Network for Computed Tomography Image Reconstruction
Deep-neural-network-based image reconstruction has demonstrated promising
performance in medical imaging for under-sampled and low-dose scenarios.
However, it requires a large amount of memory and extensive time for
training. It is especially challenging to train the reconstruction networks for
three-dimensional computed tomography (CT) because of the high resolution of CT
images. The purpose of this work is to reduce the memory and time consumption
of the training of the reconstruction networks for CT to make it practical for
current hardware, while maintaining the quality of the reconstructed images.
We unrolled the proximal gradient descent algorithm for iterative image
reconstruction to finite iterations and replaced the terms related to the
penalty function with trainable convolutional neural networks (CNNs). The
network was trained greedily iteration by iteration in the image-domain on
patches, which requires a reasonable amount of memory and time on a mainstream
graphics processing unit (GPU). To overcome the local-minimum problem caused by
greedy learning, we used a deep U-Net as the CNN and incorporated the separable
quadratic surrogate with ordered subsets for data fidelity, so that the
solution could escape from poor local minima and achieve better image
quality.
The proposed method achieved image quality comparable to state-of-the-art
neural networks for CT image reconstruction on 2D sparse-view and limited-angle
problems on the low-dose CT challenge dataset.
Comment: 33 pages, 14 figures, accepted by Medical Physics
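The unrolling described above can be sketched in miniature. Below, a soft-threshold with per-iteration weights stands in for the trainable CNN that replaces the penalty term's proximal operator, and a small random matrix stands in for the CT system matrix; everything here is an invented toy, not the paper's trained network:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 64, 32
A = rng.standard_normal((m, n)) / np.sqrt(m)   # stand-in for the CT system matrix
x_true = np.zeros(n)
x_true[rng.choice(n, 6, replace=False)] = 1.0  # toy "image" with sparse structure
y = A @ x_true                                 # simulated measurements

L = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of the data term
alpha = 1.0 / L

def learned_prox(v, theta):
    """Stand-in for the trainable CNN replacing the penalty's proximal
    operator; here a soft-threshold whose weight would normally be learned."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

thetas = [0.05] * 10 + [0.02] * 10 + [0.005] * 30  # finite, unrolled iterations
x = np.zeros(n)
for theta in thetas:
    # One unrolled iteration: gradient step on the data term, then the "CNN".
    x = learned_prox(x - alpha * A.T @ (A @ x - y), theta)

rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

Training each iteration's parameters greedily, one at a time and on image patches, is what keeps the memory footprint small in the paper's 3D CT setting.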
Differentiable Greedy Networks
Optimal selection of a subset of items from a given set is a hard problem
that requires combinatorial optimization. In this paper, we propose a subset
selection algorithm that is trainable with gradient-based methods yet achieves
near-optimal performance via submodular optimization. We focus on the task of
identifying a relevant set of sentences for claim verification in the context
of the FEVER task. Conventional methods for this task look at sentences on
their individual merit and thus do not optimize the informativeness of
sentences as a set. We show that our proposed method, which builds on the idea
of unfolding a greedy algorithm into a computational graph, allows both
interpretability and gradient-based training. The proposed differentiable
greedy network (DGN) outperforms discrete optimization algorithms as well as
other baseline methods in terms of precision and recall.
Comment: Work in progress and under review
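The unfolding idea can be sketched as follows: each greedy step's discrete argmax over marginal gains is replaced by a softmax, so the k-step selection becomes a differentiable computation graph. The toy coverage objective and all names below are invented for illustration, not the paper's FEVER model:

```python
import numpy as np

def softmax(z, temp):
    z = z / temp
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def soft_greedy(cover, k, temp=0.05):
    """Unfolded greedy subset selection: each of k steps replaces the
    discrete argmax over marginal gains with a softmax, keeping the whole
    selection differentiable end to end."""
    n_items, n_elems = cover.shape
    c = np.zeros(n_elems)              # soft coverage accumulated so far
    sel = np.zeros(n_items)            # soft selection weights
    for _ in range(k):
        gains = cover @ (1.0 - c)      # marginal coverage gain per item
        p = softmax(gains, temp)       # differentiable surrogate for argmax
        sel += p
        c = 1.0 - (1.0 - c) * (1.0 - p @ cover)  # soft union, stays in [0, 1]
    return sel, c

rng = np.random.default_rng(0)
cover = (rng.random((6, 10)) < 0.4).astype(float)  # items x elements incidence
sel, c = soft_greedy(cover, k=3)
```

Lowering the temperature pushes each softmax toward a hard argmax, recovering the discrete greedy algorithm while the relaxed version remains trainable by gradient descent.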
Learning for Active 3D Mapping
We propose an active 3D mapping method for depth sensors, which allow
individual control of depth-measuring rays, such as the newly emerging
solid-state lidars. The method simultaneously (i) learns to reconstruct a dense
3D occupancy map from sparse depth measurements, and (ii) optimizes the
reactive control of depth-measuring rays. As a first step towards
online control optimization, we propose a fast prioritized greedy algorithm,
which needs to update its cost function for only a small fraction of possible
rays. The approximation ratio of the greedy algorithm is derived. An
experimental evaluation on a subset of the KITTI dataset demonstrates a
significant improvement in 3D map accuracy when learning-to-reconstruct
from sparse measurements is coupled with the optimization of depth-measuring
rays.
Comment: ICCV 2017 (oral). See video:
https://www.youtube.com/watch?v=KNex0zjeGY
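A prioritized greedy scheme of the kind the abstract mentions can be sketched with lazy evaluation: because coverage-style gains only shrink as rays are added, the algorithm re-evaluates only rays popped from the top of a priority queue. The voxel sets below are invented toy data, not KITTI:

```python
import heapq

def lazy_greedy(rays, gain, budget):
    """Budgeted greedy selection with lazy (prioritized) evaluation: a ray's
    marginal gain can only shrink as the selection grows, so only rays popped
    from the top of the heap are re-evaluated -- a small fraction of all rays."""
    heap = [(-gain(r, set()), r) for r in rays]
    heapq.heapify(heap)
    chosen = set()
    while heap and len(chosen) < budget:
        neg_g, r = heapq.heappop(heap)
        g = gain(r, chosen)                  # refresh the possibly stale gain
        if not heap or g >= -heap[0][0]:
            chosen.add(r)                    # still the best ray: select it
        else:
            heapq.heappush(heap, (-g, r))    # otherwise re-queue with fresh gain
    return chosen

# Toy model: each depth-measuring ray observes a set of map voxels.
voxels_seen = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 5, 6, 7}}

def coverage_gain(r, chosen):
    seen = set().union(*(voxels_seen[c] for c in chosen)) if chosen else set()
    return len(voxels_seen[r] - seen)

picked = lazy_greedy(list(voxels_seen), coverage_gain, budget=2)
```

For monotone submodular gains, this lazy variant returns exactly what naive greedy would, while skipping most gain re-computations, which is what makes per-frame ray control tractable.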
Parameter Space Noise for Exploration
Deep reinforcement learning (RL) methods generally engage in exploratory
behavior through noise injection in the action space. An alternative is to add
noise directly to the agent's parameters, which can lead to more consistent
exploration and a richer set of behaviors. Methods such as evolutionary
strategies use parameter perturbations, but discard all temporal structure in
the process and require significantly more samples. Combining parameter noise
with traditional RL methods allows us to combine the best of both worlds. We
demonstrate that both off- and on-policy methods benefit from this approach
through experimental comparison of DQN, DDPG, and TRPO on high-dimensional
discrete action environments as well as continuous control tasks. Our results
show that RL with parameter noise learns more efficiently than traditional RL
with action space noise and evolutionary strategies individually.Comment: Updated to camera-ready ICLR submissio
Majorization Minimization Technique for Optimally Solving Deep Dictionary Learning
The concept of deep dictionary learning has been recently proposed. Unlike
shallow dictionary learning, which learns a single level of dictionary to
represent the data, it uses multiple layers of dictionaries. So far, the
problem could only be solved in a greedy fashion; this was achieved by learning
a single dictionary layer at each stage, where the coefficients from the
previous layer acted as inputs to the subsequent layer (only the first layer
used the training samples as inputs). This was suboptimal: information flowed
from shallower to deeper layers but not the other way. This work proposes an
optimal solution to deep dictionary learning whereby all the layers of
dictionaries are solved simultaneously. We employ the Majorization Minimization
approach. Experiments carried out on benchmark datasets show
that optimal learning indeed improves over greedy piecemeal learning.
Comparisons with other unsupervised deep learning tools (stacked denoising
autoencoder, deep belief network, contractive autoencoder, and K-sparse
autoencoder) show that our method surpasses them in both accuracy
and speed.
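The majorization-minimization principle the abstract relies on can be shown on a minimal problem (a plain least-squares objective, not the paper's multi-layer dictionary model): each iteration minimizes a quadratic upper bound that touches the objective at the current point, which guarantees a monotone decrease.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
y = rng.standard_normal(20)
L = np.linalg.norm(A, 2) ** 2      # curvature bound: L*I dominates A^T A

def mm_step(x):
    """Minimize the quadratic majorizer
        g(x' | x) = f(x) + <grad f(x), x' - x> + (L/2) ||x' - x||^2,
    which upper-bounds f(x') = 0.5 ||A x' - y||^2 and touches it at x;
    its minimizer is a gradient step with step size 1/L."""
    return x - (A.T @ (A @ x - y)) / L

x = np.zeros(10)
losses = []
for _ in range(100):
    losses.append(0.5 * np.linalg.norm(A @ x - y) ** 2)
    x = mm_step(x)
```

The same descent guarantee is what lets MM handle the coupled multi-layer objective all at once instead of layer by layer.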
Unsupervised Deep Feature Extraction for Remote Sensing Image Classification
This paper introduces the use of single layer and deep convolutional networks
for remote sensing data analysis. Direct application to multi- and
hyper-spectral imagery of supervised (shallow or deep) convolutional networks
is very challenging given the high input data dimensionality and the relatively
small amount of available labeled data. Therefore, we propose the use of greedy
layer-wise unsupervised pre-training coupled with a highly efficient algorithm
for unsupervised learning of sparse features. The algorithm is rooted in sparse
representations and simultaneously enforces both population and lifetime sparsity
of the extracted features. We successfully illustrate the expressive
power of the extracted representations in several scenarios: classification of
aerial scenes, land-use classification in very high resolution (VHR) imagery,
and land-cover classification from multi- and hyper-spectral images. The
proposed algorithm clearly outperforms standard Principal Component Analysis
(PCA) and its kernel counterpart (kPCA), as well as current state-of-the-art
algorithms for aerial classification, while being extremely computationally
efficient at learning representations of data. Results show that single-layer
convolutional networks can extract powerful discriminative features only when
the receptive field accounts for neighboring pixels, and are preferred when the
classification requires high resolution and detailed results. However, deep
architectures significantly outperform single-layer variants, capturing
increasing levels of abstraction and complexity throughout the feature
hierarchy.
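The two sparsity constraints can be sketched as masking operations on an activation matrix (a simplified stand-in for the paper's learning algorithm; the top-k masking rule and all sizes are invented for illustration):

```python
import numpy as np

def enforce_sparsity(F, k_pop, k_life):
    """F: (samples x features) activation matrix. Population sparsity keeps
    only each sample's k_pop strongest features (per row); lifetime sparsity
    keeps each feature's k_life strongest activations across samples
    (per column). Both masks are applied jointly."""
    pop = np.zeros_like(F)
    rows = np.arange(F.shape[0])[:, None]
    pop[rows, np.argsort(F, axis=1)[:, -k_pop:]] = 1.0
    life = np.zeros_like(F)
    cols = np.arange(F.shape[1])[None, :]
    life[np.argsort(F, axis=0)[-k_life:, :], cols] = 1.0
    return F * pop * life

rng = np.random.default_rng(0)
F = rng.random((8, 5)) + 0.1      # strictly positive toy activations
S = enforce_sparsity(F, k_pop=2, k_life=3)
```

Population sparsity keeps each sample's code compact, while lifetime sparsity stops any single feature from firing on every sample; enforcing both at once is what the abstract highlights.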
Greedy Deep Dictionary Learning
In this work we propose a new deep learning tool called deep dictionary
learning. Multi-level dictionaries are learnt in a greedy fashion, one layer at
a time. This requires solving a simple (shallow) dictionary learning problem,
the solution to which is well known. We apply the proposed technique to several
benchmark deep learning datasets. We compare our results with other deep
learning tools like stacked autoencoders and deep belief networks, and with
state-of-the-art supervised dictionary learning tools like discriminative KSVD
and label-consistent KSVD. Our method yields better results than all of them.
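The greedy layer-by-layer scheme can be sketched as follows, with plain alternating least squares standing in for the shallow, sparsity-regularized dictionary learner the paper assumes (all sizes and the solver choice are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def shallow_dl(X, n_atoms, iters=50):
    """One shallow dictionary-learning layer, X ~ D @ Z, solved by plain
    alternating least squares (a simplified stand-in for a sparsity-
    regularized dictionary learner)."""
    D = rng.standard_normal((X.shape[0], n_atoms))
    for _ in range(iters):
        Z = np.linalg.lstsq(D, X, rcond=None)[0]          # codes given dictionary
        D = np.linalg.lstsq(Z.T, X.T, rcond=None)[0].T    # dictionary given codes
    return D, Z

X = rng.standard_normal((20, 100))       # 20-dim signals, 100 samples
D1, Z1 = shallow_dl(X, n_atoms=10)       # layer 1: learn from the data
D2, Z2 = shallow_dl(Z1, n_atoms=5)       # layer 2: learn from layer-1 codes

layer1_err = np.linalg.norm(D1 @ Z1 - X) / np.linalg.norm(X)
deep_err = np.linalg.norm(D1 @ D2 @ Z2 - X) / np.linalg.norm(X)
```

Each layer is fit in isolation on the previous layer's coefficients, which is exactly the greedy, one-layer-at-a-time structure the abstract describes.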
Learning Bayesian Network Structure from Massive Datasets: The "Sparse Candidate" Algorithm
Learning Bayesian networks is often cast as an optimization problem, where
the computational task is to find a structure that maximizes a statistically
motivated score. By and large, existing learning tools address this
optimization problem using standard heuristic search techniques. Since the
search space is extremely large, such search procedures can spend most of the
time examining candidates that are extremely unreasonable. This problem becomes
critical when we deal with data sets that are large either in the number of
instances, or the number of attributes. In this paper, we introduce an
algorithm that achieves faster learning by restricting the search space. This
iterative algorithm restricts the parents of each variable to belong to a small
subset of candidates. We then search for a network that satisfies these
constraints. The learned network is then used for selecting better candidates
for the next iteration. We evaluate this algorithm both on synthetic and
real-life data. Our results show that it is significantly faster than
alternative search procedures without loss of quality in the learned
structures.
Comment: Appears in Proceedings of the Fifteenth Conference on Uncertainty in
Artificial Intelligence (UAI 1999)
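The candidate-restriction step can be sketched as follows, using empirical mutual information to pick each variable's k candidate parents (the paper iterates this with a constrained structure search; the dependency measure, data, and variable names below are a simplified illustration):

```python
import numpy as np

def mutual_info(x, y):
    """Empirical mutual information between two binary variables (nats)."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            pxy = np.mean((x == a) & (y == b))
            px, py = np.mean(x == a), np.mean(y == b)
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

def sparse_candidates(data, k):
    """For each variable, keep only the k most informative other variables
    as candidate parents, shrinking the structure-search space."""
    n_vars = data.shape[1]
    cands = {}
    for i in range(n_vars):
        scores = sorted(((mutual_info(data[:, i], data[:, j]), j)
                         for j in range(n_vars) if j != i), reverse=True)
        cands[i] = sorted(j for _, j in scores[:k])
    return cands

rng = np.random.default_rng(0)
n = 2000
x0 = rng.integers(0, 2, n)
x1 = rng.integers(0, 2, n)
x2 = (x0 & x1) ^ (rng.random(n) < 0.1).astype(int)  # driven by x0 and x1, 10% noise
x3 = rng.integers(0, 2, n)                          # independent of everything
data = np.column_stack([x0, x1, x2, x3])
cands = sparse_candidates(data, k=2)
```

The subsequent search then only considers networks whose parent sets stay inside these candidate lists, and the learned network is used to pick better candidates on the next iteration.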