Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay
This paper introduces a novel method for learning how to play the most
difficult Atari 2600 games from the Arcade Learning Environment using deep
reinforcement learning. The proposed method, human checkpoint replay, consists
of using checkpoints sampled from human gameplay as starting points for the
learning process. This is meant to compensate for the difficulty that current
exploration strategies, such as epsilon-greedy, have in finding successful control
policies in games with sparse rewards. Like other deep reinforcement learning
architectures, our model uses a convolutional neural network that receives only
raw pixel inputs to estimate the state value function. We tested our method on
Montezuma's Revenge and Private Eye, two of the most challenging games from the
Atari platform. The results we obtained show a substantial improvement compared
to previous learning approaches, as well as over a random player. We also
propose a method for training deep reinforcement learning agents using human
gameplay experience, which we call human experience replay.
Comment: 6 pages, 2 figures, EGPAI 2016 - Evaluating General Purpose AI,
workshop held in conjunction with ECAI 2016
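As a hedged illustration of the checkpoint-replay idea (not the paper's Atari setup: the chain environment, the checkpoint pool, and all names below are invented for this sketch), the toy below starts tabular Q-learning episodes from states sampled from "human checkpoints" rather than from the initial state, in a sparse-reward chain where only the final state pays off:

```python
import random

N = 50  # chain of states 0..N-1; reward only on reaching state N-1

def run_episode(q, start_state, eps=0.5, max_steps=100):
    """One epsilon-greedy tabular Q-learning episode; returns True on success."""
    s = start_state
    for _ in range(max_steps):
        if random.random() < eps:
            a = random.randrange(2)                      # explore
        else:
            a = max((0, 1), key=lambda x: q[(s, x)])     # exploit
        s2 = min(s + 1, N - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == N - 1 else 0.0
        q[(s, a)] += 0.5 * (r + 0.9 * max(q[(s2, 0)], q[(s2, 1)]) - q[(s, a)])
        s = s2
        if r > 0:
            return True
    return False

def train(starts, episodes=200, seed=0):
    """Count successful episodes when starting from the given start states."""
    random.seed(seed)
    q = {(s, a): 0.0 for s in range(N) for a in (0, 1)}
    return sum(run_episode(q, random.choice(starts)) for _ in range(episodes))

# Checkpoints sampled from hypothetical human gameplay vs. always from scratch.
wins_checkpoint = train([0, 10, 25, 40, 48])
wins_scratch = train([0])
```

With the sparse reward at the far end of the chain, starts near the goal let the agent stumble onto the reward and bootstrap values, while scratch starts essentially never reach it, which mirrors the motivation the abstract gives.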
Estimate Exchange over Network is Good for Distributed Hard Thresholding Pursuit
We investigate an existing distributed algorithm for learning sparse signals
or data over networks. The algorithm is iterative and exchanges intermediate
estimates of a sparse signal over a network. This learning strategy using
exchange of intermediate estimates over the network requires a limited
communication overhead for information transmission. Our objective in this
article is to show that the strategy is good for learning in spite of limited
communication. In pursuit of this objective, we first provide a restricted
isometry property (RIP)-based theoretical analysis on convergence of the
iterative algorithm. Then, using simulations, we show that the algorithm
provides competitive performance in learning sparse signals vis-a-vis an
existing alternative distributed algorithm, one that exchanges more
information, including observations and system parameters.
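A minimal sketch of the strategy described above, assuming a fully connected network and exchanging only intermediate estimates (all dimensions, the averaging rule, and the step-size choice are illustrative, not the paper's exact algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k, nodes = 100, 40, 5, 4   # signal dim, measurements per node, sparsity, nodes

# Every node observes the same k-sparse signal through its own sensing matrix.
x_true = np.zeros(n)
support = rng.choice(n, k, replace=False)
x_true[support] = rng.standard_normal(k)
A = [rng.standard_normal((m, n)) / np.sqrt(m) for _ in range(nodes)]
y = [Ai @ x_true for Ai in A]

B = sum(Ai.T @ Ai for Ai in A) / nodes    # network-wide Gram matrix
mu = 1.0 / np.linalg.norm(B, 2)           # conservative step size for the sketch

def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-k:]
    out[keep] = v[keep]
    return out

x_hat = [np.zeros(n) for _ in range(nodes)]
for _ in range(100):
    # Local gradient step on each node's own measurements...
    local = [x_hat[i] + mu * A[i].T @ (y[i] - A[i] @ x_hat[i]) for i in range(nodes)]
    # ...then exchange and average the intermediate estimates over the network.
    avg = sum(local) / nodes
    x_hat = [hard_threshold(avg, k) for _ in range(nodes)]

rel_err = np.linalg.norm(x_hat[0] - x_true) / np.linalg.norm(x_true)
```

Only the length-n estimate vectors cross the network each iteration, which is the limited communication overhead the abstract refers to; observations and sensing matrices stay local.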
Computationally Efficient Deep Neural Network for Computed Tomography Image Reconstruction
Deep-neural-network-based image reconstruction has demonstrated promising
performance in medical imaging for under-sampled and low-dose scenarios.
However, it requires a large amount of memory and extensive time for
training. It is especially challenging to train the reconstruction networks for
three-dimensional computed tomography (CT) because of the high resolution of CT
images. The purpose of this work is to reduce the memory and time consumption
of the training of the reconstruction networks for CT to make it practical for
current hardware, while maintaining the quality of the reconstructed images.
We unrolled the proximal gradient descent algorithm for iterative image
reconstruction to finite iterations and replaced the terms related to the
penalty function with trainable convolutional neural networks (CNNs). The
network was trained greedily iteration by iteration in the image-domain on
patches, which requires a reasonable amount of memory and time on a mainstream
graphics processing unit (GPU). To overcome the local-minimum problem caused by
greedy learning, we used a deep U-Net as the CNN and incorporated the separable
quadratic surrogate with ordered subsets for data fidelity, so that the
solution could escape from poor local minima and achieve better image
quality.
The proposed method achieved image quality comparable to state-of-the-art
neural networks for CT image reconstruction on 2D sparse-view and limited-angle
problems on the low-dose CT challenge dataset.
Comment: 33 pages, 14 figures, accepted by Medical Physics
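The unrolling described above can be sketched in miniature. Below, a soft-threshold with per-iteration weights stands in for the trainable CNN that replaces the penalty term's proximal operator, and a small random matrix stands in for the CT system matrix; everything here is an invented toy, not the paper's trained network:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 64, 32
A = rng.standard_normal((m, n)) / np.sqrt(m)   # stand-in for the CT system matrix
x_true = np.zeros(n)
x_true[rng.choice(n, 6, replace=False)] = 1.0  # toy "image" with sparse structure
y = A @ x_true                                 # simulated measurements

L = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of the data term
alpha = 1.0 / L

def learned_prox(v, theta):
    """Stand-in for the trainable CNN replacing the penalty's proximal
    operator; here a soft-threshold whose weight would normally be learned."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

thetas = [0.05] * 10 + [0.02] * 10 + [0.005] * 30  # finite, unrolled iterations
x = np.zeros(n)
for theta in thetas:
    # One unrolled iteration: gradient step on the data term, then the "CNN".
    x = learned_prox(x - alpha * A.T @ (A @ x - y), theta)

rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

Training each iteration's parameters greedily, one at a time and on image patches, is what keeps the memory footprint small in the paper's 3D CT setting.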
Differentiable Greedy Networks
Optimal selection of a subset of items from a given set is a hard problem
that requires combinatorial optimization. In this paper, we propose a subset
selection algorithm that is trainable with gradient-based methods yet achieves
near-optimal performance via submodular optimization. We focus on the task of
identifying a relevant set of sentences for claim verification in the context
of the FEVER task. Conventional methods for this task look at sentences on
their individual merit and thus do not optimize the informativeness of
sentences as a set. We show that our proposed method, which builds on the idea
of unfolding a greedy algorithm into a computational graph, allows both
interpretability and gradient-based training. The proposed differentiable
greedy network (DGN) outperforms discrete optimization algorithms as well as
other baseline methods in terms of precision and recall.
Comment: Work in progress and under review
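The unfolding idea can be sketched as follows: each greedy step's discrete argmax over marginal gains is replaced by a softmax, so the k-step selection becomes a differentiable computation graph. The toy coverage objective and all names below are invented for illustration, not the paper's FEVER model:

```python
import numpy as np

def softmax(z, temp):
    z = z / temp
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def soft_greedy(cover, k, temp=0.05):
    """Unfolded greedy subset selection: each of k steps replaces the
    discrete argmax over marginal gains with a softmax, keeping the whole
    selection differentiable end to end."""
    n_items, n_elems = cover.shape
    c = np.zeros(n_elems)              # soft coverage accumulated so far
    sel = np.zeros(n_items)            # soft selection weights
    for _ in range(k):
        gains = cover @ (1.0 - c)      # marginal coverage gain per item
        p = softmax(gains, temp)       # differentiable surrogate for argmax
        sel += p
        c = 1.0 - (1.0 - c) * (1.0 - p @ cover)  # soft union, stays in [0, 1]
    return sel, c

rng = np.random.default_rng(0)
cover = (rng.random((6, 10)) < 0.4).astype(float)  # items x elements incidence
sel, c = soft_greedy(cover, k=3)
```

Lowering the temperature pushes each softmax toward a hard argmax, recovering the discrete greedy algorithm while the relaxed version remains trainable by gradient descent.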
Learning for Active 3D Mapping
We propose an active 3D mapping method for depth sensors, which allow
individual control of depth-measuring rays, such as the newly emerging
solid-state lidars. The method simultaneously (i) learns to reconstruct a dense
3D occupancy map from sparse depth measurements, and (ii) optimizes the
reactive control of depth-measuring rays. As a first step towards
online control optimization, we propose a fast prioritized greedy algorithm,
which needs to update its cost function for only a small fraction of possible
rays. The approximation ratio of the greedy algorithm is derived. An
experimental evaluation on a subset of the KITTI dataset demonstrates a
significant improvement in 3D map accuracy when learning-to-reconstruct
from sparse measurements is coupled with the optimization of depth-measuring
rays.
Comment: ICCV 2017 (oral). See video:
https://www.youtube.com/watch?v=KNex0zjeGY
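A prioritized greedy scheme of the kind the abstract mentions can be sketched with lazy evaluation: because coverage-style gains only shrink as rays are added, the algorithm re-evaluates only rays popped from the top of a priority queue. The voxel sets below are invented toy data, not KITTI:

```python
import heapq

def lazy_greedy(rays, gain, budget):
    """Budgeted greedy selection with lazy (prioritized) evaluation: a ray's
    marginal gain can only shrink as the selection grows, so only rays popped
    from the top of the heap are re-evaluated -- a small fraction of all rays."""
    heap = [(-gain(r, set()), r) for r in rays]
    heapq.heapify(heap)
    chosen = set()
    while heap and len(chosen) < budget:
        neg_g, r = heapq.heappop(heap)
        g = gain(r, chosen)                  # refresh the possibly stale gain
        if not heap or g >= -heap[0][0]:
            chosen.add(r)                    # still the best ray: select it
        else:
            heapq.heappush(heap, (-g, r))    # otherwise re-queue with fresh gain
    return chosen

# Toy model: each depth-measuring ray observes a set of map voxels.
voxels_seen = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 5, 6, 7}}

def coverage_gain(r, chosen):
    seen = set().union(*(voxels_seen[c] for c in chosen)) if chosen else set()
    return len(voxels_seen[r] - seen)

picked = lazy_greedy(list(voxels_seen), coverage_gain, budget=2)
```

For monotone submodular gains, this lazy variant returns exactly what naive greedy would, while skipping most gain re-computations, which is what makes per-frame ray control tractable.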
Parameter Space Noise for Exploration
Deep reinforcement learning (RL) methods generally engage in exploratory
behavior through noise injection in the action space. An alternative is to add
noise directly to the agent's parameters, which can lead to more consistent
exploration and a richer set of behaviors. Methods such as evolutionary
strategies use parameter perturbations, but discard all temporal structure in
the process and require significantly more samples. Combining parameter noise
with traditional RL methods allows us to combine the best of both worlds. We
demonstrate that both off- and on-policy methods benefit from this approach
through experimental comparison of DQN, DDPG, and TRPO on high-dimensional
discrete action environments as well as continuous control tasks. Our results
show that RL with parameter noise learns more efficiently than traditional RL
with action space noise and evolutionary strategies individually.Comment: Updated to camera-ready ICLR submissio
Majorization Minimization Technique for Optimally Solving Deep Dictionary Learning
The concept of deep dictionary learning has been recently proposed. Unlike
shallow dictionary learning, which learns a single level of dictionary to
represent the data, it uses multiple layers of dictionaries. So far, the
problem could only be solved in a greedy fashion; this was achieved by learning
a single dictionary layer at each stage, where the coefficients from the
previous layer acted as inputs to the subsequent layer (only the first layer
used the training samples as inputs). This was suboptimal: information flowed
from shallower to deeper layers but not the other way. This work proposes an
optimal solution to deep dictionary learning whereby all the layers of
dictionaries are solved simultaneously. We employ the Majorization Minimization
approach. Experiments carried out on benchmark datasets show
that optimal learning indeed improves over greedy piecemeal learning.
Comparisons with other unsupervised deep learning tools (stacked denoising
autoencoder, deep belief network, contractive autoencoder, and K-sparse
autoencoder) show that our method surpasses them in both accuracy
and speed.
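The majorization-minimization principle the abstract relies on can be shown on a minimal problem (a plain least-squares objective, not the paper's multi-layer dictionary model): each iteration minimizes a quadratic upper bound that touches the objective at the current point, which guarantees a monotone decrease.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
y = rng.standard_normal(20)
L = np.linalg.norm(A, 2) ** 2      # curvature bound: L*I dominates A^T A

def mm_step(x):
    """Minimize the quadratic majorizer
        g(x' | x) = f(x) + <grad f(x), x' - x> + (L/2) ||x' - x||^2,
    which upper-bounds f(x') = 0.5 ||A x' - y||^2 and touches it at x;
    its minimizer is a gradient step with step size 1/L."""
    return x - (A.T @ (A @ x - y)) / L

x = np.zeros(10)
losses = []
for _ in range(100):
    losses.append(0.5 * np.linalg.norm(A @ x - y) ** 2)
    x = mm_step(x)
```

The same descent guarantee is what lets MM handle the coupled multi-layer objective all at once instead of layer by layer.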
Unsupervised Deep Feature Extraction for Remote Sensing Image Classification
This paper introduces the use of single layer and deep convolutional networks
for remote sensing data analysis. Direct application to multi- and
hyper-spectral imagery of supervised (shallow or deep) convolutional networks
is very challenging given the high input data dimensionality and the relatively
small amount of available labeled data. Therefore, we propose the use of greedy
layer-wise unsupervised pre-training coupled with a highly efficient algorithm
for unsupervised learning of sparse features. The algorithm is rooted in sparse
representations and simultaneously enforces both population and lifetime sparsity
of the extracted features. We successfully illustrate the expressive
power of the extracted representations in several scenarios: classification of
aerial scenes, land-use classification in very high resolution (VHR) imagery,
and land-cover classification from multi- and hyper-spectral images. The
proposed algorithm clearly outperforms standard Principal Component Analysis
(PCA) and its kernel counterpart (kPCA), as well as current state-of-the-art
algorithms for aerial classification, while being extremely computationally
efficient at learning representations of data. Results show that single-layer
convolutional networks can extract powerful discriminative features only when
the receptive field accounts for neighboring pixels, and are preferred when the
classification requires high resolution and detailed results. However, deep
architectures significantly outperform single-layer variants, capturing
increasing levels of abstraction and complexity throughout the feature
hierarchy.
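The two sparsity constraints can be sketched as masking operations on an activation matrix (a simplified stand-in for the paper's learning algorithm; the top-k masking rule and all sizes are invented for illustration):

```python
import numpy as np

def enforce_sparsity(F, k_pop, k_life):
    """F: (samples x features) activation matrix. Population sparsity keeps
    only each sample's k_pop strongest features (per row); lifetime sparsity
    keeps each feature's k_life strongest activations across samples
    (per column). Both masks are applied jointly."""
    pop = np.zeros_like(F)
    rows = np.arange(F.shape[0])[:, None]
    pop[rows, np.argsort(F, axis=1)[:, -k_pop:]] = 1.0
    life = np.zeros_like(F)
    cols = np.arange(F.shape[1])[None, :]
    life[np.argsort(F, axis=0)[-k_life:, :], cols] = 1.0
    return F * pop * life

rng = np.random.default_rng(0)
F = rng.random((8, 5)) + 0.1      # strictly positive toy activations
S = enforce_sparsity(F, k_pop=2, k_life=3)
```

Population sparsity keeps each sample's code compact, while lifetime sparsity stops any single feature from firing on every sample; enforcing both at once is what the abstract highlights.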
Greedy Deep Dictionary Learning
In this work we propose a new deep learning tool called deep dictionary
learning. Multi-level dictionaries are learnt in a greedy fashion, one layer at
a time. This requires solving a simple (shallow) dictionary learning problem,
the solution to which is well known. We apply the proposed technique to several
benchmark deep learning datasets. We compare our results with other deep
learning tools like stacked autoencoders and deep belief networks, and with
state-of-the-art supervised dictionary learning tools like discriminative KSVD
and label-consistent KSVD. Our method yields better results than all of them.
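The greedy layer-by-layer scheme can be sketched as follows, with plain alternating least squares standing in for the shallow, sparsity-regularized dictionary learner the paper assumes (all sizes and the solver choice are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def shallow_dl(X, n_atoms, iters=50):
    """One shallow dictionary-learning layer, X ~ D @ Z, solved by plain
    alternating least squares (a simplified stand-in for a sparsity-
    regularized dictionary learner)."""
    D = rng.standard_normal((X.shape[0], n_atoms))
    for _ in range(iters):
        Z = np.linalg.lstsq(D, X, rcond=None)[0]          # codes given dictionary
        D = np.linalg.lstsq(Z.T, X.T, rcond=None)[0].T    # dictionary given codes
    return D, Z

X = rng.standard_normal((20, 100))       # 20-dim signals, 100 samples
D1, Z1 = shallow_dl(X, n_atoms=10)       # layer 1: learn from the data
D2, Z2 = shallow_dl(Z1, n_atoms=5)       # layer 2: learn from layer-1 codes

layer1_err = np.linalg.norm(D1 @ Z1 - X) / np.linalg.norm(X)
deep_err = np.linalg.norm(D1 @ D2 @ Z2 - X) / np.linalg.norm(X)
```

Each layer is fit in isolation on the previous layer's coefficients, which is exactly the greedy, one-layer-at-a-time structure the abstract describes.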
Learning Bayesian Network Structure from Massive Datasets: The "Sparse Candidate" Algorithm
Learning Bayesian networks is often cast as an optimization problem, where
the computational task is to find a structure that maximizes a statistically
motivated score. By and large, existing learning tools address this
optimization problem using standard heuristic search techniques. Since the
search space is extremely large, such search procedures can spend most of the
time examining candidates that are extremely unreasonable. This problem becomes
critical when we deal with data sets that are large either in the number of
instances, or the number of attributes. In this paper, we introduce an
algorithm that achieves faster learning by restricting the search space. This
iterative algorithm restricts the parents of each variable to belong to a small
subset of candidates. We then search for a network that satisfies these
constraints. The learned network is then used for selecting better candidates
for the next iteration. We evaluate this algorithm both on synthetic and
real-life data. Our results show that it is significantly faster than
alternative search procedures without loss of quality in the learned
structures.
Comment: Appears in Proceedings of the Fifteenth Conference on Uncertainty in
Artificial Intelligence (UAI 1999)
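The candidate-restriction step can be sketched as follows, using empirical mutual information to pick each variable's k candidate parents (the paper iterates this with a constrained structure search; the dependency measure, data, and variable names below are a simplified illustration):

```python
import numpy as np

def mutual_info(x, y):
    """Empirical mutual information between two binary variables (nats)."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            pxy = np.mean((x == a) & (y == b))
            px, py = np.mean(x == a), np.mean(y == b)
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

def sparse_candidates(data, k):
    """For each variable, keep only the k most informative other variables
    as candidate parents, shrinking the structure-search space."""
    n_vars = data.shape[1]
    cands = {}
    for i in range(n_vars):
        scores = sorted(((mutual_info(data[:, i], data[:, j]), j)
                         for j in range(n_vars) if j != i), reverse=True)
        cands[i] = sorted(j for _, j in scores[:k])
    return cands

rng = np.random.default_rng(0)
n = 2000
x0 = rng.integers(0, 2, n)
x1 = rng.integers(0, 2, n)
x2 = (x0 & x1) ^ (rng.random(n) < 0.1).astype(int)  # driven by x0 and x1, 10% noise
x3 = rng.integers(0, 2, n)                          # independent of everything
data = np.column_stack([x0, x1, x2, x3])
cands = sparse_candidates(data, k=2)
```

The subsequent search then only considers networks whose parent sets stay inside these candidate lists, and the learned network is used to pick better candidates on the next iteration.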