Predictive-State Decoders: Encoding the Future into Recurrent Networks

Author: Venkatraman Arun
Rhinehart Nicholas
Sun Wen
Pinto Lerrel
Hebert Martial
Boots Byron
Kitani Kris M.
Bagnell J. Andrew
Publication date: 05/07/2017

Recurrent neural networks (RNNs) are a vital modeling technique that relies on internal states learned indirectly by optimization of a supervised, unsupervised, or reinforcement training loss. RNNs are used to model dynamic processes that are characterized by underlying latent states whose form is often unknown, precluding their analytic representation inside an RNN. In the Predictive-State Representation (PSR) literature, latent state processes are modeled by an internal state representation that directly models the distribution of future observations, and most recent work in this area has relied on explicitly representing and targeting sufficient statistics of this probability distribution. We seek to combine the advantages of RNNs and PSRs by augmenting existing state-of-the-art recurrent neural networks with Predictive-State Decoders (PSDs), which add supervision to the network's internal state representation so that it targets the prediction of future observations. Predictive-State Decoders are simple to implement and easily incorporated into existing training pipelines via additional loss regularization. We demonstrate the effectiveness of PSDs with experimental results in three different domains: probabilistic filtering, Imitation Learning, and Reinforcement Learning. In each, our method improves statistical performance of state-of-the-art recurrent baselines and does so with fewer iterations and less data. Comment: NIPS 201
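The abstract describes PSDs as an extra loss term that pushes the recurrent state toward predicting future observations. The following is a minimal sketch of that idea, not the authors' implementation: the linear decoder, the squared-error penalty, and every name (`future_window`, `linear_decode`, `psd_loss`, `total_loss`, `k`, `lam`) are assumptions made for illustration.

```python
# Hypothetical sketch of a predictive-state-decoder style auxiliary loss.
# Assumptions: a linear decoder, a squared-error penalty, and a window of
# k future observations. None of these details come from the abstract.

def future_window(observations, t, k):
    """Flatten the k observations that follow time t into one target vector."""
    window = observations[t + 1 : t + 1 + k]
    return [x for obs in window for x in obs]


def linear_decode(weights, state):
    """Map a hidden state to a predicted future window (matrix-vector product)."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def psd_loss(weights, states, observations, k):
    """Mean squared error between decoded states and the actual future windows."""
    total, count = 0.0, 0
    for t in range(len(states)):
        if t + k >= len(observations):
            break  # fewer than k future observations remain
        target = future_window(observations, t, k)
        pred = linear_decode(weights, states[t])
        total += sum((p - y) ** 2 for p, y in zip(pred, target))
        count += 1
    return total / count if count else 0.0


def total_loss(task_loss, weights, states, observations, k, lam):
    """Task loss plus the weighted auxiliary term (lam is an assumed weight)."""
    return task_loss + lam * psd_loss(weights, states, observations, k)
```

In a real training pipeline the hidden states would come from the recurrent network and the decoder would be trained jointly with it; here both are plain lists so the arithmetic of the regularizer is easy to follow.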

arXiv.org e-Print Archive

Dryad Digital Repository (Duke University)

FigShare

Do optimization methods in deep learning applications matter?

Author: Kiran Mariam
Ozyildirim Buse Melis
Publication venue: eScholarship, University of California
Publication date: 28/02/2020

With advances in deep learning, exponential data growth, and increasing model complexity, the development of efficient optimization methods is attracting much research attention. Several implementations favor Conjugate Gradient (CG) and Stochastic Gradient Descent (SGD) as practical and elegant solutions for achieving quick convergence; however, these optimization processes also have many limitations when learning across deep learning applications. Recent research is exploring higher-order optimization functions as better approaches, but these present very complex computational challenges for practical use. Comparing first- and higher-order optimization functions, our experiments in this paper reveal that Levenberg-Marquardt (LM) achieves significantly better convergence but suffers from very long processing times, which increases the training complexity of both classification and reinforcement learning problems. Our experiments compare off-the-shelf optimization functions (CG, SGD, LM, and L-BFGS) on standard CIFAR, MNIST, CartPole, and FlappyBird experiments. The paper presents arguments on which optimization functions to use and, further, which functions would benefit from parallelization efforts to improve pretraining time and learning rate convergence.
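To make the contrast between a first-order update and a Levenberg-Marquardt update concrete, here is a toy sketch on a one-parameter least-squares fit. It does not reproduce the paper's experiments: the model `exp(a * x)`, the damping schedule, and all function names are assumptions chosen for illustration.

```python
import math

# Toy problem (assumed for illustration): fit the scalar a in y = exp(a * x).

def residuals(a, xs, ys):
    return [math.exp(a * x) - y for x, y in zip(xs, ys)]


def jacobian(a, xs):
    # Derivative of each residual with respect to a.
    return [x * math.exp(a * x) for x in xs]


def sse(a, xs, ys):
    return sum(r * r for r in residuals(a, xs, ys))


def gradient_descent(a, xs, ys, lr, steps):
    """First-order update: step against the gradient of the squared error."""
    for _ in range(steps):
        r = residuals(a, xs, ys)
        j = jacobian(a, xs)
        grad = 2.0 * sum(ji * ri for ji, ri in zip(j, r))
        a -= lr * grad
    return a


def levenberg_marquardt(a, xs, ys, steps, mu=1.0):
    """Damped Gauss-Newton update with a simple halve/double rule for mu."""
    for _ in range(steps):
        r = residuals(a, xs, ys)
        j = jacobian(a, xs)
        jtj = sum(ji * ji for ji in j)
        jtr = sum(ji * ri for ji, ri in zip(j, r))
        candidate = a - jtr / (jtj + mu)
        if sse(candidate, xs, ys) < sse(a, xs, ys):
            a, mu = candidate, mu * 0.5  # accept: rely more on Gauss-Newton
        else:
            mu *= 2.0  # reject: increase damping, keep the current estimate
    return a
```

Each LM step uses curvature information (`jtj`) in addition to the gradient, which is where the extra per-step cost comes from; in this scalar case that is a single division, while for a network with many parameters it means solving a large linear system.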

arXiv.org e-Print Archive

eScholarship - University of California

Show, Attend and Interact: Perceivable Human-Robot Social Interaction through Neural Attention Q-Network

Author: Ishiguro Hiroshi
Nakamura Yutaka
Qureshi Ahmed Hussain
Yoshikawa Yuichiro
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 27/02/2017

For safe, natural, and effective human-robot social interaction, it is essential to develop a system that allows a robot to demonstrate perceivable, responsive behaviors in reaction to complex human behaviors. We introduce the Multimodal Deep Attention Recurrent Q-Network (MDARQN), with which the robot exhibits human-like social interaction skills after 14 days of interacting with people in an uncontrolled real-world environment. On each of the 14 days, the system gathered robot interaction experiences with people through a hit-and-trial method and then trained the MDARQN on these experiences using an end-to-end reinforcement learning approach. The results of this interaction-based learning indicate that the robot has learned to respond to complex human behaviors in a perceivable and socially acceptable manner. Comment: 7 pages, 5 figures, accepted by IEEE-RAS ICRA'1

arXiv.org e-Print Archive

Crossref