Solutions of max-plus linear equations and large deviations
We generalise the Gärtner-Ellis theorem of large deviations theory. Our
results allow us to derive large deviation type results in stochastic optimal
control from the convergence of generalised logarithmic moment generating
functions. They rely on the characterisation of the uniqueness of the solutions
of max-plus linear equations. We give an illustration for a simple investment
model, in which logarithmic moment generating functions represent
risk-sensitive values.
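For readers unfamiliar with the max-plus setting, the sketch below shows what
a max-plus linear equation looks like in code: the product A ⊗ x replaces
sum/times with max/plus, and x = (A ⊗ x) ⊕ b is solved by a fixed-point
iteration. The matrix, vector, and iteration are a toy illustration of ours,
not the paper's construction.

    import numpy as np

    def maxplus_matvec(A, x):
        """Max-plus product: (A ⊗ x)_i = max_j (A[i, j] + x[j])."""
        return (A + x[np.newaxis, :]).max(axis=1)

    # Toy max-plus linear equation x = (A ⊗ x) ⊕ b, i.e.
    # x_i = max(max_j(A[i, j] + x[j]), b[i]). All cycle weights of A are
    # negative here, so iterating from b converges to the least solution.
    A = np.array([[-1.0,  2.0],
                  [-3.0, -1.0]])
    b = np.array([0.0, 0.0])

    x = b.copy()
    for _ in range(100):
        x_new = np.maximum(maxplus_matvec(A, x), b)
        if np.allclose(x_new, x):
            break
        x = x_new
    print("least solution:", x)   # -> [2. 0.]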
Do optimization methods in deep learning applications matter?
With advances in deep learning, exponential data growth, and increasing model
complexity, developing efficient optimization methods is attracting much
research attention. Several implementations favor Conjugate Gradient (CG) and
Stochastic Gradient Descent (SGD) as practical and elegant solutions that
achieve quick convergence; however, these optimization processes also present
many limitations in learning across deep learning applications. Recent
research is exploring higher-order optimization functions as better
approaches, but these present very complex computational challenges for
practical use. Comparing first- and higher-order optimization functions, our
experiments in this paper reveal that Levenberg-Marquardt (LM) achieves
significantly better convergence but suffers from very long processing times,
increasing the training complexity of both classification and reinforcement
learning problems. Our experiments compare off-the-shelf optimization
functions (CG, SGD, LM, and L-BFGS) in standard CIFAR, MNIST, CartPole, and
FlappyBird experiments. The paper presents arguments on which optimization
functions to use and, further, which functions would benefit from
parallelization efforts to improve pretraining time and learning-rate
convergence.
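As a concrete point of comparison, here is a minimal PyTorch sketch
contrasting two of the optimizers named above, SGD and L-BFGS (both shipped
in torch.optim; LM has no stock PyTorch implementation). The toy model and
random data are stand-ins of ours, not the paper's experimental setup.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    x = torch.randn(256, 28 * 28)        # stand-in for flattened MNIST images
    y = torch.randint(0, 10, (256,))     # stand-in labels

    def train(opt_name, steps=25):
        model = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(),
                              nn.Linear(64, 10))
        loss_fn = nn.CrossEntropyLoss()
        if opt_name == "sgd":
            opt = torch.optim.SGD(model.parameters(), lr=0.1)
        else:
            opt = torch.optim.LBFGS(model.parameters(), lr=0.5, max_iter=5)

        for _ in range(steps):
            def closure():               # L-BFGS re-evaluates the loss itself
                opt.zero_grad()
                loss = loss_fn(model(x), y)
                loss.backward()
                return loss
            loss = opt.step(closure)     # every torch optimizer accepts a closure
        return float(loss)

    for name in ("sgd", "lbfgs"):
        print(name, train(name))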
Transport of Mars-Crossing Asteroids from the Quasi-Hilda Region
We employ set-oriented methods in combination with graph partitioning
algorithms to identify key dynamical regions in the Sun-Jupiter-particle
three-body system. Transport rates from a region near the 3:2 Hilda resonance
into the realm of orbits crossing Mars' orbit are computed. In contrast to
common numerical approaches, our technique does not depend on single
long-term simulations of the underlying model. Our statistical results are
therefore particularly reliable, since they are not affected by dynamical
behavior that is almost nonergodic (i.e., dominated by strongly almost
invariant sets).
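The set-oriented machinery the abstract refers to (a box discretization of
state space, a sampled transition matrix, and spectral detection of almost
invariant sets) can be sketched on a toy system. The 1D double-well diffusion
below is our stand-in for the Sun-Jupiter-particle dynamics; the box counts,
noise level, and thresholds are illustrative choices, not the paper's.

    import numpy as np

    rng = np.random.default_rng(0)
    n_boxes, samples_per_box, dt = 50, 200, 0.05
    edges = np.linspace(-2.0, 2.0, n_boxes + 1)

    def step(x):
        # One Euler-Maruyama step of dx = -V'(x) dt + noise for
        # V(x) = x^4 - 2x^2: two metastable wells at x = -1 and x = +1.
        drift = -(4 * x**3 - 4 * x) * dt
        return x + drift + np.sqrt(2 * 0.7 * dt) * rng.standard_normal(x.shape)

    # Ulam's method: estimate box-to-box transition probabilities by sampling
    # short trajectories, instead of one long-term simulation.
    P = np.zeros((n_boxes, n_boxes))
    for i in range(n_boxes):
        x0 = rng.uniform(edges[i], edges[i + 1], samples_per_box)
        x1 = np.clip(step(x0), edges[0], edges[-1] - 1e-9)
        np.add.at(P[i], np.digitize(x1, edges) - 1, 1.0 / samples_per_box)

    # Almost invariant sets show up in the sign structure of the second
    # eigenvector of P (eigenvalue 1 belongs to the invariant density).
    vals, vecs = np.linalg.eig(P)
    v2 = vecs[:, np.argsort(-vals.real)[1]].real
    A = v2 > 0                        # one almost invariant set; ~A is the other

    # Transport rate: probability of jumping from A into its complement in
    # one step, averaged uniformly over the boxes of A.
    rate = P[np.ix_(A, ~A)].sum(axis=1).mean()
    print("boxes in A:", A.sum(), " A -> complement rate per step:", rate)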
Counterfactual Risk Minimization: Learning from Logged Bandit Feedback
We develop a learning principle and an efficient algorithm for batch learning
from logged bandit feedback. This learning setting is ubiquitous in online
systems (e.g., ad placement, web search, recommendation), where an algorithm
makes a prediction (e.g., ad ranking) for a given input (e.g., query) and
observes bandit feedback (e.g., user clicks on presented ads). We first address
the counterfactual nature of the learning problem through propensity scoring.
Next, we prove generalization error bounds that account for the variance of the
propensity-weighted empirical risk estimator. These constructive bounds give
rise to the Counterfactual Risk Minimization (CRM) principle. We show how CRM
can be used to derive a new learning method -- called Policy Optimizer for
Exponential Models (POEM) -- for learning stochastic linear rules for
structured output prediction. We present a decomposition of the POEM objective
that enables efficient stochastic gradient optimization. POEM is evaluated on
several multi-label classification problems showing substantially improved
robustness and generalization performance compared to the state-of-the-art.
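The estimator at the heart of CRM is easy to state in code: a clipped
inverse-propensity reweighting of the logged losses, plus a penalty on the
variance of that estimate. The sketch below uses synthetic logs, and the clip
threshold M and regularization strength lambda are illustrative values of
ours, not the paper's settings.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 1000
    loss = rng.uniform(0.0, 1.0, n)    # logged losses delta_i (synthetic)
    prop = rng.uniform(0.05, 1.0, n)   # logging propensities p_i (synthetic)
    p_new = rng.uniform(0.0, 1.0, n)   # new policy's prob. of the logged action

    M = 10.0                           # clipping threshold for the weights
    u = loss * np.minimum(p_new / prop, M)  # per-example counterfactual losses
    ips_risk = u.mean()                # clipped inverse-propensity risk estimate

    lam = 0.5                          # variance-regularization strength
    crm_objective = ips_risk + lam * np.sqrt(u.var(ddof=1) / n)
    print("IPS risk:", ips_risk, " CRM objective:", crm_objective)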