Solutions of max-plus linear equations and large deviations
We generalise the Gärtner-Ellis theorem of large deviations theory. Our
results allow us to derive large deviation type results in stochastic optimal
control from the convergence of generalised logarithmic moment generating
functions. They rely on the characterisation of the uniqueness of the solutions
of max-plus linear equations. We give an illustration for a simple investment
model, in which logarithmic moment generating functions represent
risk-sensitive values.
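For readers unfamiliar with the max-plus setting, the sketch below shows what
a max-plus linear equation looks like in code: the product A ⊗ x replaces
sum/times with max/plus, and x = (A ⊗ x) ⊕ b is solved by a fixed-point
iteration. The matrix, vector, and iteration are a toy illustration of ours,
not the paper's construction.

    import numpy as np

    def maxplus_matvec(A, x):
        """Max-plus product: (A ⊗ x)_i = max_j (A[i, j] + x[j])."""
        return (A + x[np.newaxis, :]).max(axis=1)

    # Toy max-plus linear equation x = (A ⊗ x) ⊕ b, i.e.
    # x_i = max(max_j(A[i, j] + x[j]), b[i]). All cycle weights of A are
    # negative here, so iterating from b converges to the least solution.
    A = np.array([[-1.0,  2.0],
                  [-3.0, -1.0]])
    b = np.array([0.0, 0.0])

    x = b.copy()
    for _ in range(100):
        x_new = np.maximum(maxplus_matvec(A, x), b)
        if np.allclose(x_new, x):
            break
        x = x_new
    print("least solution:", x)   # -> [2. 0.]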
Do optimization methods in deep learning applications matter?
With advances in deep learning, exponential data growth, and increasing model
complexity, developing efficient optimization methods is attracting much
research attention. Several implementations favor Conjugate Gradient (CG) and
Stochastic Gradient Descent (SGD) as practical and elegant solutions that
achieve quick convergence; however, these optimization processes also present
many limitations in learning across deep learning applications. Recent
research is exploring higher-order optimization functions as better
approaches, but these present very complex computational challenges for
practical use. Comparing first- and higher-order optimization functions, our
experiments in this paper reveal that Levenberg-Marquardt (LM) achieves
significantly better convergence but suffers from very long processing times,
increasing the training complexity of both classification and reinforcement
learning problems. Our experiments compare off-the-shelf optimization
functions (CG, SGD, LM, and L-BFGS) in standard CIFAR, MNIST, CartPole, and
FlappyBird experiments. The paper presents arguments on which optimization
functions to use and, further, which functions would benefit from
parallelization efforts to improve pretraining time and learning-rate
convergence.
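As a concrete point of comparison, here is a minimal PyTorch sketch
contrasting two of the optimizers named above, SGD and L-BFGS (both shipped
in torch.optim; LM has no stock PyTorch implementation). The toy model and
random data are stand-ins of ours, not the paper's experimental setup.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    x = torch.randn(256, 28 * 28)        # stand-in for flattened MNIST images
    y = torch.randint(0, 10, (256,))     # stand-in labels

    def train(opt_name, steps=25):
        model = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(),
                              nn.Linear(64, 10))
        loss_fn = nn.CrossEntropyLoss()
        if opt_name == "sgd":
            opt = torch.optim.SGD(model.parameters(), lr=0.1)
        else:
            opt = torch.optim.LBFGS(model.parameters(), lr=0.5, max_iter=5)

        for _ in range(steps):
            def closure():               # L-BFGS re-evaluates the loss itself
                opt.zero_grad()
                loss = loss_fn(model(x), y)
                loss.backward()
                return loss
            loss = opt.step(closure)     # every torch optimizer accepts a closure
        return float(loss)

    for name in ("sgd", "lbfgs"):
        print(name, train(name))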
Transport of Mars-Crossing Asteroids from the Quasi-Hilda Region
We employ set-oriented methods in combination with graph partitioning
algorithms to identify key dynamical regions in the Sun-Jupiter-particle
three-body system. Transport rates from a region near the 3:2 Hilda resonance
into the realm of orbits crossing Mars' orbit are computed. In contrast to
common numerical approaches, our technique does not depend on single
long-term simulations of the underlying model. Our statistical results are
therefore particularly reliable, since they are not affected by dynamical
behavior that is almost nonergodic (i.e., dominated by strongly almost
invariant sets).
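The set-oriented machinery the abstract refers to (a box discretization of
state space, a sampled transition matrix, and spectral detection of almost
invariant sets) can be sketched on a toy system. The 1D double-well diffusion
below is our stand-in for the Sun-Jupiter-particle dynamics; the box counts,
noise level, and thresholds are illustrative choices, not the paper's.

    import numpy as np

    rng = np.random.default_rng(0)
    n_boxes, samples_per_box, dt = 50, 200, 0.05
    edges = np.linspace(-2.0, 2.0, n_boxes + 1)

    def step(x):
        # One Euler-Maruyama step of dx = -V'(x) dt + noise for
        # V(x) = x^4 - 2x^2: two metastable wells at x = -1 and x = +1.
        drift = -(4 * x**3 - 4 * x) * dt
        return x + drift + np.sqrt(2 * 0.7 * dt) * rng.standard_normal(x.shape)

    # Ulam's method: estimate box-to-box transition probabilities by sampling
    # short trajectories, instead of one long-term simulation.
    P = np.zeros((n_boxes, n_boxes))
    for i in range(n_boxes):
        x0 = rng.uniform(edges[i], edges[i + 1], samples_per_box)
        x1 = np.clip(step(x0), edges[0], edges[-1] - 1e-9)
        np.add.at(P[i], np.digitize(x1, edges) - 1, 1.0 / samples_per_box)

    # Almost invariant sets show up in the sign structure of the second
    # eigenvector of P (eigenvalue 1 belongs to the invariant density).
    vals, vecs = np.linalg.eig(P)
    v2 = vecs[:, np.argsort(-vals.real)[1]].real
    A = v2 > 0                        # one almost invariant set; ~A is the other

    # Transport rate: probability of jumping from A into its complement in
    # one step, averaged uniformly over the boxes of A.
    rate = P[np.ix_(A, ~A)].sum(axis=1).mean()
    print("boxes in A:", A.sum(), " A -> complement rate per step:", rate)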
Counterfactual Risk Minimization: Learning from Logged Bandit Feedback
We develop a learning principle and an efficient algorithm for batch learning
from logged bandit feedback. This learning setting is ubiquitous in online
systems (e.g., ad placement, web search, recommendation), where an algorithm
makes a prediction (e.g., ad ranking) for a given input (e.g., query) and
observes bandit feedback (e.g., user clicks on presented ads). We first address
the counterfactual nature of the learning problem through propensity scoring.
Next, we prove generalization error bounds that account for the variance of the
propensity-weighted empirical risk estimator. These constructive bounds give
rise to the Counterfactual Risk Minimization (CRM) principle. We show how CRM
can be used to derive a new learning method -- called Policy Optimizer for
Exponential Models (POEM) -- for learning stochastic linear rules for
structured output prediction. We present a decomposition of the POEM objective
that enables efficient stochastic gradient optimization. POEM is evaluated on
several multi-label classification problems showing substantially improved
robustness and generalization performance compared to the state-of-the-art.
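The estimator at the heart of CRM is easy to state in code: a clipped
inverse-propensity reweighting of the logged losses, plus a penalty on the
variance of that estimate. The sketch below uses synthetic logs, and the clip
threshold M and regularization strength lambda are illustrative values of
ours, not the paper's settings.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 1000
    loss = rng.uniform(0.0, 1.0, n)    # logged losses delta_i (synthetic)
    prop = rng.uniform(0.05, 1.0, n)   # logging propensities p_i (synthetic)
    p_new = rng.uniform(0.0, 1.0, n)   # new policy's prob. of the logged action

    M = 10.0                           # clipping threshold for the weights
    u = loss * np.minimum(p_new / prop, M)  # per-example counterfactual losses
    ips_risk = u.mean()                # clipped inverse-propensity risk estimate

    lam = 0.5                          # variance-regularization strength
    crm_objective = ips_risk + lam * np.sqrt(u.var(ddof=1) / n)
    print("IPS risk:", ips_risk, " CRM objective:", crm_objective)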