
    Solutions of max-plus linear equations and large deviations

    We generalise the Gärtner-Ellis theorem of large deviations theory. Our results allow us to derive large-deviation-type results in stochastic optimal control from the convergence of generalised logarithmic moment generating functions. They rely on the characterisation of the uniqueness of the solutions of max-plus linear equations. We give an illustration for a simple investment model, in which logarithmic moment generating functions represent risk-sensitive values.
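    To make the setting concrete, here is a minimal sketch of the classical Gärtner-Ellis statement and of the max-plus structure the abstract refers to; the symbols Λ, I, A, u and b are standard conventions assumed for illustration, not notation taken from the paper:

        \Lambda(\theta) \;=\; \lim_{n\to\infty} \frac{1}{n}\,\log \mathbb{E}\!\left[ e^{\,n\theta X_n} \right],
        \qquad
        I(x) \;=\; \sup_{\theta}\,\bigl\{ \theta x - \Lambda(\theta) \bigr\},

    so the rate function I is the Legendre-Fenchel transform of the limiting logarithmic moment generating function Λ. Max-plus linear equations, whose uniqueness of solution the result relies on, have the generic form

        u \;=\; A \otimes u \,\oplus\, b,
        \qquad\text{i.e.}\qquad
        u_i \;=\; \max\!\Bigl( \max_j\,(A_{ij} + u_j),\; b_i \Bigr).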

    Do optimization methods in deep learning applications matter?

    With advances in deep learning, exponential data growth and increasing model complexity, developing efficient optimization methods is attracting much research attention. Several implementations favor Conjugate Gradient (CG) and Stochastic Gradient Descent (SGD) as practical and elegant ways to achieve quick convergence; however, these optimization processes also present many limitations when learning across deep learning applications. Recent research is exploring higher-order optimization functions as better approaches, but these present very complex computational challenges for practical use. Comparing first- and higher-order optimization functions, our experiments in this paper reveal that Levenberg-Marquardt (LM) converges substantially faster but suffers from very large processing times, increasing the training cost of both classification and reinforcement learning problems. Our experiments compare off-the-shelf optimization functions (CG, SGD, LM and L-BFGS) on standard CIFAR, MNIST, CartPole and FlappyBird benchmarks. The paper presents arguments on which optimization functions to use and, further, which would benefit from parallelization efforts to improve pretraining time and learning-rate convergence.
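    As a rough illustration of the kind of comparison described, the sketch below trains the same small network with SGD and with L-BFGS in PyTorch on synthetic data; the model, data and hyperparameters are assumptions for illustration, not the paper's CIFAR/MNIST/CartPole/FlappyBird setup.

        # Minimal sketch: first-order (SGD) vs quasi-Newton (L-BFGS) optimization
        # on a synthetic classification task. Illustrative assumptions throughout.
        import torch
        import torch.nn as nn

        torch.manual_seed(0)
        X = torch.randn(512, 20)                      # synthetic inputs
        y = (X[:, 0] + 0.5 * X[:, 1] > 0).long()      # synthetic binary labels

        def make_model():
            return nn.Sequential(nn.Linear(20, 32), nn.Tanh(), nn.Linear(32, 2))

        def train_sgd(steps=200):
            model, loss_fn = make_model(), nn.CrossEntropyLoss()
            opt = torch.optim.SGD(model.parameters(), lr=0.1)
            for _ in range(steps):
                opt.zero_grad()
                loss = loss_fn(model(X), y)
                loss.backward()
                opt.step()
            return loss.item()

        def train_lbfgs(steps=20):
            model, loss_fn = make_model(), nn.CrossEntropyLoss()
            opt = torch.optim.LBFGS(model.parameters(), lr=0.5, max_iter=10)
            def closure():
                opt.zero_grad()
                loss = loss_fn(model(X), y)
                loss.backward()
                return loss
            for _ in range(steps):
                loss = opt.step(closure)
            return loss.item()

        print("final training loss, SGD   :", train_sgd())
        print("final training loss, L-BFGS:", train_lbfgs())

    In this toy setting the quasi-Newton method typically needs far fewer updates, but each update is much more expensive, which mirrors the first-order versus higher-order trade-off the abstract discusses.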

    Transport of Mars-Crossing Asteroids from the Quasi-Hilda Region

    We employ set oriented methods in combination with graph partitioning algorithms to identify key dynamical regions in the Sun-Jupiter-particle three-body system. Transport rates from a region near the 3:2 Hilda resonance into the realm of orbits crossing Mars' orbit are computed. In contrast to common numerical approaches, our technique does not depend on single long term simulations of the underlying model. Thus, our statistical results are particularly reliable since they are not affected by a dynamical behavior which is almost nonergodic (i.e., dominated by strongly almost invariant sets)
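    A minimal sketch of the set-oriented (Ulam-type) idea behind this approach is given below, applied to a toy area-preserving map on the unit torus rather than to the Sun-Jupiter-particle system; the map, grid resolution, regions and sample counts are assumptions for illustration only.

        # Minimal sketch of a set-oriented transport estimate: partition the phase
        # space into boxes, map sample points from each box once, build a
        # transition matrix, and read off a one-step transport rate between sets.
        import numpy as np

        def toy_map(x, y, K=0.9):
            # Chirikov standard map on the unit torus (a stand-in for the real dynamics)
            y_new = (y + (K / (2.0 * np.pi)) * np.sin(2.0 * np.pi * x)) % 1.0
            x_new = (x + y_new) % 1.0
            return x_new, y_new

        n = 32          # n x n box partition of [0, 1)^2
        samples = 50    # test points per box
        rng = np.random.default_rng(0)
        P = np.zeros((n * n, n * n))   # Ulam-type transition matrix

        for i in range(n):
            for j in range(n):
                xs = (i + rng.random(samples)) / n
                ys = (j + rng.random(samples)) / n
                xn, yn = toy_map(xs, ys)
                dest = ((xn * n).astype(int) % n) * n + ((yn * n).astype(int) % n)
                np.add.at(P[i * n + j], dest, 1.0 / samples)

        # One-step transport rate from a lower band A to an upper band B,
        # starting from a uniform distribution on A.
        A = [i * n + j for i in range(n) for j in range(n) if (j + 0.5) / n < 0.25]
        B = [i * n + j for i in range(n) for j in range(n) if (j + 0.5) / n > 0.75]
        rate = P[np.ix_(A, B)].sum() / len(A)
        print(f"estimated one-step transport rate A -> B: {rate:.4f}")

    Because the statistics come from many short map evaluations rather than one long trajectory, the estimate does not rely on ergodicity of a single orbit, which is the point the abstract makes about strongly almost invariant sets.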

    Counterfactual Risk Minimization: Learning from Logged Bandit Feedback

    We develop a learning principle and an efficient algorithm for batch learning from logged bandit feedback. This learning setting is ubiquitous in online systems (e.g., ad placement, web search, recommendation), where an algorithm makes a prediction (e.g., ad ranking) for a given input (e.g., query) and observes bandit feedback (e.g., user clicks on presented ads). We first address the counterfactual nature of the learning problem through propensity scoring. Next, we prove generalization error bounds that account for the variance of the propensity-weighted empirical risk estimator. These constructive bounds give rise to the Counterfactual Risk Minimization (CRM) principle. We show how CRM can be used to derive a new learning method -- called Policy Optimizer for Exponential Models (POEM) -- for learning stochastic linear rules for structured output prediction. We present a decomposition of the POEM objective that enables efficient stochastic gradient optimization. POEM is evaluated on several multi-label classification problems showing substantially improved robustness and generalization performance compared to the state-of-the-art.
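    The counterfactual estimator that the CRM principle builds on can be sketched as follows; the synthetic log, the softmax policy parameterisation, the clipping constant and the variance-penalty weight below are illustrative assumptions, not POEM itself.

        # Minimal sketch of a clipped inverse-propensity-scored (IPS) risk estimate
        # with a CRM-style variance penalty, on a synthetic logged bandit dataset.
        import numpy as np

        rng = np.random.default_rng(0)
        n, d, k = 1000, 5, 3
        X = rng.normal(size=(n, d))

        def softmax(z):
            z = z - z.max(axis=1, keepdims=True)
            e = np.exp(z)
            return e / e.sum(axis=1, keepdims=True)

        # Logging policy: a fixed random softmax policy that produced the log.
        W_log = rng.normal(size=(d, k))
        p_log = softmax(X @ W_log)
        a = np.array([rng.choice(k, p=p) for p in p_log])   # logged actions
        prop = p_log[np.arange(n), a]                        # logging propensities
        loss = rng.random(n) * (a != 0)                      # observed bandit loss for the taken action

        def crm_objective(W_new, clip=10.0, lam=0.5):
            """Clipped IPS risk of the new policy plus a variance penalty (CRM)."""
            p_new = softmax(X @ W_new)[np.arange(n), a]
            w = np.minimum(p_new / prop, clip)               # clipped importance weights
            r = w * loss
            return r.mean() + lam * np.sqrt(r.var(ddof=1) / n)

        print("CRM objective, logging policy:", crm_objective(W_log))
        print("CRM objective, random policy :", crm_objective(rng.normal(size=(d, k))))

    Minimising such an objective over the policy parameters, rather than the plain IPS mean alone, is what distinguishes the CRM principle from unregularised counterfactual risk estimation.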