22,400 research outputs found

    Methods for many-objective optimization: an analysis

    Decomposition-based methods are often cited as the solution to problems arising in many-objective optimization. These methods employ a scalarizing function to reduce a many-objective problem to a set of single-objective problems which, upon solution, yield a good approximation of the set of optimal solutions, commonly referred to as the Pareto front. In this work we explore the implications of using decomposition-based methods over Pareto-based methods from a probabilistic point of view. Namely, we investigate whether there is an advantage to using a decomposition-based method, for example one based on the Chebyshev scalarizing function, over Pareto-based methods.
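    As an illustration of the scalarization idea described in this abstract, the following is a minimal sketch of Chebyshev decomposition; the toy bi-objective function, the weight vectors, and the reference point are assumptions chosen for illustration, not taken from the paper.

```python
import numpy as np

def chebyshev(f, weights, z_star):
    """Chebyshev scalarization: max_i w_i * |f_i(x) - z_i*|."""
    return np.max(weights * np.abs(f - z_star))

# Toy bi-objective problem (assumed for illustration): minimize both objectives.
def objectives(x):
    return np.array([x ** 2, (x - 2.0) ** 2])

z_star = np.array([0.0, 0.0])  # ideal (reference) point, assumed known here
weight_vectors = [np.array([w, 1.0 - w]) for w in np.linspace(0.01, 0.99, 11)]

# One single-objective subproblem per weight vector (solved here by naive grid
# search); the collected minimizers approximate the Pareto front.
grid = np.linspace(-1.0, 3.0, 2001)
front = []
for w in weight_vectors:
    scores = [chebyshev(objectives(x), w, z_star) for x in grid]
    best_x = grid[int(np.argmin(scores))]
    front.append(objectives(best_x))
```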

    Learning from Scarce Experience

    Searching the space of policies directly for the optimal policy has been one popular method for solving partially observable reinforcement learning problems. Typically, with each change of the target policy, its value is estimated from the results of following that very policy. This requires a large number of interactions with the environment as different policies are considered. We present a family of algorithms based on likelihood ratio estimation that use data gathered when executing one policy (or collection of policies) to estimate the value of a different policy. The algorithms combine estimation and optimization stages: the former uses experience to build a non-parametric representation of the function being optimized, and the latter performs optimization on this estimate. We show positive empirical results and provide a sample complexity bound. Comment: 8 pages, 4 figures
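    The core likelihood-ratio idea in this abstract, reusing data from one policy to evaluate another, can be sketched generically as importance-sampled value estimation. This is not the authors' algorithm; the one-step environment, the policy parameterization, and all names below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-action problem: true expected rewards of the actions.
true_reward = np.array([0.2, 0.8])

def sample_episode(policy):
    """One-step 'episode': draw an action from the policy, observe a noisy reward."""
    a = rng.choice(2, p=policy)
    r = true_reward[a] + rng.normal(scale=0.1)
    return a, r

# Gather experience once, under a fixed behavior policy.
behavior = np.array([0.5, 0.5])
data = [sample_episode(behavior) for _ in range(5000)]

def is_value_estimate(target, behavior, data):
    """Likelihood-ratio (importance sampling) estimate of the target policy's
    value, using only data collected under the behavior policy."""
    weights = np.array([target[a] / behavior[a] for a, _ in data])
    rewards = np.array([r for _, r in data])
    return float(np.mean(weights * rewards))

# Evaluate several candidate target policies from the same fixed data set;
# the true value for comparison is (1 - p) * 0.2 + p * 0.8.
for p in (0.1, 0.5, 0.9):
    target = np.array([1.0 - p, p])
    print(p, is_value_estimate(target, behavior, data))
```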

    A simulation comparison of methods for new product location

    Includes bibliographical references (p. 29-31)

    Optimization of Bi-Directional V2G Behavior With Active Battery Anti-Aging Scheduling


    Better Optimism By Bayes: Adaptive Planning with Rich Models

    The computational costs of inference and planning have confined Bayesian model-based reinforcement learning to one of two dismal fates: powerful Bayes-adaptive planning but only for simplistic models, or powerful, Bayesian non-parametric models but using simple, myopic planning strategies such as Thompson sampling. We ask whether it is feasible and truly beneficial to combine rich probabilistic models with a closer approximation to fully Bayesian planning. First, we use a collection of counterexamples to show formal problems with the over-optimism inherent in Thompson sampling. Then we leverage state-of-the-art techniques in efficient Bayes-adaptive planning and non-parametric Bayesian methods to perform qualitatively better than both existing conventional algorithms and Thompson sampling on two contextual bandit-like problems. Comment: 11 pages, 11 figures
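    For reference, the Thompson sampling baseline criticized in this abstract reduces, in its simplest form, to sampling once from the posterior and acting greedily on that sample. The sketch below uses a Beta-Bernoulli bandit rather than the paper's richer contextual problems; the arm means, priors, and horizon are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical Bernoulli bandit; only the Thompson-sampling step is illustrated.
true_means = np.array([0.45, 0.55])
n_arms = len(true_means)

# Beta(1, 1) priors over each arm's success probability.
alpha = np.ones(n_arms)
beta = np.ones(n_arms)

for t in range(2000):
    # Thompson sampling: draw one posterior sample per arm, act greedily on it.
    theta = rng.beta(alpha, beta)
    arm = int(np.argmax(theta))
    reward = rng.random() < true_means[arm]
    # Conjugate Beta-Bernoulli posterior update.
    alpha[arm] += reward
    beta[arm] += 1 - reward

print(alpha / (alpha + beta))  # posterior mean estimates of the arm probabilities
```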