
    Exploration via Cost-Aware Subgoal Design

    The problem of exploration in unknown environments continues to pose a challenge for reinforcement learning algorithms, as interactions with the environment are usually expensive or limited. Setting subgoals with an intrinsic reward provides supplemental feedback that aids an agent in environments with sparse and delayed rewards, and it can be an effective tool for directing the agent's exploration toward useful parts of the state space. In this paper, we consider problems where an agent faces an unknown task in the future and is given prior opportunities to "practice" on related tasks where the interactions are still expensive. We propose a one-step Bayes-optimal algorithm for selecting subgoal designs, along with the number of episodes and the episode length, to efficiently maximize the expected performance of an agent. We demonstrate its excellent performance on a variety of tasks and also prove an asymptotic optimality guarantee. Comment: Presented at TARL, ICLR 2019 workshop
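As a rough illustration of how a subgoal with an intrinsic reward adds feedback in a sparse-reward task, the sketch below pays a one-time bonus the first time each subgoal state is visited. It is not the paper's selection algorithm: the function names, the integer state encoding, and the bonus value are all assumptions made for the example.

```python
def shaped_reward(extrinsic, state, subgoals, visited, bonus=1.0):
    """Return the extrinsic reward plus a one-time intrinsic bonus.

    The bonus is paid only the first time a subgoal state is reached,
    so revisiting a subgoal earns nothing extra (hypothetical design).
    """
    if state in subgoals and state not in visited:
        visited.add(state)
        return extrinsic + bonus
    return extrinsic


def rollout_return(trajectory, goal, subgoals, bonus=1.0):
    """Sum shaped rewards along a trajectory of integer states.

    The extrinsic reward is sparse: 1.0 at the goal state, 0.0 elsewhere.
    """
    visited = set()
    total = 0.0
    for state in trajectory:
        extrinsic = 1.0 if state == goal else 0.0
        total += shaped_reward(extrinsic, state, subgoals, visited, bonus)
    return total


# Example: states 2 and 3 are subgoals on the way to goal state 4.
trajectory = [0, 1, 2, 3, 2, 3, 4]
with_subgoals = rollout_return(trajectory, goal=4, subgoals={2, 3}, bonus=0.5)
without_subgoals = rollout_return(trajectory, goal=4, subgoals=set(), bonus=0.5)
```

Without subgoals the agent receives feedback only at the final step; with them, it is rewarded at intermediate states as well, which is the supplemental signal the abstract refers to.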

    Advances in Bayesian Optimization with Applications in Aerospace Engineering

    Optimization requires the quantities of interest that define objective functions and constraints to be evaluated a large number of times. In aerospace engineering, these quantities of interest can be expensive to compute (e.g., numerically solving a set of partial differential equations), leading to a challenging optimization problem. Bayesian optimization (BO) is a class of algorithms for the global optimization of expensive-to-evaluate functions. BO leverages all past evaluations available to construct a surrogate model. This surrogate model is then used to select the next design to evaluate. This paper reviews two recent advances in BO that tackle the challenges of optimizing expensive functions and thus can enrich the optimization toolbox of the aerospace engineer. The first method addresses optimization problems subject to inequality constraints where a finite budget of evaluations is available, a common situation when dealing with expensive models (e.g., a limited time to conduct the optimization study or limited access to a supercomputer). This challenge is addressed via a lookahead BO algorithm that plans the sequence of designs to evaluate in order to maximize the improvement achieved, not only at the next iteration, but once the total budget is consumed. The second method demonstrates how sensitivity information, such as gradients computed with adjoint methods, can be incorporated into a BO algorithm. This algorithm exploits sensitivity information in two ways: first, to enhance the surrogate model, and second, to improve the selection of the next design to evaluate by accounting for future gradient evaluations. The benefits of the two methods are demonstrated on aerospace examples.
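The basic BO loop described above (fit a surrogate to all past evaluations, then use it to pick the next design) can be sketched as follows. This is a minimal, unconstrained, one-step (myopic) version using a Gaussian-process surrogate and the expected-improvement criterion; it is not the lookahead or gradient-enhanced algorithm reviewed in the paper. The kernel, its length scale, the candidate grid, and the toy objective are all assumptions chosen for illustration.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP surrogate."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)
    return mean, np.sqrt(var)


_erf = np.vectorize(math.erf)


def expected_improvement(mean, std, best):
    """Expected improvement for minimization under the GP posterior."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf


def bayes_opt(objective, n_init=3, n_iter=12, seed=0):
    """Minimize objective on [0, 1] with a fixed evaluation budget."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)  # candidate designs (assumed)
    x = rng.uniform(0.0, 1.0, size=n_init)
    y = np.array([objective(xi) for xi in x])
    for _ in range(n_iter):
        mean, std = gp_posterior(x, y, grid)   # surrogate from all past data
        ei = expected_improvement(mean, std, y.min())
        x_next = grid[np.argmax(ei)]           # next design to evaluate
        x = np.append(x, x_next)
        y = np.append(y, objective(x_next))
    return x[np.argmin(y)], y.min(), x, y


def toy_objective(x):
    """Stand-in for an expensive simulation (hypothetical)."""
    return (x - 0.3) ** 2


x_best, y_best, x_all, y_all = bayes_opt(toy_objective)
```

Each iteration spends exactly one evaluation of the objective, so the total budget here is `n_init + n_iter` evaluations. A lookahead method, as described in the abstract, would instead choose each design with the remaining budget in mind rather than maximizing the one-step improvement.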