11,735 research outputs found

    Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing

    Full text link
    Within the context of autonomous driving a model-based reinforcement learning algorithm is proposed for the design of neural network-parameterized controllers. Classical model-based control methods, which include sampling- and lattice-based algorithms and model predictive control, suffer from the trade-off between model complexity and computational burden required for the online solution of expensive optimization or search problems at every short sampling time. To circumvent this trade-off, a 2-step procedure is motivated: first learning of a controller during offline training based on an arbitrarily complicated mathematical system model, before online fast feedforward evaluation of the trained controller. The contribution of this paper is the proposition of a simple gradient-free and model-based algorithm for deep reinforcement learning using task separation with hill climbing (TSHC). In particular, (i) simultaneous training on separate deterministic tasks with the purpose of encoding many motion primitives in a neural network, and (ii) the employment of maximally sparse rewards in combination with virtual velocity constraints (VVCs) in setpoint proximity are advocated.Comment: 10 pages, 6 figures, 1 tabl

    A space of goals: the cognitive geometry of informationally bounded agents

    Get PDF
    © 2022 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/Traditionally, Euclidean geometry is treated by scientists as a priori and objective. However, when we take the position of an agent, the problem of selecting a best route should also factor in the abilities of the agent, its embodiment and particularly its cognitive effort. In this paper, we consider geometry in terms of travel between states within a world by incorporating information processing costs with the appropriate spatial distances. This induces a geometry that increasingly differs from the original geometry of the given world as information costs become increasingly important. We visualize this ‘cognitive geometry’ by projecting it onto two- and three-dimensional spaces showing distinct distortions reflecting the emergence of epistemic and information-saving strategies as well as pivot states. The analogies between traditional cost-based geometries and those induced by additional informational costs invite a generalization of the notion of geodesics as cheapest routes towards the notion of infodesics. In this perspective, the concept of infodesics is inspired by the property of geodesics that, travelling from a given start location to a given goal location along a geodesic, not only the goal, but all points along the way are visited at optimal cost from the start.Peer reviewe

    A Bayesian approach to constrained single- and multi-objective optimization

    Get PDF
    This article addresses the problem of derivative-free (single- or multi-objective) optimization subject to multiple inequality constraints. Both the objective and constraint functions are assumed to be smooth, non-linear and expensive to evaluate. As a consequence, the number of evaluations that can be used to carry out the optimization is very limited, as in complex industrial design optimization problems. The method we propose to overcome this difficulty has its roots in both the Bayesian and the multi-objective optimization literatures. More specifically, an extended domination rule is used to handle objectives and constraints in a unified way, and a corresponding expected hyper-volume improvement sampling criterion is proposed. This new criterion is naturally adapted to the search of a feasible point when none is available, and reduces to existing Bayesian sampling criteria---the classical Expected Improvement (EI) criterion and some of its constrained/multi-objective extensions---as soon as at least one feasible point is available. The calculation and optimization of the criterion are performed using Sequential Monte Carlo techniques. In particular, an algorithm similar to the subset simulation method, which is well known in the field of structural reliability, is used to estimate the criterion. The method, which we call BMOO (for Bayesian Multi-Objective Optimization), is compared to state-of-the-art algorithms for single- and multi-objective constrained optimization

    Hybrid meta-heuristics for combinatorial optimization

    Get PDF
    Combinatorial optimization problems arise, in many forms, in vari- ous aspects of everyday life. Nowadays, a lot of services are driven by optimization algorithms, enabling us to make the best use of the available resources while guaranteeing a level of service. Ex- amples of such services are public transportation, goods delivery, university time-tabling, and patient scheduling. Thanks also to the open data movement, a lot of usage data about public and private services is accessible today, sometimes in aggregate form, to everyone. Examples of such data are traffic information (Google), bike sharing systems usage (CitiBike NYC), location services, etc. The availability of all this body of data allows us to better understand how people interacts with these services. However, in order for this information to be useful, it is necessary to develop tools to extract knowledge from it and to drive better decisions. In this context, optimization is a powerful tool, which can be used to improve the way the available resources are used, avoid squandering, and improve the sustainability of services. The fields of meta-heuristics, artificial intelligence, and oper- ations research, have been tackling many of these problems for years, without much interaction. However, in the last few years, such communities have started looking at each other’s advance- ments, in order to develop optimization techniques that are faster, more robust, and easier to maintain. This effort gave birth to the fertile field of hybrid meta-heuristics.openDottorato di ricerca in Ingegneria industriale e dell'informazioneopenUrli, Tommas
    • …
    corecore