
    A practical guide to multi-objective reinforcement learning and planning

    Real-world sequential decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods and who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems. © 2022, The Author(s)
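
    To make the linear-combination pitfall concrete, here is a minimal sketch (not taken from the paper): with a two-objective return vector, a policy lying in a concave region of the Pareto front is never selected by any weighted sum, even though no other policy dominates it. The policy names and return values below are invented for illustration.

    import numpy as np

    # Hypothetical vector-valued returns for three candidate policies
    # (e.g. [objective 1, objective 2]); values are illustrative only.
    returns = {
        "A": np.array([1.0, 0.0]),
        "B": np.array([0.0, 1.0]),
        "C": np.array([0.45, 0.45]),  # Pareto-optimal (undominated), but
                                      # inside the convex hull of A and B
    }

    # Linear scalarization: collapse the vector return with weights w.
    def scalarize(r, w):
        return float(np.dot(w, r))

    # Sweep the weight simplex: policy C is never preferred, because a
    # weighted sum can only recover solutions on the convex hull of the
    # Pareto front; max(w1, 1 - w1) >= 0.5 > 0.45 for every weight vector.
    for w1 in np.linspace(0.0, 1.0, 11):
        w = np.array([w1, 1.0 - w1])
        best = max(returns, key=lambda k: scalarize(returns[k], w))
        print(f"w = {w}, best policy: {best}")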

    Pedestrian simulation as multi-objective reinforcement learning

    Modelling and simulating pedestrian crowds requires agents to reach pre-determined goals and avoid collisions with static obstacles and dynamic pedestrians, while maintaining natural gait behaviour. We model pedestrians as autonomous, learning, and reactive agents employing Reinforcement Learning (RL). Typical RL-based agent simulations generalize poorly because the reward function is handcrafted to ensure realistic behaviour. In this work, we model pedestrians in a modular framework that integrates the navigation and collision-avoidance tasks as separate modules. Each module has its own state space and reward, but all modules share a common action space, as sketched below. Empirical results suggest that such a modular framework can achieve satisfactory performance without parameter tuning, and we compare it with state-of-the-art crowd simulation methods.
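
    A minimal sketch of one common way to realize such a modular design: each module keeps its own tabular Q-function over a module-specific state abstraction and reward, and actions are chosen greedily over the sum of module Q-values. The Module class, the action names, and the Q-value summation rule are assumptions for illustration; the abstract does not specify the paper's combination method.

    import random
    from collections import defaultdict

    # Hypothetical shared action space (directions of motion).
    ACTIONS = ["forward", "left", "right", "stop"]

    class Module:
        """One task module (e.g. navigation or collision avoidance) with
        its own state abstraction and reward, but the shared action set."""
        def __init__(self, state_fn, reward_fn, alpha=0.1, gamma=0.95):
            self.q = defaultdict(float)     # (state, action) -> value
            self.state_fn = state_fn        # module-specific state abstraction
            self.reward_fn = reward_fn      # module-specific reward
            self.alpha, self.gamma = alpha, gamma

        def update(self, obs, action, next_obs):
            # Standard tabular Q-learning update within this module.
            s, s2 = self.state_fn(obs), self.state_fn(next_obs)
            r = self.reward_fn(obs, action, next_obs)
            best_next = max(self.q[(s2, a)] for a in ACTIONS)
            td = r + self.gamma * best_next - self.q[(s, action)]
            self.q[(s, action)] += self.alpha * td

    def select_action(modules, obs, epsilon=0.1):
        """Epsilon-greedy action over the sum of module Q-values."""
        if random.random() < epsilon:
            return random.choice(ACTIONS)
        scores = {
            a: sum(m.q[(m.state_fn(obs), a)] for m in modules)
            for a in ACTIONS
        }
        return max(scores, key=scores.get)

    Summing Q-values lets each module vote on every shared action, so a navigation module and a collision-avoidance module can be trained with independent rewards yet still produce a single joint behaviour.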