
    A brief guide to multi-objective reinforcement learning and planning JAAMAS track

    Real-world sequential decision-making tasks are usually complex, and require trade-offs between multiple, often conflicting, objectives. However, the majority of research in reinforcement learning (RL) and decision-theoretic planning assumes a single objective, or that multiple objectives can be handled via a predefined weighted sum over the objectives. Such approaches may oversimplify the underlying problem and produce suboptimal results. This extended abstract outlines the limitations of using a semi-blind iterative process to solve multi-objective decision-making problems. Our extended paper [4] serves as a guide for the application of explicitly multi-objective methods to difficult problems. © 2023 International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.
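The limitation of a predefined weighted sum mentioned above can be illustrated with a minimal sketch (the policies and returns below are hypothetical, not from the paper): when the Pareto front is concave, some Pareto-optimal policies are never selected by any linear scalarization.

```python
# Hypothetical two-objective returns for three policies. Policy B is
# Pareto-optimal but lies in a concave region of the front, so no
# weighted sum w*obj1 + (1-w)*obj2 ever prefers it.
policies = {"A": (1.0, 0.0), "B": (0.45, 0.45), "C": (0.0, 1.0)}

def best_under_weight(w):
    """Policy maximizing the weighted sum of the two objectives."""
    return max(policies, key=lambda p: w * policies[p][0] + (1 - w) * policies[p][1])

# Sweep the weight over [0, 1] and collect every policy that ever wins.
winners = {best_under_weight(w / 100) for w in range(101)}
print(winners)  # -> {'A', 'C'}: B is never chosen by any weight
```

Explicitly multi-objective methods avoid this by reasoning about the full Pareto set rather than collapsing the objectives in advance.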

    A Data-driven Pricing Scheme for Optimal Routing through Artificial Currencies

    Mobility systems often suffer from a high price of anarchy due to the uncontrolled behavior of selfish users. This may result in societal costs that are significantly higher than what could be achieved by a centralized system-optimal controller. Monetary tolling schemes can effectively align the behavior of selfish users with the system optimum. Yet they inevitably discriminate among the population in terms of income. Artificial currencies were recently presented as an effective alternative that can achieve the same performance whilst guaranteeing fairness among the population. However, those studies were based on behavioral models that may differ from practical implementations. This paper presents a data-driven approach to automatically adapt artificial-currency tolls within repetitive-game settings. We first consider a parallel-arc setting whereby users commute on a daily basis from a unique origin to a unique destination, choosing a route in exchange for an artificial-currency price or reward while accounting for the impact of the choices of the other users on travel discomfort. Second, we devise a model-based reinforcement learning controller that autonomously learns the optimal pricing policy by interacting with the proposed framework, using the closeness of the observed aggregate flows to a desired system-optimal distribution as the reward function. Our numerical results show that the proposed data-driven pricing scheme can effectively align the users' flows with the system optimum, significantly reducing the societal costs with respect to the uncontrolled flows (by about 15% and 25% depending on the scenario), and respond to environmental changes in a robust and efficient manner.
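The core feedback idea — adjust the artificial-currency price until observed aggregate flows match a system-optimal target — can be sketched in a toy two-arc setting. Everything below (the logit response, the gain, the target share) is an illustrative assumption; the paper's actual controller is a model-based RL agent learned from interaction, not this fixed proportional rule.

```python
import math

# Toy two-arc commuting sketch: users pick the fast or slow route via a
# logit model over (discomfort + artificial-currency price). A simple
# proportional-feedback rule nudges the price day by day so the observed
# flow on the fast arc tracks a desired system-optimal share.
target_flow_fast = 0.6   # desired system-optimal share on the fast arc (assumed)
price_fast = 0.0         # artificial-currency price charged on the fast arc
gain = 2.0               # feedback gain (assumed)

def flow_fast(price):
    """Logit share choosing the fast arc; higher price shifts users away."""
    u_fast, u_slow = 1.0 - price, 0.0
    return math.exp(u_fast) / (math.exp(u_fast) + math.exp(u_slow))

for day in range(200):
    observed = flow_fast(price_fast)
    # Raise the price when the arc is over-used, lower it when under-used.
    price_fast += gain * (observed - target_flow_fast)

print(round(flow_fast(price_fast), 3))  # -> 0.6: flow settles at the target
```

The RL controller in the paper plays an analogous role but learns its pricing policy from data, which lets it also respond to environmental changes rather than track a single fixed point.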

    Agent-Based Modeling and Simulation for the Bus-Corridor Problem in a Many-to-One Mass Transit System

    With the growing problem of urban traffic congestion, departure time choice is becoming a more important factor for commuters. By using multiagent modeling and the Bush-Mosteller reinforcement learning model, we simulated the day-to-day evolution of commuters' departure time choice on a many-to-one mass transit system during the morning peak period. To start with, we verified the model by comparison with traditional analytical methods. We then investigated the formation process of the departure time equilibrium. Having established the validity of the model, we relaxed some initial assumptions and carried out two groups of experiments considering commuters' heterogeneity and memory limitations. The results showed that heterogeneous commuters' departure time distribution is broader and has a lower peak at equilibrium, and that different people behave in different patterns. When each commuter has a limited memory, fluctuations persist in the evolutionary dynamics of the system, and hence an ideal equilibrium can hardly be reached. This research is helpful in acquiring a better understanding of commuters' departure time choice and commuting equilibrium in the peak period; the approach also provides an effective way to explore the formation and evolution of complicated traffic phenomena.
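The Bush-Mosteller model referenced above updates a choice probability multiplicatively: a satisfying outcome moves the probability of the chosen option toward 1 by a fraction of the remaining gap, and a dissatisfying one moves it toward 0. A minimal sketch with illustrative parameters (the learning rate, the two-option setting, and the toy satisfaction rule are assumptions, not the paper's calibration):

```python
import random

# Minimal Bush-Mosteller sketch: a commuter maintains a probability of
# departing "early". A satisfying day reinforces the chosen option; a
# dissatisfying day weakens it.
alpha = 0.1      # learning rate (assumed)
p_early = 0.5    # initial probability of choosing the early departure time

def bm_update(p, chose_early, satisfied):
    """Bush-Mosteller update of the probability of the 'early' choice."""
    if chose_early:
        return p + alpha * (1 - p) if satisfied else p - alpha * p
    # Reinforcing 'late' lowers the probability of 'early', and vice versa.
    return p - alpha * p if satisfied else p + alpha * (1 - p)

random.seed(0)
for day in range(100):
    chose_early = random.random() < p_early
    satisfied = chose_early  # toy world: only early departures avoid the peak
    p_early = bm_update(p_early, chose_early, satisfied)

print(round(p_early, 2))  # -> 1.0: 'early' is reinforced day after day
```

In the simulation described by the paper, many such learners interact, so each commuter's satisfaction depends on the aggregate departure profile, which is what produces the emergent equilibrium (or persistent fluctuations under limited memory).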

    A practical guide to multi-objective reinforcement learning and planning

    Real-world sequential decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems. It is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods and who wish to adopt a multi-objective perspective on their research, as well as at practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems. © 2022, The Author(s)
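In the multi-objective setting that this guide addresses, the desired solution is often a set of Pareto-optimal policies rather than a single scalar-optimal one. A short sketch of the underlying dominance relation (the vector returns below are made up for illustration):

```python
# A point dominates another if it is at least as good on every objective
# and strictly better on at least one.
def dominates(a, b):
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical vector returns of five candidate policies.
returns = [(3, 1), (2, 2), (1, 3), (2, 1), (1, 1)]
print(pareto_front(returns))  # -> [(3, 1), (2, 2), (1, 3)]
```

Which point of such a front is ultimately deployed depends on the factors the paper identifies, such as how and when the user's preferences over objectives become known.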