A brief guide to multi-objective reinforcement learning and planning (JAAMAS track)
Real-world sequential decision-making tasks are usually complex, and require trade-offs between multiple - often conflicting - objectives. However, the majority of research in reinforcement learning (RL) and decision-theoretic planning assumes a single objective, or that multiple objectives can be handled via a predefined weighted sum over the objectives. Such approaches may oversimplify the underlying problem and produce suboptimal results. This extended abstract outlines the limitations of using a semi-blind iterative process to solve multi-objective decision-making problems. Our extended paper [4] serves as a guide for the application of explicitly multi-objective methods to difficult problems. © 2023 International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.
A Data-driven Pricing Scheme for Optimal Routing through Artificial Currencies
Mobility systems often suffer from a high price of anarchy due to the uncontrolled behavior of selfish users. This may result in societal costs that are significantly higher than what could be achieved by a centralized system-optimal controller. Monetary tolling schemes can effectively align the behavior of selfish users with the system optimum. Yet they inevitably discriminate among the population in terms of income. Artificial currencies were recently presented as an effective alternative that can achieve the same performance, whilst guaranteeing fairness among the population. However, those studies were based on behavioral models that may differ from practical implementations. This paper presents a data-driven approach to automatically adapt artificial-currency tolls within repetitive-game settings. We first consider a parallel-arc setting whereby users commute on a daily basis from a unique origin to a unique destination, choosing a route in exchange for an artificial-currency price or reward while accounting for the impact of the other users' choices on travel discomfort. Second, we devise a model-based reinforcement learning controller that autonomously learns the optimal pricing policy by interacting with the proposed framework, using the closeness of the observed aggregate flows to a desired system-optimal distribution as a reward function. Our numerical results show that the proposed data-driven pricing scheme can effectively align the users' flows with the system optimum, significantly reducing the societal costs with respect to the uncontrolled flows (by about 15% and 25%, depending on the scenario), and respond to environmental changes in a robust and efficient manner.
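The repetitive-game idea above can be sketched in miniature. The network, cost functions, parameter values, and the simple feedback rule below are all hypothetical stand-ins for the paper's model-based RL controller; they only illustrate the loop of observing aggregate flows and adapting the artificial-currency price toward a desired split:

```python
import math

def logit_flows(price, n_users=1000, beta=0.1):
    """Aggregate flow on route 0 of a hypothetical two-route parallel arc.
    Route 0 charges `price` in artificial currency; route 1 pays it as a
    reward. Discomfort grows with flow; choices follow a logit rule."""
    f0 = n_users / 2
    for _ in range(100):  # damped fixed-point iteration to a steady flow
        c0 = 10 + 0.02 * f0 + price                # discomfort + price
        c1 = 15 + 0.02 * (n_users - f0) - price    # discomfort - reward
        p0 = 1 / (1 + math.exp(beta * (c0 - c1)))  # logit share of route 0
        f0 = 0.9 * f0 + 0.1 * n_users * p0
    return f0

def learn_price(target_f0, steps=200, lr=0.1):
    """Toy pricing controller: repeatedly observe aggregate flows and
    nudge the toll until they track a desired (assumed system-optimal)
    split. Stands in for the model-based RL policy of the paper."""
    price = 0.0
    for _ in range(steps):
        observed = logit_flows(price)
        price += lr * (observed - target_f0) / 100  # raise toll if over-used
    return price
```

With these assumed parameters, a target split of 400 users on the priced route is reached by a positive toll of a few currency units; the point is only the closed loop, not the numbers.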
Dynamic congestion pricing in within-day and day-to-day network equilibrium models
This dissertation explores two kinds of dynamic pricing models which react to within-day and day-to-day variation in traffic. Traffic patterns vary within each day due to supply-side uncertainty caused by non-recurring sources of congestion such as incidents, poor weather, and temporary bottlenecks. On the other hand, significant day-to-day variations in traffic patterns arise from the stochastic route choices of travelers who are not fully rational. Using slightly different assumptions, we analyze network performance in these two scenarios and demonstrate the advantages of dynamic pricing over static tolls. In both cases, traffic networks are characterized by a set of stochastic states. We seek optimal tolls that are a function of the network states, which evolve within each day or across days.
In the within-day equilibrium models, travelers are assumed to be completely rational and have knowledge of stochastic link-states, which have different delay functions. At every node, travelers observe the link-states of downstream links and select the next node to minimize their expected travel times. Collectively, such behavior leads to an equilibrium, which is also referred to as user equilibrium with recourse, in which all used routing policies have equal and minimal expected travel time. In this dissertation, we improve the system performance of the equilibrium flows using state-dependent marginal link tolls. These tolls address externalities associated with non-recurring congestion just as static marginal tolls in regular traffic assignment reflect externalities related to recurring congestion.
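The state-dependent marginal tolls described above follow the classical marginal-cost principle: charge each traveler the extra delay they impose on everyone else on the link. A minimal single-link sketch using the standard BPR delay function (parameter values are illustrative; the dissertation's tolls are computed per link-state rather than for a single deterministic link):

```python
def bpr_delay(x, t0=10.0, cap=1000.0, alpha=0.15, beta=4):
    """Standard BPR link delay: t(x) = t0 * (1 + alpha * (x/cap)^beta)."""
    return t0 * (1 + alpha * (x / cap) ** beta)

def marginal_toll(x, t0=10.0, cap=1000.0, alpha=0.15, beta=4):
    """Marginal-cost toll tau(x) = x * t'(x): the delay one additional
    traveler imposes on the x travelers already using the link."""
    dt_dx = t0 * alpha * beta * x ** (beta - 1) / cap ** beta
    return x * dt_dx
```

At capacity (x = 1000), delay is 10 * 1.15 = 11.5 and the marginal toll is 1000 * 0.006 = 6.0 time units converted to money.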
The set of tolls that improve system performance are not necessarily unique. Hence, in order to make the concept of tolling more acceptable to the public, we explore alternate pricing mechanisms that optimize social welfare and also collect the least amount of revenue in expectation. This minimum revenue toll model is formulated as a linear program whose inputs are derived from the solution to a novel reformulation of the user equilibrium with recourse problem.
We also study day-to-day dynamic models which, unlike traditional equilibrium approaches, capture the fluctuations or stochasticity in traffic due to route choice uncertainty. Travelers' decisions are modeled using route choice dynamics, such as the logit choice protocol, that depend on historic network conditions. The evolution of the system is modeled as a stochastic process and its steady state is used to characterize the network performance. The objective of pricing in this context is to set dynamic tolls that depend on the state of the network on previous day(s) such that the expected total system travel time is minimized. This problem is formulated as an average-cost Markov decision process. Approximation methods are suggested to improve computational tractability.
The day-to-day pricing models are extended to instances in which closed form dynamics are unavailable or unfit to represent travelers' choices. In such cases, we apply Q-learning in which the route choices may be simulated off-line or can be observed through experimentation in an online setting. The off-line methods were found to be promising and can be used in conjunction with complex discrete choice models that predict travel behavior with greater accuracy.
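A toy version of this day-to-day Q-learning setup can make the structure concrete. Everything below is hypothetical: a two-route network with logit day-to-day dynamics, yesterday's flow bin as the state, a small toll menu as the action set, and negative total travel time as the reward. A discounted criterion is used as a simple, common stand-in for the average-cost objective:

```python
import math
import random

random.seed(0)

TOLLS = [0.0, 1.0, 2.0, 3.0]  # hypothetical toll menu (actions)
N, BINS = 100, 10             # commuters, flow-state bins

def next_flow(flow0, toll, beta=0.5):
    """Logit day-to-day dynamics: each commuter picks route 0 with a
    probability driven by yesterday's travel times plus today's toll."""
    c0 = 10 + 0.1 * flow0 + toll
    c1 = 12 + 0.1 * (N - flow0)
    p0 = 1 / (1 + math.exp(beta * (c0 - c1)))
    return sum(random.random() < p0 for _ in range(N))

def total_time(flow0):
    """Total system travel time for a given route-0 flow."""
    return flow0 * (10 + 0.1 * flow0) + (N - flow0) * (12 + 0.1 * (N - flow0))

def q_learning(days=20000, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning over observed day-to-day transitions:
    state = yesterday's flow bin, reward = -total system travel time."""
    Q = [[0.0] * len(TOLLS) for _ in range(BINS)]
    s, flow = BINS // 2, N // 2
    for _ in range(days):
        a = random.randrange(len(TOLLS)) if random.random() < eps \
            else max(range(len(TOLLS)), key=lambda i: Q[s][i])
        flow = next_flow(flow, TOLLS[a])
        s2 = min(flow * BINS // (N + 1), BINS - 1)
        r = -total_time(flow)
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
    return Q
```

The same loop works whether `next_flow` is a simulator (off-line) or observed field data (online experimentation), which is the distinction the abstract draws.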
Overall, the findings in this dissertation highlight the pitfalls of using static tolls in the presence of different types of stochasticity and make a strong case for employing dynamic state-dependent tolls to improve system efficiency.
Agent-Based Modeling and Simulation for the Bus-Corridor Problem in a Many-to-One Mass Transit System
With the growing problem of urban traffic congestion, departure time choice is becoming a more important factor for commuters. Using multiagent modeling and the Bush-Mosteller reinforcement learning model, we simulated the day-to-day evolution of commuters' departure time choice on a many-to-one mass transit system during the morning peak period. We first verified the model by comparison with traditional analytical methods, and then investigated the formation process of the departure time equilibrium. Given the validity of the model, some initial assumptions were relaxed and two groups of experiments were carried out considering commuters' heterogeneity and memory limitations. The results showed that heterogeneous commuters' departure time distribution is broader and has a lower peak at equilibrium, and that different people behave in different patterns. When each commuter has a limited memory, fluctuations exist in the evolutionary dynamics of the system, and hence an ideal equilibrium can hardly be reached. This research is helpful in acquiring a better understanding of commuters' departure time choice and commuting equilibrium during the peak period; the approach also provides an effective way to explore the formation and evolution of complicated traffic phenomena.
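The Bush-Mosteller scheme named above adjusts each agent's choice probability toward the chosen option after a positive payoff and away from it after a negative one. A minimal sketch for a binary departure-time choice (the learning rate and payoff scaling are hypothetical):

```python
def bush_mosteller(p, chosen, payoff, alpha=0.1):
    """One Bush-Mosteller update for a binary choice.
    p: current probability of option 0; payoff scaled to [-1, 1]."""
    pc = p if chosen == 0 else 1 - p  # probability of the chosen option
    if payoff >= 0:
        pc = pc + alpha * payoff * (1 - pc)  # reinforce toward 1
    else:
        pc = pc + alpha * payoff * pc        # inhibit toward 0
    return pc if chosen == 0 else 1 - pc
```

Iterating this update across commuters and days, with payoffs derived from experienced crowding and schedule delay, yields the day-to-day evolution the abstract simulates.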
Harnessing Big Data for the Sharing Economy in Smart Cities
Motivated by the imbalance between demand (i.e., passenger requests) and supply (i.e., available vehicles) in the ride-hailing market and severe traffic congestion faced by modern cities, this dissertation aims to improve the efficiency of the sharing economy by building an agent-based methodological framework for optimal decision-making of distributed agents (e.g., autonomous shared vehicles), including passenger-seeking and route choice. Furthermore, noticing that city planners can impact the behavior of agents via some operational measures such as congestion pricing and signal control, this dissertation investigates the overall bilevel problem that involves the decision-making process of both distributed agents (i.e., the lower level) and central city planners (i.e., the upper level).
First of all, for the task of passenger-seeking, this dissertation proposes a model-based Markov decision process (MDP) approach to incorporate distinct features of e-hailing drivers. The modified MDP approach is found to outperform the baseline (i.e., the local hotspot strategy) in terms of both the rate of return and the utilization rate. Although the modified MDP approach is set up in the single-agent setting, we extend its applicability to multi-agent scenarios by a dynamic adjustment strategy of the order matching probability, which is able to partially capture the competition among agents. Furthermore, noticing that the reward function is commonly assumed to be known a priori, this dissertation unveils the underlying reward function of the overall e-hailing driver population (i.e., 44,000 Didi drivers in Beijing) through an inverse reinforcement learning method, which paves the way for future research on discovering the underlying reward mechanism in a complex and dynamic ride-hailing market.
To better incorporate the competition among agents, this dissertation develops a model-free mean-field multi-agent actor-critic algorithm for multi-driver passenger-seeking. A bilevel optimization model is then formulated with the upper level as a reward design mechanism and the lower level as a multi-agent system. We use the developed mean-field multi-agent actor-critic algorithm to solve for the optimal passenger-seeking policies of distributed agents in the lower level and Bayesian optimization to solve for the optimal control of upper-level city planners. The bilevel optimization model is applied to a real-world large-scale multi-class taxi driver repositioning task with congestion pricing as the upper-level control. Results show that the derived optimal toll charge efficiently improves the objective of city planners.
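The bilevel structure described above can be sketched in miniature. Here a hypothetical closed-form lower level stands in for the multi-agent system (which in the dissertation is solved by the mean-field actor-critic), and a plain search over candidate tolls stands in for Bayesian optimization; both share the same evaluate-and-optimize structure:

```python
def lower_level(toll):
    """Stand-in for the lower-level multi-agent system: returns the
    planner's objective (e.g., negative total travel time) under a
    given toll. A real instance would run the learned repositioning
    policies to equilibrium; this closed form is purely illustrative."""
    return 10.0 - (toll - 2.5) ** 2  # assumed optimum at toll = 2.5

def upper_level_search(candidates):
    """Planner's loop: evaluate each candidate control and keep the
    best. Bayesian optimization chooses candidates adaptively instead
    of exhaustively, but the bilevel coupling is the same."""
    return max(candidates, key=lower_level)
```

Each upper-level evaluation is expensive because it requires solving the lower level to convergence, which is precisely why a sample-efficient method such as Bayesian optimization is used in the dissertation.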
With agents knowing where to go (i.e., passenger-seeking), this dissertation then applies the bilevel optimization model to the research question of how to get there (i.e., route choice). Different from the task of passenger-seeking, where the action space is always fixed-dimensional, the problem of a variable action set emerges in the task of route choice. Therefore, a flow-dependent deep Q-learning algorithm is proposed to efficiently derive the optimal policies for multi-commodity multi-class agents. We demonstrate the effect of two countermeasures, namely tolling and signal control, on the behavior of travelers and show that the systematic objective of city planners can be optimized by a proper control.
A practical guide to multi-objective reinforcement learning and planning
Real-world sequential decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems. © 2022, The Author(s).