
183 research outputs found

Promoting Carpooling through Nudges: The Case of the University Hildesheim

Authors: Knackstedt Ralf; Schoormann Thorsten; Werkmeister Coralie
Publication venue: AIS Electronic Library (AISeL)
Publication date: 09/02/2021

AIS Electronic Library (AISeL)

Spartan Daily, August 27, 2003

Author: San Jose State University School of Journalism and Mass Communications
Publication venue: SJSU ScholarWorks
Publication date: 27/08/2003

Volume 121, Issue 2
https://scholarworks.sjsu.edu/spartandaily/9868/thumbnail.jp

SJSU ScholarWorks

Multiagent Reinforcement Learning for Autonomous Routing and Pickup Problem with Adaptation to Variable Demand

Authors: Bertsekas Dimitri; Bhattacharya Sushmita; Garces Daniel; Gil Stephanie
Publication date: 27/11/2022

We derive a learning framework to generate routing/pickup policies for a fleet of vehicles tasked with servicing stochastically appearing requests on a city map. We focus on policies that 1) give rise to coordination amongst the vehicles, thereby reducing wait times for servicing requests, 2) are non-myopic, considering a-priori unknown potential future requests, and 3) can adapt to changes in the underlying demand distribution. Specifically, we are interested in adapting to fluctuations of actual demand conditions in urban environments, such as on-peak vs. off-peak hours. We achieve this through a combination of (i) online play, a lookahead optimization method that improves the performance of rollout methods via an approximate policy iteration step, and (ii) an offline approximation scheme that allows for adapting to changes in the underlying demand model. In particular, we achieve adaptivity of our learned policy to different demand distributions by quantifying a region of validity using the q-valid radius of a Wasserstein Ambiguity Set. We propose a mechanism for switching the originally trained offline approximation when the current demand is outside the original validity region. In this case, we propose to use an offline architecture, trained on a historical demand model that is closer to the current demand in terms of Wasserstein distance. We learn routing and pickup policies over real taxicab requests in downtown San Francisco with high variability between on-peak and off-peak hours, demonstrating the ability of our method to adapt to real fluctuation in demand distributions. Our numerical results demonstrate that our method outperforms rollout-based reinforcement learning, as well as several benchmarks based on classical methods from the field of operations research.

Comment: 7 pages, 6 figures, 3 tables, submitted to ICR

arXiv.org e-Print Archive