Massively Scalable Inverse Reinforcement Learning in Google Maps

Abueg, Matthew; Barnes, Matt; Deeds, Matt; Lange, Oliver F.; Molitor, Denali; O'Banion, Shawn; Trader, Jason; Wulfmeier, Markus

Publication date: 18 May 2023

Abstract

Optimizing for humans' latent preferences is a grand challenge in route recommendation, where globally scalable solutions remain an open problem. Although past work created increasingly general solutions for the application of inverse reinforcement learning (IRL), these have not been successfully scaled to world-sized MDPs, large datasets, and highly parameterized models, with hundreds of millions of states, trajectories, and parameters, respectively. In this work, we surpass previous limitations through a series of advancements focused on graph compression, parallelization, and problem initialization based on dominant eigenvectors. We introduce Receding Horizon Inverse Planning (RHIP), which generalizes existing work and enables control of key performance trade-offs via its planning horizon. Our policy achieves a 16-24% improvement in global route quality and, to our knowledge, represents the largest instance of IRL in a real-world setting to date. Our results show critical benefits to more sustainable modes of transportation (e.g., two-wheelers), where factors beyond journey time (e.g., route safety) play a substantial role. We conclude with ablations of key components and negative results on state-of-the-art eigenvalue solvers, and we identify future opportunities to improve scalability via IRL-specific batching strategies.

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2305.11290

Last updated on 24 May 2023