
    Solving MDPs with Skew Symmetric Bilinear Utility Functions

    In this paper we adopt Skew Symmetric Bilinear (SSB) utility functions to compare policies in Markov Decision Processes (MDPs). By considering pairs of alternatives, SSB utility theory generalizes von Neumann and Morgenstern’s expected utility (EU) theory to encompass rational decision behaviors that EU cannot accommodate. We provide a game-theoretic analysis of the problem of identifying an SSB-optimal policy in finite-horizon MDPs and propose an algorithm based on a double oracle approach for computing an optimal (possibly randomized) policy. Finally, we present and discuss experimental results in which SSB-optimal policies are computed for a popular TV contest under several instantiations of SSB utility functions.

    Socially Optimal Personalized Routing With Preference Learning

    Traffic congestion has become inescapable across the United States, especially in urban areas. Yet support is lacking for taxes to fund expansion of the existing network. Thus, it is imperative to find novel ways to improve the efficiency of the existing infrastructure. A major obstacle is the inability to enforce socially optimal routes among commuters. We propose to improve routing efficiency by leveraging heterogeneity in commuter preferences. We learn individual driver preferences over route characteristics and use these preferences to recommend socially optimal routes that drivers will likely follow. The combined effects of socially optimal routing and personalization help bridge the gap between utopic and user-optimal solutions. We take the view of a recommendation system with a large user base but no ability to enforce routes in a highly congested network. We (a) develop a framework for learning individual driver preferences over time, and (b) devise a mathematical model for computing personalized socially optimal routes given (potentially partial) information on driver preferences. We evaluated our approach on data collected from Amazon Mechanical Turk; compared with logistic regression, our model improves prediction accuracy by over 12%.