46 research outputs found

    Courtesy as a Means to Coordinate

    We investigate the problem of multi-agent coordination under rationality constraints, specifically role allocation, task assignment, resource allocation, etc. Inspired by human behavior, we propose a framework (CA^3NONY) that enables fast convergence to efficient and fair allocations based on a simple convention of courtesy. We prove that following such a convention induces a strategy which constitutes an ϵ-subgame-perfect equilibrium of the repeated allocation game with discounting. Simulation results highlight the effectiveness of CA^3NONY as compared to state-of-the-art bandit algorithms, since it achieves more than two orders of magnitude faster convergence, higher efficiency, fairness, and average payoff. Comment: Accepted at AAMAS 2019 (International Conference on Autonomous Agents and Multiagent Systems).

    Achieving Diverse Objectives with AI-driven Prices in Deep Reinforcement Learning Multi-agent Markets

    We propose a practical approach to computing market prices and allocations via a deep reinforcement learning policymaker agent, operating in an environment of other learning agents. Compared to the idealized market equilibrium outcome -- which we use as a benchmark -- our policymaker is much more flexible, allowing us to tune the prices with regard to diverse objectives such as sustainability and resource wastefulness, fairness, buyers' and sellers' welfare, etc. To evaluate our approach, we design a realistic market with multiple and diverse buyers and sellers. Additionally, the sellers, which are deep learning agents themselves, compete for resources in a common-pool appropriation environment based on bio-economic models of commercial fisheries. We demonstrate that: (a) The introduced policymaker is able to achieve comparable performance to the market equilibrium, showcasing the potential of such approaches in markets where the equilibrium prices cannot be efficiently computed. (b) Our policymaker can notably outperform the equilibrium solution on certain metrics, while at the same time maintaining comparable performance for the remaining ones. (c) As a highlight of our findings, our policymaker is significantly more successful in maintaining resource sustainability, compared to the market outcome, in scarce resource environments.

    AI-driven Prices for Externalities and Sustainability in Production Markets

    Traditional competitive markets do not account for negative externalities: indirect costs that some participants impose on others, such as the cost of over-appropriating a common-pool resource (which diminishes future stock, and thus harvest, for everyone). Quantifying appropriate interventions to market prices has proven to be quite challenging. We propose a practical approach to computing market prices and allocations via a deep reinforcement learning policymaker agent, operating in an environment of other learning agents. Our policymaker allows us to tune the prices with regard to diverse objectives such as sustainability and resource wastefulness, fairness, buyers' and sellers' welfare, etc. As a highlight of our findings, our policymaker is significantly more successful in maintaining resource sustainability, compared to the market equilibrium outcome, in scarce resource environments.

    Anytime Heuristic for Weighted Matching Through Altruism-Inspired Behavior

    We present a novel anytime heuristic (ALMA), inspired by the human principle of altruism, for solving the assignment problem. ALMA is decentralized, completely uncoupled, and requires no communication between the participants. We prove an upper bound on the convergence speed that is polynomial in the desired number of resources and competing agents per resource; crucially, in the realistic case where the aforementioned quantities are bounded independently of the total number of agents/resources, the convergence time remains constant as the total problem size increases. We have evaluated ALMA under three test cases: (i) an anti-coordination scenario where agents with similar preferences compete over the same set of actions, (ii) a resource allocation scenario in an urban environment, under a constant-time constraint, and finally, (iii) an on-line matching scenario using real passenger-taxi data. In all of the cases, ALMA was able to reach high social welfare, while being orders of magnitude faster than the centralized, optimal algorithm. The latter allows our algorithm to scale to realistic scenarios with hundreds of thousands of agents, e.g., vehicle coordination in urban environments.

    Putting ridesharing to the test: efficient and scalable solutions and the power of dynamic vehicle relocation

    We study the optimization of large-scale, real-time ridesharing systems and propose a modular design methodology, Component Algorithms for Ridesharing (CAR). We evaluate a diverse set of CARs (14 in total), focusing on the key algorithmic components of ridesharing. We take a multi-objective approach, evaluating 10 metrics related to global efficiency, complexity, passenger, and platform incentives, in settings designed to closely resemble reality in every aspect, focusing on vehicles of capacity two. To the best of our knowledge, this is the largest and most comprehensive evaluation to date. We (i) identify CARs that perform well on global, passenger, or platform metrics, (ii) demonstrate that lightweight relocation schemes can significantly improve the Quality of Service by up to 50%, and (iii) highlight a practical, scalable, on-device CAR that works well across all metrics.

    Improving Health Information Access in the World's Largest Maternal Mobile Health Program via Bandit Algorithms

    Harnessing the widespread availability of cell phones, many nonprofits have launched mobile health (mHealth) programs to deliver information via voice or text to beneficiaries in underserved communities, with maternal and infant health being a key area of such mHealth programs. Unfortunately, dwindling listenership is a major challenge, requiring targeted interventions using limited resources. This paper focuses on Kilkari, the world's largest mHealth program for maternal and child care, with over 3 million active subscribers at a time, launched by India's Ministry of Health and Family Welfare (MoHFW) and run by the non-profit ARRMAN. We present a system called CHAHAK that aims to reduce automated dropouts as well as boost engagement with the program through the strategic allocation of interventions to beneficiaries. Past work in a similar domain has focused on a much smaller scale mHealth program and used Markovian restless multi-armed bandits to optimize a single limited intervention resource. However, this paper demonstrates the challenges in adopting a Markovian approach in Kilkari; therefore CHAHAK instead relies on non-Markovian time-series restless bandits, and optimizes multiple interventions to improve listenership. We use real Kilkari data from the Odisha state in India to show CHAHAK's effectiveness in harnessing multiple interventions to boost listenership, benefiting marginalized communities. When deployed, CHAHAK will assist the largest maternal mHealth program to date. Published at Innovative Applications of Artificial Intelligence (IAAI 2024).

    Scalable Multi-agent Coordination and Resource Sharing

    A plethora of real-world problems consist of a number of agents that interact, learn, cooperate, coordinate, and compete with others in ever more complex environments. Examples include autonomous vehicles, robotic agents, intelligent infrastructure, IoT devices, and so on. As more and more autonomous agents are deployed in the real world, the need will grow for novel algorithms, theory, and tools to enable coordination on a massive scale. In this thesis, we develop such tools to tackle two central challenges in multi-agent coordination research: solving allocation problems, and resource sharing, focusing on solutions that are scalable, practical, and applicable to real-world problems. In the first part of the thesis we tackle the problem of allocating resources to agents, i.e., solving a weighted matching problem. Real-world matching problems may occur in massively large systems, they are distributed and information-restrictive, and individuals have to reveal their preferences over the possible matches in order to get a high quality match, which brings forth significant privacy risks. As such, there are three main challenges: complexity, communication, and privacy. Our proposed approach, ALMA, is a practical heuristic designed for real-world, large-scale (10^6 agents) applications. It is based on a simple altruistic behavioral convention: agents have a higher probability to back off from contesting a resource if they have good alternatives, potentially freeing the resource for some agent that does not. ALMA tackles all of the aforementioned challenges: it is decentralized, runs on-device, requires no inter-agent communication, converges in constant time (under reasonable assumptions), and provides strong, worst-case privacy guarantees. Moreover, by incorporating learning we can mitigate the loss in social welfare and increase fairness. Finally, rational agents can use such simple conventions, along with an arbitrary signal from the environment, to learn a correlated equilibrium for accessing a set of resources, under high congestion. In the second part of the thesis we focus on a critical open problem: the question of cooperation in socio-ecological and socio-economical systems, and sustainability in the use of common-pool resources. In recent years, learning agents, especially deep reinforcement learning agents, have become ubiquitous in such systems. Yet, scaling to environments with a large number of agents and low observability continues to be a challenge. In our work, we focus on common-pool resources. Individuals face strong incentives to appropriate, which results in overuse and even the depletion of the resources. Our goal is to apply simple interventions to steer the population to desirable states. We propose a simple, yet powerful, and robust technique: allow agents to observe an arbitrary common signal from the environment. The agents learn to couple their policies, and avoid depletion in a wider range of settings, while achieving higher social welfare and convergence speed. Finally, we propose a practical approach to computing market prices and allocations via a deep reinforcement learning policymaker agent. Compared to the idealized market equilibrium outcome -- which cannot always be efficiently computed -- our policymaker is much more flexible, allowing us to tune the prices with regard to diverse objectives such as sustainability and resource wastefulness, fairness, buyers' and sellers' welfare, etc.
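The altruistic back-off convention described for ALMA in the abstract above (an agent is more likely to yield a contested resource when it has a good alternative) can be illustrated with a minimal sketch. This is not the authors' implementation. The linear back-off rule, the `eps` bound, the assumption that utilities are normalized to [0, 1], and all function names are hypothetical choices made for illustration only.

```python
import random


def loss_if_yielding(utilities, contested):
    """Utility given up by moving from the contested resource to the best alternative.

    `utilities` maps resource name -> utility, assumed normalized to [0, 1].
    With no alternative at all, the loss is the full utility of the contested resource.
    """
    alternatives = [u for r, u in utilities.items() if r != contested]
    best_alternative = max(alternatives, default=0.0)
    return max(0.0, utilities[contested] - best_alternative)


def backoff_probability(loss, eps=0.01):
    """Assumed linear rule: a small loss (good alternative) means a high chance of yielding.

    The result is clamped to [eps, 1 - eps] so no agent yields or insists with certainty.
    """
    return min(1.0 - eps, max(eps, 1.0 - loss))


def should_back_off(utilities, contested, rng=random):
    """Draw one local, communication-free decision for a single agent."""
    return rng.random() < backoff_probability(loss_if_yielding(utilities, contested))


# Two agents collide on resource "r1"; each decides using only its own utilities.
agent_with_alternative = {"r1": 0.9, "r2": 0.85}
agent_without_alternative = {"r1": 0.9, "r2": 0.1}
p_yield_a = backoff_probability(loss_if_yielding(agent_with_alternative, "r1"))
p_yield_b = backoff_probability(loss_if_yielding(agent_without_alternative, "r1"))
print(p_yield_a > p_yield_b)
```

In this sketch the agent with a near-equivalent alternative yields with much higher probability than the agent that would lose most of its utility, which is the qualitative behavior the convention describes. Each decision uses only the agent's own utilities, matching the no-communication setting.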