707 research outputs found

    TOKEN-BASED APPROACH FOR SCALABLE TEAMCOORDINATION

    Get PDF
    To form a cooperative multiagent team, autonomous agents are required to harmonize activities and make the best use of exclusive resources to achieve their common goal. In addition, to handle uncertainty and quickly respond to external environmental events, they should share knowledge and sensor in formation. Unlike small team coordination, agents in scalable team must limit the amount of their communications while maximizing team performance. Communication decisions are critical to scalable-team coordination because agents should target their communications, but these decisions cannot be supported by a precise model or by complete team knowledge.The hypothesis of my thesis is: local routing of tokens encapsulating discrete elements of control, based only on decentralized local probability decision models, will lead to efficient scalable coordination with several hundreds of agents. In my research, coordination controls including all domain knowledge, tasks and exclusive resources are encapsulated into tokens. By passing tokens around, agents transfer team controls encapsulated in the tokens. The team benefits when a token is passed to an agent who can make use of it, but communications incur costs. Hence, no single agent has sole responsible over any shared decision. The key problem lies in how agents make the correct decisions to target communications and pass tokens so that they will potentially benefit the team most when considering communication costs.My research on token-based coordination algorithm starts from the investigation of random walk of token movement. I found a little increase of the probabilities that agents make the right decision to pass a token, the overall efficiency of the token movement could be greatly enhanced. Moreover, if token movements are modeled as a Markov chain, I found that the efficiency of passing tokens could be significantly varied based on different network topologies.My token-based algorithm starts at the investigation of each single decision theoretic agents. Although under the uncertainties that exist in large multiagent teams, agents cannot act optimal, it is still feasible to build a probability model for each agents to rationally pass tokens. Specifically, this decision only allow agent to pass tokens over an associate network where only a few of team members are considered as token receiver.My proposed algorithm will build each agent's individual decision model based on all of its previously received tokens. This model will not require the complete knowledge of the team. The key idea is that I will make use of the domain relationships between pairs of coordination controls. Previously received tokens will help the receiver to infer whether the sender could benefit the team if a related token is received. Therefore, each token is used to improve the routing of other tokens, leading to a dramatic performance improvement when more tokens are added. By exploring the relationships between different types of coordination controls, an integrated coordination algorithm will be built, and an improvement of one aspect of coordination will enhance the performance of the others

    A Distributed Algorithm for Solving a Class of Multi-agent Markov Decision Problems

    Get PDF
    We consider a class of infinite horizon Markov decision processes (MDPs) with multiple decision makers, called agents,and a general joint reward structure, but a special decomposable state/action structure such that each individual agent's actions affect the system's state transitions independently from the actions of all other agents. We introduce the concept of ``localization," where each agent need only consider a ``local" MDP defined on its own state and action spaces. Based on this localization concept, we propose an iterative distributed algorithm that emulates gradient ascent and which converges to a locally optimal solution for the average reward case. The solution is an ``autonomous" joint policy such that each agent's action is based on only its local state. Finally, we discuss the implication of the localization concept for discounted reward problems

    EMVLight: a Multi-agent Reinforcement Learning Framework for an Emergency Vehicle Decentralized Routing and Traffic Signal Control System

    Full text link
    Emergency vehicles (EMVs) play a crucial role in responding to time-critical calls such as medical emergencies and fire outbreaks in urban areas. Existing methods for EMV dispatch typically optimize routes based on historical traffic-flow data and design traffic signal pre-emption accordingly; however, we still lack a systematic methodology to address the coupling between EMV routing and traffic signal control. In this paper, we propose EMVLight, a decentralized reinforcement learning (RL) framework for joint dynamic EMV routing and traffic signal pre-emption. We adopt the multi-agent advantage actor-critic method with policy sharing and spatial discounted factor. This framework addresses the coupling between EMV navigation and traffic signal control via an innovative design of multi-class RL agents and a novel pressure-based reward function. The proposed methodology enables EMVLight to learn network-level cooperative traffic signal phasing strategies that not only reduce EMV travel time but also shortens the travel time of non-EMVs. Simulation-based experiments indicate that EMVLight enables up to a 42.6%42.6\% reduction in EMV travel time as well as an 23.5%23.5\% shorter average travel time compared with existing approaches.Comment: 19 figures, 10 tables. Manuscript extended on previous work arXiv:2109.05429, arXiv:2111.0027
    • …
    corecore