
    A Survey on Sensor Networks from a Multiagent Perspective

    Sensor networks (SNs) have arisen as one of the most promising technologies for the coming decades. The recent emergence of small, inexpensive sensors based on microelectromechanical systems eases the development and proliferation of this kind of network in a wide range of real-world applications. Multiagent systems (MAS) have been identified as one of the most suitable technologies to contribute to the deployment of SNs that exhibit flexibility, robustness, and autonomy. The purpose of this survey is twofold. On the one hand, we review the most relevant contributions of agent technologies to this emerging application domain. On the other hand, we identify the challenges that researchers must address to establish MAS as the key enabling technology for SNs. This work has been funded by projects IEA (TIN2006-15662-C02-01), Agreement Technologies (CONSOLIDER CSD2007-0022, INGENIO 2010), EVE (TIN2009-14702-C02-01, TIN2009-14702-C02-02), and by the Generalitat de Catalunya under grant 2009-SGR-1434. Meritxell Vinyals is supported by the Spanish Ministry of Education (FPU grant AP2006-04636). Peer reviewed.

    Airborne collision avoidance in mixed equipage environments

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2013. This electronic version was submitted and approved by the author's academic department as part of an electronic thesis pilot project. The certified thesis is available in the Institute Archives and Special Collections. "June 2013." Cataloged from the department-submitted PDF version of the thesis. Includes bibliographical references (p. 93-98). Over the past few years, research has focused on the use of a computational method known as dynamic programming for producing an optimized decision logic for airborne collision avoidance. A series of technical reports, conference papers, and journal articles has summarized this research, but it has primarily investigated two-aircraft encounters with only one aircraft equipped with a collision avoidance system. This thesis examines recent research on coordination, interoperability, and multiple-threat encounters. In situations where an aircraft encounters another aircraft with a collision avoidance system, it is important that the resolution advisories provided to the pilots be coordinated so that both aircraft are not instructed to maneuver in the same direction. Interoperability is a related consideration, since new collision avoidance systems will occupy the same airspace as legacy systems. Resolving encounters with multiple intruders poses computational challenges that are also addressed in this thesis. The methodology presented here results in logic that is safer and performs better than the legacy Traffic Alert and Collision Avoidance System (TCAS). To assess the performance of the system, the thesis uses U.S. airspace encounter models. The results indicate that the proposed methodology can bring significant benefit to the current airspace and can support the need for safe, non-disruptive collision protection as the airspace continues to evolve. By Dylan M. Asmar, S.M.
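
    As a toy illustration of the coordination requirement above (my sketch in Python, not the thesis's dynamic-programming logic; all names are hypothetical), an equipped aircraft that hears the vertical sense an intruder has already committed to can simply select the complementary sense, so the two aircraft never maneuver in the same direction:

        from enum import Enum
        from typing import Optional

        class Sense(Enum):
            """Vertical sense of a resolution advisory (hypothetical names)."""
            CLIMB = 1
            DESCEND = -1

        def coordinate(own_best: Sense, intruder_sense: Optional[Sense]) -> Sense:
            """Choose own advisory sense, deferring to a committed intruder."""
            if intruder_sense is None:
                return own_best  # unequipped or silent intruder: keep own optimum
            # complementary-sense rule: never match the intruder's direction
            return Sense.DESCEND if intruder_sense is Sense.CLIMB else Sense.CLIMB

        # e.g. coordinate(Sense.CLIMB, Sense.CLIMB) -> Sense.DESCEND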

    Optimal and Approximate Q-value Functions for Decentralized POMDPs

    Decision-theoretic planning is a popular approach to sequential decision-making problems because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q* is computed recursively by dynamic programming, and an optimal policy is then extracted from Q*. In this paper we study whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), and how policies can be extracted from such value functions. We define two forms of the optimal Q-value function for Dec-POMDPs: one that gives a normative description as the Q-value function of an optimal pure joint policy, and another that is sequentially rational and thus gives a recipe for computation. This computation, however, is infeasible for all but the smallest problems. Therefore, we analyze various approximate Q-value functions that allow for efficient computation. We describe how they relate, and we prove that they all provide an upper bound on the optimal Q-value function Q*. Finally, unifying some previous approaches for solving Dec-POMDPs, we describe a family of algorithms for extracting policies from such Q-value functions, and perform an experimental evaluation on existing test problems, including a new firefighting benchmark problem.
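
    To make the single-agent baseline concrete (a minimal sketch assuming a small finite MDP with known transition tensor T[s, a, s'] and reward matrix R[s, a]; the names and shapes are my assumptions, and this is not the paper's Dec-POMDP machinery), Q* can be computed recursively by dynamic programming and an optimal policy read off greedily:

        import numpy as np

        def q_value_iteration(T, R, gamma=0.95, tol=1e-8):
            """Compute Q* for a finite MDP with known dynamics.

            T[s, a, s2] is the transition probability and R[s, a] the
            expected reward, both plain NumPy arrays (illustrative).
            """
            n_states, n_actions = R.shape
            Q = np.zeros((n_states, n_actions))
            while True:
                V = Q.max(axis=1)            # V*(s) = max_a Q*(s, a)
                Q_new = R + gamma * (T @ V)  # recursive Bellman backup
                if np.max(np.abs(Q_new - Q)) < tol:
                    return Q_new
                Q = Q_new

        def extract_policy(Q):
            """An optimal policy acts greedily with respect to Q*."""
            return Q.argmax(axis=1)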

    Solving Common-Payoff Games with Approximate Policy Iteration

    For artificially intelligent learning systems to have widespread applicability in real-world settings, it is important that they be able to operate decentrally. Unfortunately, decentralized control is difficult: computing even an epsilon-optimal joint policy is a NEXP-complete problem. Nevertheless, a recently rediscovered insight -- that a team of agents can coordinate via common knowledge -- has given rise to algorithms capable of finding optimal joint policies in small common-payoff games. The Bayesian action decoder (BAD) leverages this insight and deep reinforcement learning to scale to games as large as two-player Hanabi. However, the approximations it uses to do so prevent it from discovering optimal joint policies even in games small enough to brute-force optimal solutions. This work proposes CAPI, a novel algorithm which, like BAD, combines common knowledge with deep reinforcement learning. However, unlike BAD, CAPI prioritizes the propensity to discover optimal joint policies over scalability. While this choice precludes CAPI from scaling to games as large as Hanabi, empirical results demonstrate that, on the games to which CAPI does scale, it is capable of discovering optimal joint policies even when other modern multi-agent reinforcement learning algorithms are unable to do so. Code is available at https://github.com/ssokota/capi. Comment: AAAI 2021.
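
    A minimal sketch of the common-knowledge insight (my illustration only; BAD and CAPI are substantially more sophisticated, and every name below is hypothetical): if each agent applies the same deterministic selection rule to inputs the whole team commonly observes, all agents independently arrive at the same joint policy and thus coordinate without communicating:

        def common_knowledge_plan(public_history, candidate_joint_policies, value):
            """Deterministically select a joint policy from common knowledge.

            Every agent evaluates the same candidates on the same commonly
            observed history; max() breaks ties by list order, which is also
            common knowledge, so all agents pick the identical joint policy.
            """
            return max(candidate_joint_policies,
                       key=lambda pi: value(pi, public_history))

        # Each agent then plays only its own component of the agreed policy:
        # action_i = chosen_joint_policy[i](private_observation_i)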