Solving Games with Functional Regret Estimation
We propose a novel online learning method for minimizing regret in large
extensive-form games. The approach learns a function approximator online to
estimate the regret for choosing a particular action. A no-regret algorithm
uses these estimates in place of the true regrets to define a sequence of
policies.
We prove the approach sound by providing a bound relating the quality of the
function approximation to the regret of the algorithm. A corollary is that the
method is guaranteed to converge to a Nash equilibrium in self-play so long as
the regrets are ultimately realizable by the function approximator. Our
technique can be understood as a principled generalization of existing work on
abstraction in large games; in our work, both the abstraction as well as the
equilibrium are learned during self-play. We demonstrate empirically that the
method achieves higher-quality strategies than state-of-the-art abstraction
techniques given the same resources.
Comment: AAAI Conference on Artificial Intelligence 201
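The policy-update step the abstract describes can be pictured with regret matching run on estimated rather than true regrets. This is a minimal sketch, not the paper's implementation: `estimated_regrets` stands in for the output of the learned function approximator at one decision point, and the values used below are invented for illustration.

```python
import numpy as np

def regret_matching_policy(estimated_regrets):
    """Turn a vector of (estimated) regrets into a policy via regret matching:
    play each action in proportion to its positive regret."""
    positive = np.maximum(estimated_regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No action has positive regret: fall back to uniform play.
    return np.full(len(estimated_regrets), 1.0 / len(estimated_regrets))

# Hypothetical regret estimates for three actions at one information set.
policy = regret_matching_policy(np.array([2.0, -1.0, 1.0]))
```

Substituting estimates for true regrets is exactly where the paper's bound enters: the closer the approximator's outputs are to the realized regrets, the closer this sequence of policies is to the no-regret guarantee.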
Risk-aware navigation for UAV digital data collection
This thesis studies the navigation task for autonomous UAVs collecting digital data in a risky environment. Three problem formulations are proposed according to different real-world situations. First, we focus on uniform probabilistic risk and assume the UAV has an unlimited amount of energy. With these assumptions, we provide the graph-based Data-collecting Robot Problem (DRP) model, and propose heuristic planning solutions that consist of a clustering step and a tour-building step. Experiments show our methods provide high-quality solutions with high expected reward. Second, we investigate non-uniform probabilistic risk and a limited energy capacity for the UAV. We present the Data-collection Problem (DCP) to model the task. DCP is a grid-based Markov decision process, and we utilize reinforcement learning with a deep Ensemble Navigation Network (ENN) to tackle the problem. Given four simple navigation algorithms and some additional heuristic information, ENN is able to find improved solutions. Finally, we consider risk in the form of an opponent together with a limited energy capacity for the UAV, for which we resort to the Data-collection Game (DCG) model. DCG is a grid-based two-player stochastic game where the opponent may have different strategies. We propose opponent modeling to improve data-collection efficiency, design four deep neural networks that model the opponent's behavior at different levels, and empirically show that explicit opponent modeling with a dedicated network provides superior performance.
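As a rough picture of the tour-building step in the DRP heuristic (the clustering step and the probabilistic-risk model are omitted), a greedy nearest-neighbor tour over data sites might look like the following. The coordinates and function name are illustrative assumptions, not taken from the thesis.

```python
import math

def nearest_neighbor_tour(start, sites):
    """Tour-building sketch: from the current position, repeatedly fly to
    the closest unvisited data site."""
    tour, current, remaining = [start], start, list(sites)
    while remaining:
        nxt = min(remaining, key=lambda s: math.dist(current, s))
        remaining.remove(nxt)
        tour.append(nxt)
        current = nxt
    return tour

# Invented site coordinates; the UAV starts at the origin.
tour = nearest_neighbor_tour((0, 0), [(5, 5), (1, 0), (2, 2)])
```

A full DRP solver would also weigh each leg's survival probability and expected reward, not just distance; this sketch only conveys the tour-construction skeleton.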
Autonomous Agents Modelling Other Agents: A Comprehensive Survey and Open Problems
Much research in artificial intelligence is concerned with the development of
autonomous agents that can interact effectively with other agents. An important
aspect of such agents is the ability to reason about the behaviours of other
agents, by constructing models which make predictions about various properties
of interest (such as actions, goals, beliefs) of the modelled agents. A variety
of modelling approaches now exist which vary widely in their methodology and
underlying assumptions, catering to the needs of the different sub-communities
within which they were developed and reflecting the different practical uses
for which they are intended. The purpose of the present article is to provide a
comprehensive survey of the salient modelling methods which can be found in the
literature. The article concludes with a discussion of open problems which may
form the basis for fruitful future research.
Comment: Final manuscript (46 pages), published in the Artificial Intelligence
Journal. The arXiv version also contains a table of contents after the
abstract, but is otherwise identical to the AIJ version. Keywords: autonomous
agents, multiagent systems, modelling other agents, opponent modelling
Cooperation in Games
University of Minnesota Ph.D. dissertation. 2019. Major: Computer Science. Advisor: Maria Gini. 1 computer file (PDF); 159 pages. This dissertation explores several problems related to social behavior, a complex and difficult domain. We describe ways to solve problems for agents interacting with opponents, specifically (1) identifying cooperative strategies, (2) acting on fallible predictions, and (3) determining how much to compromise with the opponent. In a multi-agent environment an agent's interactions with its opponent can significantly affect its performance. However, it is not always possible for the agent to fully model the behavior of the opponent and compute a best response. We present three algorithms for agents to use when interacting with an opponent too complex to be modelled. An agent which wishes to cooperate with its opponent must first identify what strategy constitutes a cooperative action. We address the problem of identifying cooperative strategies in repeated randomly generated games by modelling an agent's intentions with a real number, its attitude, which is used to produce a modified game; the Nash equilibria of the modified game implement the strategies described by the intentions used to generate it. We demonstrate how these values can be learned, and show how they can be used to achieve cooperation through reciprocation in repeated randomly generated normal-form games. Next, an agent which has formed a prediction of opponent behavior which may be incorrect needs to be able to take advantage of that prediction without adopting a strategy which is overly vulnerable to exploitation. We have developed Restricted Stackelberg Response with Safety (RSRS), an algorithm which produces a strategy that responds to a prediction while balancing performance against the prediction, worst-case performance, and performance against a best-responding opponent. By balancing those concerns appropriately, the agent can perform well against an opponent it cannot reliably predict. Finally, we look at how an agent can manipulate an opponent into choosing actions which benefit the agent. This problem is often complicated by the difficulty of analyzing the game the agent is playing. To address this issue, we begin by developing a new game, the Gift Exchange game, which is trivial to analyze; the only question is how the opponent will react. We develop a variety of strategies the agent can use when playing the game, and explore how the best strategy is affected by the agent's discount factor and prior over opponents.
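One way to picture the attitude mechanism described above: a common formulation (assumed here as an illustration, not quoted from the dissertation) adds each agent's attitude, a real number, times the opponent's payoff to its own, yielding the modified game whose Nash equilibria encode the intended level of cooperation.

```python
import numpy as np

def attitude_modified_payoffs(u1, u2, a1, a2):
    """Build the modified game: each agent's payoff is its own payoff plus
    its attitude times the opponent's payoff."""
    return u1 + a1 * u2, u2 + a2 * u1

# A prisoner's-dilemma payoff matrix (rows: cooperate/defect for player 1,
# columns: cooperate/defect for player 2). Values are illustrative.
u1 = np.array([[3, 0], [5, 1]])
u2 = np.array([[3, 5], [0, 1]])

# With fully cooperative attitudes (a = 1), each agent maximizes joint welfare.
m1, m2 = attitude_modified_payoffs(u1, u2, 1.0, 1.0)
# In the modified game, mutual cooperation (payoff 6 each) is a Nash
# equilibrium, since deviating to defection yields only 5.
```

Learning the attitude values, as the abstract describes, then amounts to adjusting `a1` and `a2` through repeated play so that reciprocated cooperation emerges.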