Games on graphs with a public signal monitoring
We study pure Nash equilibria in games on graphs with imperfect monitoring
based on a public signal. In such games, deviations and the players responsible for
those deviations can be hard to detect and track. We propose a generic
epistemic game abstraction, which conveniently represents the
knowledge of the players about these deviations, and give a characterization of
Nash equilibria in terms of winning strategies in the abstraction. We then use
the abstraction to develop algorithms for some payoff functions.
Comment: 28 pages
Reinforcement learning for trading dialogue agents in non-cooperative negotiations
Recent advances in automating Dialogue Management have been mainly made in
cooperative environments, where the dialogue system tries to help a human to meet
their goals. In non-cooperative environments though, such as competitive trading,
there is still much work to be done. The complexity of such an environment rises
as there is usually imperfect information about the interlocutors’ goals and states.
The thesis shows that non-cooperative dialogue agents are capable of learning how
to successfully negotiate in a variety of trading-game settings, using Reinforcement
Learning, and results are presented from testing the trained dialogue policies with
humans. The agents learned when and how to manipulate using dialogue, how to
judge the decisions of their rivals, how much information they should expose, as well
as how to effectively map their adversaries' needs in order to predict and exploit their
actions. Initially the environment was a two-player trading game (“Taikun”). The
agent learned how to use explicit linguistic manipulation, even with risks of exposure
(detection) where severe penalties apply. A more complex opponent model for
adversaries was also implemented, where we modelled all trading dialogue moves as
implicitly manipulating the adversary’s opponent model, and we worked in a more
complex game (“Catan”). In that multi-agent environment we show that agents
can learn to be legitimately persuasive or deceitful. Agents which learned how to
manipulate opponents using dialogue are more successful than ones which do not
manipulate. We also demonstrate that trading dialogues are more successful when
the learning agent builds an estimate of the adversarial hidden goals and preferences.
Furthermore the thesis shows that policies trained in bilateral negotiations can be
very effective in multilateral ones (i.e. the 4-player version of Catan). The findings
suggest that it is possible to train non-cooperative dialogue agents which successfully
trade using linguistic manipulation. Such non-cooperative agents may have
important future applications, such as in automated debating, police investigation,
games, and education.
Spectrum auctions: designing markets to benefit the public, industry and the economy
Access to the radio spectrum is vital for modern digital communication. It is an essential component for smartphone capabilities, the Cloud, the Internet of Things, autonomous vehicles, and multiple other new technologies. Governments use spectrum auctions to decide which companies should use what parts of the radio spectrum. Successful auctions can fuel rapid innovation in products and services, unlock substantial economic benefits, build comparative advantage across all regions, and create billions of dollars of government revenues. Poor auction strategies can leave bandwidth unsold and delay innovation, sell national assets to firms too cheaply, or create uncompetitive markets with high mobile prices and patchy coverage that stifles economic growth. Corporate bidders regularly complain that auctions raise their costs, while government critics argue that insufficient revenues are raised. The cross-national record shows many examples of both highly successful auctions and miserable failures. Drawing on experience from the UK and other countries, senior regulator Geoffrey Myers explains how to optimise the regulatory design of auctions, from initial planning to final implementation. Spectrum Auctions offers unrivalled expertise for regulators and economists engaged in practical auction design and for company executives planning bidding strategies. For applied economists, teachers, and advanced students, this book provides unrivalled insights into market design and public management. Providing clear analytical frameworks, case studies of auctions, and stage-by-stage advice, it is essential reading for anyone interested in designing successful spectrum auctions that serve the public interest.
Games and Strategies as Event Structures.
In 2011, Rideau and Winskel introduced concurrent games and strategies as
event structures, generalizing prior work on causal formulations of games. In
this paper we give a detailed, self-contained and slightly updated account of
the results of Rideau and Winskel: a notion of pre-strategy based on event
structures; a characterisation of those pre-strategies (deemed strategies)
which are preserved by composition with a copycat strategy; and the
construction of a bicategory of these strategies. Furthermore, we prove
that the corresponding category has a compact closed structure, and hence
forms the basis for the semantics of concurrent higher-order computation.
Adversarial Policies Beat Superhuman Go AIs
We attack the state-of-the-art Go-playing AI system KataGo by training
adversarial policies against it, achieving a >97% win rate against KataGo
running at superhuman settings. Our adversaries do not win by playing Go well.
Instead, they trick KataGo into making serious blunders. Our attack transfers
zero-shot to other superhuman Go-playing AIs, and is comprehensible to the
extent that human experts can implement it without algorithmic assistance to
consistently beat superhuman AIs. The core vulnerability uncovered by our
attack persists even in KataGo agents adversarially trained to defend against
our attack. Our results demonstrate that even superhuman AI systems may harbor
surprising failure modes. Example games are available at https://goattack.far.ai/.
Comment: Accepted to ICML 2023; see paper for changelog
Machine learning applied to the context of Poker
The combination of game theory principles and machine learning methodologies applied to formulating optimal strategies for games is attracting interest from an increasingly large portion of the scientific community, with the game of Poker being a popular subject of study due to its imperfect-information nature. Advances in this area have a wide array of applications in real-world scenarios, and the field of artificial intelligence research shows that interest in this subject is far from fading: in 2019, researchers from Facebook and Carnegie Mellon presented the first autonomous Poker-playing agent proven to be profitable against multiple players at a time, an advance over the previous state of the art, which was developed for two-player games only. This study intends to explore the characteristics of stochastic games of imperfect information, gathering information on the methodological advances made available by researchers in order to ultimately develop an autonomous agent intended to fit the classification of a "utility-maximizing decision-maker".
Game theoretic modeling and analysis : A co-evolutionary, agent-based approach
Ph.D., Doctor of Philosophy
A Temporal Framework for Hypergame Analysis of Cyber Physical Systems in Contested Environments
Game theory is used to model conflicts between two or more players over resources. It offers players a way to reason, providing a rationale for selecting strategies that avoid the worst outcome. Game theory lacks the ability to incorporate advantages one player may have over another player. A meta-game, known as a hypergame, occurs when one player does not know or fully understand all the strategies of a game. Hypergame theory builds upon the utility of game theory by allowing a player to outmaneuver an opponent, thus obtaining a more preferred outcome with higher utility. Recent work in hypergame theory has focused on normal-form static games that lack the ability to encode several realistic strategies. One example of this is when a player's available actions in the future are dependent on his selection in the past. This work presents a temporal framework for hypergame models. This framework is the first application of temporal logic to hypergames and provides more flexible modeling for domain experts. With this new framework for hypergames, the concepts of trust, distrust, mistrust, and deception are formalized. While past literature references deception in hypergame research, this work is the first to formalize the definition for hypergames. As a demonstration of the new temporal framework for hypergames, it is applied to classical game-theoretical examples, as well as to a complex supervisory control and data acquisition (SCADA) network temporal hypergame. The SCADA network example includes actions that have a temporal dependency, where a choice in the first round affects what decisions can be made in a later round of the game. The demonstration results show that the framework is a realistic and flexible modeling method for a variety of applications.