Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games
With the recent advances in solving large, zero-sum extensive form games,
there is a growing interest in the inverse problem of inferring underlying game
parameters given only access to agent actions. Although recent work provides
a powerful differentiable end-to-end learning framework which embeds a game
solver within a deep-learning framework, allowing unknown game parameters to be
learned via backpropagation, this framework faces significant limitations when
applied to boundedly rational human agents and large-scale problems, leading to
poor practicality. In this paper, we address these limitations and propose a
framework that is applicable to more practical settings. First, seeking to
learn the rationality of human agents in complex two-player zero-sum games, we
draw upon well-known ideas in decision theory to obtain a concise and
interpretable agent behavior model, and derive solvers and gradients for
end-to-end learning. Second, to scale up to large, real-world scenarios, we
propose an efficient first-order primal-dual method which exploits the
structure of extensive-form games, yielding significantly faster computation
for both game solving and gradient computation. When tested on randomly
generated games, we report speedups of orders of magnitude over previous
approaches. We also demonstrate the effectiveness of our model on both
real-world one-player settings and synthetic data.
Considering Human Aspects on Strategies for Designing and Managing Distributed Human Computation
A human computation system can be viewed as a distributed system in which the
processors are humans, called workers. Such systems harness the cognitive power
of a group of workers connected to the Internet to execute relatively simple
tasks, whose solutions, once grouped, solve a problem that systems equipped
with only machines could not solve satisfactorily. Examples of such systems are
Amazon Mechanical Turk and the Zooniverse platform. A human computation
application comprises a group of tasks, each of which can be performed by one
worker. Tasks might have dependencies among each other. In this study, we
propose a theoretical framework to analyze this type of application from a
distributed systems point of view. Our framework is established on three
dimensions that represent different perspectives in which human computation
applications can be approached: quality-of-service requirements, design and
management strategies, and human aspects. By using this framework, we review
human computation from the perspective of programmers seeking to improve the
design of human computation applications and managers seeking to increase the
effectiveness of human computation infrastructures in running such
applications. In doing so, besides integrating and organizing what has been
done in this direction, we also put into perspective the fact that the human
aspects of the workers in such systems introduce new challenges in terms of,
for example, task assignment, dependency management, and fault prevention and
tolerance. We discuss how they are related to distributed systems and other
areas of knowledge.
Comment: 3 figures, 1 table
Enhancing automated red teaming with Monte Carlo Tree Search
This study investigated novel Automated Red Teaming methods that support replanning. Traditional Automated Red Teaming (ART) approaches usually use evolutionary computing methods to evolve plans through simulations. A drawback of this approach is the inability to change a team’s strategy partway through a simulation. This study focussed on a Monte Carlo Tree Search (MCTS) method in an ART environment that supports replanning, leading to better strategy decisions and a higher average score.
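To make the replanning idea concrete, here is a minimal sketch of MCTS with UCB1 selection. It is not the study's ART environment, which the abstract does not describe; the game of Nim (take 1 to 3 stones, whoever takes the last stone wins) stands in for the simulation, and every name (`Node`, `mcts`, `play_with_replanning`) is hypothetical.

```python
import math
import random


class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left for the player about to move
        self.parent = parent
        self.move = move          # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node


def ucb1(child, parent_visits, c=1.4):
    # Exploitation term plus exploration bonus.
    return (child.wins / child.visits
            + c * math.sqrt(math.log(parent_visits) / child.visits))


def rollout(stones, rng):
    """Random playout; True if the player to move at `stones` wins."""
    turn = 0
    while stones > 0:
        stones -= rng.choice([m for m in (1, 2, 3) if m <= stones])
        if stones == 0:
            return turn == 0
        turn ^= 1
    return False  # no stones left: the previous player already won


def mcts(root_stones, iterations=3000, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one unexplored child, if any.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: the mover into `node` wins iff the player to move loses.
        mover_wins = not rollout(node.stones, rng)
        # 4. Backpropagation: flip the perspective at every level.
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if mover_wins else 0.0
            mover_wins = not mover_wins
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move


def play_with_replanning(stones, iterations=1000):
    """Replanning: re-run the search from the current state before each move."""
    moves = []
    while stones > 0:
        move = mcts(stones, iterations=iterations)
        moves.append(move)
        stones -= move
    return moves
```

Calling `mcts` again from each new state, as `play_with_replanning` does, is one simple way to let the strategy change partway through a run, in contrast to a plan that is evolved once and then fixed.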
Using Double Oracle Algorithm for Classification of Adversarial Actions
This thesis examines the usability of the Double Oracle algorithm for finding a Nash equilibrium in infinite games. In particular, it focuses on finding a robust solution for the classification of adversarial actions. First, we formalized the adversarial classification problem as an almost zero-sum game with a hard false-positive constraint in expectation. For this representation, we found an algorithm that gives the exact value of the game. Double Oracle applied to this game consists of three parts: a slightly modified LP for solving the restricted game, a general optimization routine for finding the attacker's best response, and a classifier that approximates the defender's best response.
We created a framework for using DO to classify adversarial actions and evaluated it on predefined domains with various structures and numbers of dimensions. The experiments were performed with three classifier types: decision tree, SVM, and neural network. The experimental results showed that the algorithm converges, but the computation time grows quickly with the number of dimensions.
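The Double Oracle loop described above can be sketched on a small finite matrix game. This is an illustration under simplifying assumptions, not the thesis's framework. Fictitious play stands in for the modified LP, exhaustive enumeration stands in for both the attacker's optimizer and the defender's classifier, and there is no false-positive constraint. All function names are hypothetical.

```python
def row_value(payoff, i, y):
    """Payoff to the row player for pure row i against column weights y."""
    return sum(payoff[i][j] * p for j, p in y.items())


def col_value(payoff, j, x):
    """Payoff to the row player when column j is played against row weights x."""
    return sum(payoff[i][j] * p for i, p in x.items())


def solve_restricted(payoff, rows, cols, iters=2000):
    """Approximately solve the restricted zero-sum game by fictitious play
    (a stand-in for the LP mentioned in the abstract)."""
    row_counts = dict.fromkeys(rows, 0)
    col_counts = dict.fromkeys(cols, 0)
    r, c = rows[0], cols[0]
    for _ in range(iters):
        row_counts[r] += 1
        col_counts[c] += 1
        # Each player best-responds to the opponent's empirical play so far.
        r = max(rows, key=lambda i: row_value(payoff, i, col_counts))
        c = min(cols, key=lambda j: col_value(payoff, j, row_counts))
    x = {i: n / iters for i, n in row_counts.items()}
    y = {j: n / iters for j, n in col_counts.items()}
    return x, y


def double_oracle(payoff):
    n_rows, n_cols = len(payoff), len(payoff[0])
    rows, cols = [0], [0]  # start from one arbitrary strategy per player
    while True:
        x, y = solve_restricted(payoff, rows, cols)
        # Oracles: best pure responses over the FULL strategy sets.
        br_row = max(range(n_rows), key=lambda i: row_value(payoff, i, y))
        br_col = min(range(n_cols), key=lambda j: col_value(payoff, j, x))
        upper = row_value(payoff, br_row, y)  # upper bound on the game value
        lower = col_value(payoff, br_col, x)  # lower bound on the game value
        grew = False
        if br_row not in rows:
            rows.append(br_row)
            grew = True
        if br_col not in cols:
            cols.append(br_col)
            grew = True
        if not grew:  # neither oracle found a new strategy: stop
            return x, y, lower, upper


# Rock-paper-scissors, payoffs to the row player; the game value is 0.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]
x, y, lower, upper = double_oracle(RPS)
```

The restricted game grows only when an oracle finds a best response outside the current strategy sets. Because the restricted solver here is approximate, the returned `lower` and `upper` only bracket the game value and do not coincide exactly.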