
    Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games

    With the recent advances in solving large, zero-sum extensive-form games, there is growing interest in the inverse problem of inferring underlying game parameters given only access to agent actions. Although recent work provides a powerful differentiable end-to-end learning framework that embeds a game solver within a deep-learning framework, allowing unknown game parameters to be learned via backpropagation, this framework faces significant limitations when applied to boundedly rational human agents and large-scale problems, which limits its practicality. In this paper, we address these limitations and propose a framework that is applicable to more practical settings. First, seeking to learn the rationality of human agents in complex two-player zero-sum games, we draw upon well-known ideas in decision theory to obtain a concise and interpretable agent behavior model, and derive solvers and gradients for end-to-end learning. Second, to scale up to large, real-world scenarios, we propose an efficient first-order primal-dual method that exploits the structure of extensive-form games, yielding significantly faster computation for both game solving and gradient computation. When tested on randomly generated games, we report speedups of orders of magnitude over previous approaches. We also demonstrate the effectiveness of our model on both real-world one-player settings and synthetic data.

    Considering Human Aspects on Strategies for Designing and Managing Distributed Human Computation

    A human computation system can be viewed as a distributed system in which the processors are humans, called workers. Such systems harness the cognitive power of a group of workers connected to the Internet to execute relatively simple tasks, whose solutions, once grouped, solve a problem that systems equipped with only machines could not solve satisfactorily. Examples of such systems are Amazon Mechanical Turk and the Zooniverse platform. A human computation application comprises a group of tasks, each of which can be performed by one worker. Tasks might have dependencies among each other. In this study, we propose a theoretical framework to analyze this type of application from a distributed systems point of view. Our framework is established on three dimensions that represent different perspectives from which human computation applications can be approached: quality-of-service requirements, design and management strategies, and human aspects. Using this framework, we review human computation from the perspective of programmers seeking to improve the design of human computation applications and managers seeking to increase the effectiveness of human computation infrastructures in running such applications. In doing so, besides integrating and organizing what has been done in this direction, we also put into perspective the fact that the human aspects of the workers in such systems introduce new challenges in terms of, for example, task assignment, dependency management, and fault prevention and tolerance. We discuss how they are related to distributed systems and other areas of knowledge. Comment: 3 figures, 1 table

    Enhancing automated red teaming with Monte Carlo Tree Search

    This study investigated novel Automated Red Teaming methods that support replanning. Traditional Automated Red Teaming (ART) approaches usually use evolutionary computing methods to evolve plans using simulations. A drawback of these methods is the inability to change a team's strategy partway through a simulation. This study focussed on a Monte-Carlo Tree Search (MCTS) method in an ART environment that supports replanning, leading to better strategy decisions and a higher average score.

    Using Double Oracle Algorithm for Classification of Adversarial Actions

    This thesis examines the usability of the Double Oracle (DO) algorithm for finding a Nash equilibrium in infinite games. In particular, it focuses on finding a robust solution for the classification of adversarial actions. First, we formalize the adversarial classification problem as an almost zero-sum game with a hard false-positive constraint in expectation. For this representation, we present an algorithm that gives the exact value of the game. Double Oracle applied to this game consists of three parts: a slightly modified LP for solving the restricted game, a general optimization routine for finding the attacker's best response, and a classifier that approximates the defender's best response. We have created a framework for using DO for the classification of adversarial actions and evaluated it on predefined domains with various structures and various numbers of dimensions. The experiments were performed with three classifier types: decision trees, SVMs, and neural networks. The experimental results show that the algorithm converges, but its computation time grows rapidly with the number of dimensions.
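The three-part loop described in this abstract (solve a restricted game, query a best-response oracle for each side, grow the restricted strategy sets) can be illustrated on a much simpler problem. The following is a minimal sketch under stated assumptions, not the thesis's implementation: it uses a small finite zero-sum matrix game instead of an infinite game, exhaustive search as both best-response oracles instead of a general optimizer and a classifier, and fictitious play instead of an LP to approximate the restricted-game solution so that the example has no dependencies. The names `solve_restricted` and `double_oracle` are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # payoff[r][c] is the row player's gain (row maximizes)


def solve_restricted(payoff: Matrix, rows: List[int], cols: List[int],
                     iters: int = 5000) -> Tuple[List[float], List[float]]:
    """Approximate an equilibrium of the restricted game by fictitious play.

    Assumption: this stands in for the LP mentioned in the abstract.
    Returns mixed strategies aligned with `rows` and `cols`.
    """
    row_counts = [0] * len(rows)
    col_counts = [0] * len(cols)
    row_counts[0] = 1
    col_counts[0] = 1
    for _ in range(iters):
        # Each player best-responds to the opponent's empirical play so far.
        row_vals = [sum(payoff[r][c] * col_counts[j] for j, c in enumerate(cols))
                    for r in rows]
        col_vals = [sum(payoff[r][c] * row_counts[i] for i, r in enumerate(rows))
                    for c in cols]
        row_counts[row_vals.index(max(row_vals))] += 1
        col_counts[col_vals.index(min(col_vals))] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([x / row_total for x in row_counts],
            [x / col_total for x in col_counts])


def double_oracle(payoff: Matrix, tol: float = 0.05
                  ) -> Tuple[float, List[int], List[int]]:
    """Return (approximate game value, row strategies used, column strategies used)."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    rows, cols = [0], [0]  # start from one arbitrary strategy per player
    while True:
        row_mix, col_mix = solve_restricted(payoff, rows, cols)
        # Oracles: best pure responses over the FULL strategy sets.
        row_payoffs = [sum(payoff[r][c] * q for c, q in zip(cols, col_mix))
                       for r in range(n_rows)]
        col_payoffs = [sum(payoff[r][c] * p for r, p in zip(rows, row_mix))
                       for c in range(n_cols)]
        best_row = max(range(n_rows), key=row_payoffs.__getitem__)
        best_col = min(range(n_cols), key=col_payoffs.__getitem__)
        upper = row_payoffs[best_row]  # upper bound on the game value
        lower = col_payoffs[best_col]  # lower bound on the game value
        if upper - lower <= tol:
            return (lower + upper) / 2, rows, cols
        added = False
        if best_row not in rows:
            rows.append(best_row)
            added = True
        if best_col not in cols:
            cols.append(best_col)
            added = True
        if not added:
            # Nothing new to add: the remaining gap is solver approximation error.
            return (lower + upper) / 2, rows, cols


if __name__ == "__main__":
    game = [[3, 1, 4], [2, 0, 1], [5, 2, 6]]
    value, used_rows, used_cols = double_oracle(game)
    print(value, used_rows, used_cols)
```

In this toy game the loop can stop without ever adding some strategies to the restricted sets, which is the practical appeal of the approach when the full strategy space is large or, as in the thesis, infinite.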