270 research outputs found

    A Grey-Box Approach to Automated Mechanism Design

    Auctions play an important role in electronic commerce, and have been used to solve problems in distributed computing. Automated approaches to designing effective auction mechanisms are helpful in reducing the burden of traditional game-theoretic, analytic approaches and in searching through the large space of possible auction mechanisms. This paper presents an approach to automated mechanism design (AMD) in the domain of double auctions. We describe a novel parametrized space of double auctions, and then introduce an evolutionary search method that searches this space of parameters. The approach evaluates auction mechanisms using the framework of the TAC Market Design Game and relates the performance of the markets in that game to their constituent parts using reinforcement learning. Experiments show that the strongest mechanisms we found using this approach not only win the Market Design Game against known, strong opponents, but also exhibit desirable economic properties when they run in isolation. Comment: 18 pages, 2 figures, 2 tables, and 1 algorithm. Extended abstract to appear in the proceedings of AAMAS'201

    Stable Profiles in Simulation-Based Games via Reinforcement Learning and Statistics

    In environments governed by the behavior of strategically interacting agents, game theory provides a way to predict outcomes in counterfactual scenarios, such as new market mechanisms or cybersecurity systems. Simulation-based games allow analysts to reason about settings that are too complex to model analytically with sufficient fidelity. But prior techniques for studying agent behavior in simulation-based games lack theoretical guarantees about the strategic stability of these behaviors. In this dissertation, I propose a way to measure the likelihood an agent could find a beneficial strategy deviation from a proposed behavior, using a limited number of samples from a distribution over strategies, including a theoretically proven bound. This method employs a provably conservative confidence interval estimator, along with a multiple test correction, to provide its guarantee. I show that the method can reliably find provably stable strategy profiles in an auction game, and in a cybersecurity game from prior literature. I also present a method for evaluating the stability of strategy profiles learned over a restricted set of strategies, where a strategy profile is an assignment of a strategy to each agent in a game. This method uses reinforcement learning to challenge the learned behavior as a test of its soundness. This study finds that a widely-used trading agent model, the zero-intelligence trader, can be reasonably strategically stable in continuous double auction games, but only if the strategies have their parameters calibrated for the particular game instance. In addition, I present new applications of empirical game-theoretic analysis (EGTA) to a cybersecurity setting, involving defense against attacker intrusion into a computer system. This work uses iterated deep reinforcement learning to generate more strategically stable attacker and defender strategies, relative to those found in prior work. 
It also offers empirical insights into how iterated deep reinforcement learning approaches strategic equilibrium over dozens of rounds. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/149991/1/masondw_1.pd
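
To make the idea of a sample-based stability guarantee concrete, here is a minimal sketch. It is not the dissertation's estimator; it assumes the simplest case, in which none of the sampled deviation strategies turned out to be beneficial, and uses the exact (Clopper-Pearson) one-sided binomial bound with a Bonferroni correction. The function name and parameters are hypothetical.

```python
def deviation_rate_upper_bound(num_samples, alpha, num_tests=1):
    """Upper confidence bound on the probability that a randomly drawn
    strategy is a beneficial deviation, given that 0 of `num_samples`
    sampled strategies were beneficial.

    Assumption (not from the source): strategies are drawn i.i.d. from a
    fixed distribution, and `num_tests` such bounds are reported together,
    so each one is computed at level alpha / num_tests (Bonferroni).
    """
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if num_tests <= 0:
        raise ValueError("num_tests must be positive")

    corrected_alpha = alpha / num_tests
    # With zero observed successes, the exact one-sided bound p solves
    # (1 - p) ** num_samples == corrected_alpha.
    return 1.0 - corrected_alpha ** (1.0 / num_samples)


if __name__ == "__main__":
    # One profile tested with 100 sampled deviations at the 95% level.
    print(deviation_rate_upper_bound(100, 0.05))
    # The same test when ten profiles are checked simultaneously.
    print(deviation_rate_upper_bound(100, 0.05, num_tests=10))
```

The correction makes each individual bound looser, which is the price of a guarantee that holds across all tested profiles at once.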

    Automated Auction Mechanism Design with Competing Markets

    Resource allocation is a major issue in multiple areas of computer science. Despite the wide range of resource types across these areas, for example real commodities in e-commerce and computing resources in distributed computing, auctions are commonly used in solving the optimization problems involved in these areas, since well-designed auctions achieve desirable economic outcomes. Auctions are markets with strict regulations governing the information available to traders in the market and the possible actions they can take. Auction mechanism design aims to manipulate the rules of an auction in order to achieve specific goals. Economists traditionally use mathematical methods, mainly game theory, to analyze auctions and design new auction forms. However, due to the high complexity of auctions, the mathematical models are typically simplified to obtain results, and this makes it difficult to apply results derived from such models to market environments in the real world. As a result, researchers are turning to empirical approaches. Following this line of work, we present what we call a grey-box approach to automated auction mechanism design using reinforcement learning and evolutionary computation methods. We first describe a new strategic game, called CAT, which was designed to run multiple markets that compete to attract traders and make profit. The CAT game enables us to address the imbalance between prior work in this field that studied auctions in an isolated environment and the actual competitive situation that markets face. We then define a novel, parameterized framework for auction mechanisms, and present a classification of auction rules with each as a building block fitting into the framework. Finally we evaluate the viability of building blocks, and acquire auction mechanisms by combining viable blocks through iterations of CAT games. We carried out experiments to examine the effectiveness of the grey-box approach. 
The best mechanisms we learnt were able to outperform the standard mechanisms against which learning took place and carefully hand-coded mechanisms which won tournaments based on the CAT game. These best mechanisms were also able to outperform mechanisms from the literature even when the evaluation did not take place in the context of CAT games. These results suggest that the grey-box approach can generate robust double auction mechanisms and, as a consequence, is an effective approach to automated mechanism design. The contributions of this work are two-fold. First, the grey-box approach helps to design better auction mechanisms which can play a central role in solutions to resource allocation problems in various application domains of computer science. Second, the parameterized view and the reinforcement learning-based search method can be used in other strategic, competitive situations where decision making processes are complex and difficult to design and evaluate manually

    Bounds and dynamics for empirical game theoretic analysis

    This paper provides several theoretical results for empirical game theory. Specifically, we introduce bounds for empirical game theoretical analysis of complex multi-agent interactions. In doing so we provide insights in the empirical meta game showing that a Nash equilibrium of the estimated meta-game is an approximate Nash equilibrium of the true underlying meta-game. We investigate and show how many data samples are required to obtain a close enough approximation of the underlying game. Additionally, we extend the evolutionary dynamics analysis of meta-games using heuristic payoff tables (HPTs) to asymmetric games. The state-of-the-art has only considered evolutionary dynamics of symmetric HPTs in which agents have access to the same strategy sets and the payoff structure is symmetric, implying that agents are interchangeable. Finally, we carry out an empirical illustration of the generalised method in several domains, illustrating the theory and evolutionary dynamics of several versions of the AlphaGo algorithm (symmetric), the dynamics of the Colonel Blotto game played by human players on Facebook (symmetric), the dynamics of several teams of players in the capture the flag game (symmetric), and an example of a meta-game in Leduc Poker (asymmetric), generated by the policy-space response oracle multi-agent learning algorithm
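
As a rough illustration of the sample-size question raised in this abstract, the sketch below applies a textbook Hoeffding bound with a union bound over payoff entries. It is not the paper's own bound; the function name, the parameters, and the assumption of independent, bounded payoff samples are all illustrative.

```python
import math


def samples_per_payoff_entry(epsilon, delta, payoff_range, num_entries):
    """Number of simulation samples per payoff entry so that, with
    probability at least 1 - delta, every empirical mean payoff lies
    within `epsilon` of its true value.

    Assumptions (not from the source): samples for each entry are
    independent and bounded in an interval of width `payoff_range`.
    """
    if epsilon <= 0 or payoff_range <= 0:
        raise ValueError("epsilon and payoff_range must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if num_entries <= 0:
        raise ValueError("num_entries must be positive")

    # Union bound: split the failure probability evenly across entries.
    per_entry_delta = delta / num_entries
    # Hoeffding: P(|mean - mu| >= eps) <= 2 * exp(-2 * n * eps^2 / R^2).
    n = payoff_range ** 2 * math.log(2.0 / per_entry_delta) / (2.0 * epsilon ** 2)
    return math.ceil(n)


if __name__ == "__main__":
    # A 3x3 two-player meta-game has 9 cells with 2 payoffs each.
    print(samples_per_payoff_entry(0.1, 0.05, 1.0, num_entries=18))
```

Under these assumptions, if every estimated payoff is within epsilon of the truth, a standard argument shows that a Nash equilibrium of the estimated game is a 2·epsilon-Nash equilibrium of the true game, since each player's expected payoff under any profile is off by at most epsilon.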

    Agent-based Modeling And Market Microstructure

    In most modern financial markets, traders express their preferences for assets by making orders. These orders are either executed if a counterparty is willing to match them or collected in a priority queue, called a limit order book. Such markets are said to adopt an order-driven trading mechanism. A key question in this domain is to model and analyze the strategic behavior of market participants, in response to different definitions of the trading mechanism (e.g., the priority queue changed from the continuous double auctions to the frequent call market). The objective is to design financial markets where pernicious behavior is minimized. The complex dynamics of market activities are typically studied via agent-based modeling (ABM) methods, enriched by Empirical Game-Theoretic Analysis (EGTA) to compute equilibria amongst market players and highlight the market behavior (also known as market microstructure) at equilibrium. This thesis contributes to this research area by evaluating the robustness of this approach and providing results to compare existing trading mechanisms and propose more advanced designs. In Chapter 4, we investigate the equilibrium strategy profiles, including their induced market performance, and their robustness to different simulation parameters. For two mainstream trading mechanisms, continuous double auctions (CDAs) and frequent call markets (FCMs), we find that EGTA is needed for solving the game as pure strategies are not a good approximation of the equilibrium. Moreover, EGTA gives generally sound and robust solutions regarding different market and model setups, with the notable exception of agents’ risk attitudes. 
We also consider heterogeneous EGTA, a more realistic generalization of EGTA whereby traders can modify their strategies during the simulation, and show that fixed strategies lead to sufficiently good analyses, especially taking the computation cost into consideration. After verifying the reliability of the ABM and EGTA methods, we follow this research methodology to study the impact of two widely adopted and potentially malicious trading strategies: spoofing and submission of iceberg orders. In Chapter 5, we study the effects of spoofing attacks on CDA and FCM markets. We let one spoofer (agent playing the spoofing strategy) play with other strategic agents and demonstrate that while spoofing may be profitable in both market models, it has less impact on FCMs than on CDAs. We also explore several FCM mechanism designs to help curb this type of market manipulation even further. In Chapter 6, we study the impact of iceberg orders on the price and order flow dynamics in financial markets. We find that the volume of submitted orders significantly affects the strategy choice of the other agents and the market performance. In general, when agents observe a large volume order, they tend to speculate instead of providing liquidity. In terms of market performance, both efficiency and liquidity will be harmed. We show that while playing the iceberg-order strategy can alleviate the problem caused by the high-volume orders, submitting a large enough order and attracting speculators is better than taking the risk of having fewer trades executed with iceberg orders. We conclude from Chapters 5 and 6 that FCMs have some exciting features when compared with CDAs and focus on the design of trading mechanisms in Chapter 7. We verify that CDAs constitute fertile soil for predatory behavior and toxic order flows and that FCMs address the latency arbitrage opportunities built in those markets. 
This chapter studies the extent to which adaptive rules to define the length of the clearing intervals — that might move in sync with the market fundamentals — affect the performance of frequent call markets. We show that matching orders in accordance with these rules can increase efficiency and selfish traders’ surplus in a variety of market conditions. In so doing, our work paves the way for a deeper understanding of the flexibility granted by adaptive call markets

    Offline Equilibrium Finding

    Offline reinforcement learning (Offline RL) is an emerging field that has recently begun gaining attention across various application domains due to its ability to learn behavior from earlier collected datasets. Using logged data is imperative when further interaction with the environment is expensive (computationally or otherwise), unsafe, or entirely unfeasible. Offline RL proved very successful, paving a path to solving previously intractable real-world problems, and we aim to generalize this paradigm to a multi-agent or multiplayer-game setting. Very little research has been done in this area, as the progress is hindered by the lack of standardized datasets and meaningful benchmarks. In this work, we coin the term offline equilibrium finding (OEF) to describe this area and construct multiple datasets consisting of strategies collected across a wide range of games using several established methods. We also propose a benchmark method -- an amalgamation of a behavior-cloning and a model-based algorithm. Our two model-based algorithms -- OEF-PSRO and OEF-CFR -- are adaptations of the widely-used equilibrium finding algorithms Deep CFR and PSRO in the context of offline learning. In the empirical part, we evaluate the performance of the benchmark algorithms on the constructed datasets. We hope that our efforts may help to accelerate research in large-scale equilibrium finding. Datasets and code are available at https://github.com/SecurityGames/oef

    Learning to Manipulate a Financial Benchmark

    Financial benchmarks estimate market values or reference rates used in a wide variety of contexts, but are often calculated from data generated by parties who have incentives to manipulate these benchmarks. Since the London Interbank Offered Rate (LIBOR) scandal in 2011, market participants, scholars, and regulators have scrutinized financial benchmarks and the ability of traders to manipulate them. We study the impact on market welfare of manipulating transaction-based benchmarks in a simulated market environment. Our market consists of a single benchmark manipulator with external holdings dependent on the benchmark, and numerous background traders unaffected by the benchmark. We explore two types of manipulative trading strategies: zero-intelligence strategies and strategies generated by deep reinforcement learning. Background traders use zero-intelligence trading strategies. We find that the total surplus of all market participants who are trading increases with manipulation. However, the aggregated market surplus decreases for all trading agents, and the market surplus of the manipulator decreases, so the manipulator’s surplus from the benchmark significantly increases. This entails under natural assumptions that the market and any third parties invested in the opposite side of the benchmark from the manipulator are negatively impacted by this manipulation

    Evolutionary Mechanism Design

    The advent of large-scale distributed systems poses unique engineering challenges. In open systems such as the internet it is not possible to prescribe the behaviour of all of the components of the system in advance. Rather, we attempt to design infrastructure, such as network protocols, in such a way that the overall system is robust despite the fact that numerous arbitrary, non-certified, third-party components can connect to our system. Economists have long understood this issue, since it is analogous to the design of the rules governing auctions and other marketplaces, in which we attempt to achieve socially desirable outcomes despite the impossibility of prescribing the exact behaviour of the market participants, who may attempt to subvert the market for their own personal gain. This field is known as 'mechanism design': the science of designing rules of a game to achieve a specific outcome, even though each participant may be self-interested. Although it originated in economics, mechanism design has become an important foundation of multi-agent systems (MAS) research. In many scenarios mechanism design and auction theory yield clear-cut results; however, there are many situations in which the underlying assumptions of the theory are violated due to the messiness of the real world. In this thesis I introduce an evolutionary methodology for mechanism design, which is able to incorporate arbitrary design objectives and domain assumptions, and I validate the methodology using empirical techniques

    A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

    To achieve general intelligence, agents must learn how to interact with others in a shared environment: this is the challenge of multiagent reinforcement learning (MARL). The simplest form is independent reinforcement learning (InRL), where each agent treats its experience as part of its (non-stationary) environment. In this paper, we first observe that policies learned using InRL can overfit to the other agents’ policies during training, failing to sufficiently generalize during execution. We introduce a new metric, joint-policy correlation, to quantify this effect. We describe an algorithm for general MARL, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game-theoretic analysis to compute meta-strategies for policy selection. The algorithm generalizes previous ones such as InRL, iterated best response, double oracle, and fictitious play. Then, we present a scalable implementation which reduces the memory requirement using decoupled meta-solvers. Finally, we demonstrate the generality of the resulting policies in two partially observable settings: gridworld coordination games and poker

    Automated Bidding in Computing Service Markets. Strategies, Architectures, Protocols

    This dissertation contributes to the research on Computational Mechanism Design by providing novel theoretical and software models - a novel bidding strategy called Q-Strategy, which automates bidding processes in imperfect information markets, a software framework for realizing agents and bidding strategies called BidGenerator and a communication protocol called MX/CS, for expressing and exchanging economic and technical information in a market-based scheduling system