    Improving Search with Supervised Learning in Trick-Based Card Games

    In trick-taking card games, a two-step process of state sampling and evaluation is widely used to approximate move values. While the evaluation component is vital, the accuracy of move value estimates is also fundamentally linked to how well the sampling distribution corresponds to the true distribution. Despite this, recent work in trick-taking card game AI has mainly focused on improving evaluation algorithms, with limited work on improving sampling. In this paper, we focus on the effect of sampling on the strength of a player and propose a novel method of sampling more realistic states given the move history. In particular, we use predictions about the locations of individual cards, made by a deep neural network trained on data from human gameplay, to sample likely worlds for evaluation. This technique, used in conjunction with Perfect Information Monte Carlo (PIMC) search, provides a substantial increase in cardplay strength in the popular trick-taking card game of Skat.
    Comment: Accepted for publication at AAAI-1
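    The sampling step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes a predictor that supplies, for each unseen card, a probability of it being held by each opponent, and it ignores hand-size constraints (a real Skat deal must give each opponent a fixed number of cards). All names are hypothetical.

```python
import random

def sample_world(unseen_cards, opponents, card_probs, rng):
    """Assign each unseen card to an opponent, weighted by the predicted
    per-card location probabilities (card_probs[card][player])."""
    world = {p: [] for p in opponents}
    for card in unseen_cards:
        weights = [card_probs[card][p] for p in opponents]
        player = rng.choices(opponents, weights=weights, k=1)[0]
        world[player].append(card)
    return world

def pimc_move_value(move, unseen_cards, opponents, card_probs,
                    evaluate, n_samples=100, seed=0):
    """PIMC-style estimate: average the perfect-information value of
    `move` over worlds sampled from the learned distribution."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        world = sample_world(unseen_cards, opponents, card_probs, rng)
        total += evaluate(move, world)  # full-information solver goes here
    return total / n_samples
```

    With uniform `card_probs` this reduces to standard PIMC sampling; the paper's contribution is that the network's non-uniform predictions concentrate samples on worlds consistent with human play.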

    The comparative advantage of government : a review

    In theory, market failures are necessary but not sufficient conditions for justifying government intervention in the production of goods and services. Even without market failures, there might be a case for government intervention on the grounds of poverty reduction or merit goods (for example, mandatory elementary education and mandatory use of seatbelts in cars and of helmets on motorbikes). In every case, contends the author, a case for government intervention must first identify the particular market failure that prevents the private sector from producing the socially optimal quantity of the good or service. Second, it must select the intervention that will most improve welfare. Third, it must show that society will be better off as a result of government involvement, that is, that the benefits will outweigh the costs. It is impossible to judge a priori whether or what type of government intervention is appropriate to a particular circumstance or even to a class of situations. Such judgments are both country- and situation-specific and must be made on a case-by-case basis. To be sure, it is easier to make such judgments about market failures based on externalities, public goods, and so on, than about market failures based on imperfect information. Market failures rooted in incomplete markets and imperfect information are pervasive: markets are almost always incomplete, and information is always imperfect. This does not mean that there is always a case for government intervention or that further analysis is unnecessary. On the contrary, there is a keener need for analysis. The welfare consequences of the "new market failures" are more difficult to measure, so government intervention's contribution to welfare is likely to be more difficult to assess, and the case for intervention (especially the provision of goods and services) is more difficult to make. One must also keep in mind that government interventions are often poorly designed and overly costly.
    Poorly designed interventions may create market failures of their own. Governments concerned about low private investment in high-risk projects, for example, may guarantee them against risk but in the process create problems of moral hazard and induce investors to take no actions to mitigate such risks. And some interventions may turn out to be too costly relative to the posited benefits. In seeking to provide extension services, for example, governments may incur costs that are higher than the benefits farmers receive.
    Keywords: Decentralization; Environmental Economics & Policies; Economic Theory & Research; Health Economics & Finance; Labor Policies; Banks & Banking Reform; Knowledge Economy

    A Bridge between Polynomial Optimization and Games with Imperfect Recall

    We provide several positive and negative complexity results for solving games with imperfect recall. Using a one-to-one correspondence between these games on one side and multivariate polynomials on the other, we show that solving games with imperfect recall is as hard as solving certain problems of the first-order theory of the reals. We establish square-root-sum hardness even for the specific class of A-loss games. On the positive side, we find restrictions on games and strategies, motivated by Bridge bidding, that give polynomial-time complexity.
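    To make the game-polynomial correspondence concrete, here is a sketch of one direction, with notation assumed here rather than taken from the paper: the expected utility of a behavioral strategy is a polynomial in the action probabilities, and imperfect recall is precisely what allows a variable to appear more than once along a path.

```latex
% Expected utility of a behavioral strategy x in an extensive-form game:
% u(z) is the payoff at leaf z, c(z) the chance probability of reaching z,
% and x_{I,a} the probability of playing action a at information set I.
u(x) \;=\; \sum_{z \in Z} u(z)\, c(z) \prod_{(I,a) \in \mathrm{path}(z)} x_{I,a},
\qquad \text{s.t.}\quad \sum_{a} x_{I,a} = 1, \quad x_{I,a} \ge 0.
% Under perfect recall each variable x_{I,a} occurs at most once per path.
% Under imperfect recall an information set may repeat along a path, so
% u(x) becomes a genuinely higher-degree polynomial, and maximizing it is
% a polynomial optimization problem over a product of simplices.
```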

    Pgx: Hardware-accelerated Parallel Game Simulators for Reinforcement Learning

    We propose Pgx, a suite of board game reinforcement learning (RL) environments written in JAX and optimized for GPU/TPU accelerators. By leveraging auto-vectorization and Just-In-Time (JIT) compilation in JAX, Pgx can efficiently scale to thousands of parallel executions over accelerators. In our experiments on a DGX-A100 workstation, we found that Pgx can simulate RL environments 10-100x faster than existing Python RL libraries. Pgx includes RL environments commonly used as benchmarks in RL research, such as backgammon, chess, shogi, and Go. Additionally, Pgx offers miniature game sets and baseline models to facilitate rapid research cycles. We demonstrate the efficient training of the Gumbel AlphaZero algorithm with Pgx environments. Overall, Pgx provides high-performance environment simulators for researchers to accelerate their RL experiments. Pgx is available at https://github.com/sotetsuk/pgx.
    Comment: 9 pages
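    The core pattern behind the speedup can be illustrated without an accelerator: write the environment step as a pure function of the state, then apply it to a whole batch of states at once. The toy environment and names below are illustrative and are not the Pgx API; in Pgx the explicit batch loop is replaced by `jax.vmap` plus `jax.jit`, so thousands of games advance in a single compiled accelerator call.

```python
def init_state():
    """Toy game state: first to count 3 terminates."""
    return {"count": 0, "terminated": False}

def step(state, action):
    """Pure step function: returns a new state instead of mutating,
    which is what makes auto-vectorization (e.g. jax.vmap) possible."""
    count = state["count"] + (1 if action == 1 else 0)
    return {"count": count, "terminated": count >= 3}

def batched_step(states, actions):
    """Explicit batch loop; in a JAX-based simulator this whole loop
    becomes one vectorized, JIT-compiled call over the batch axis."""
    return [step(s, a) for s, a in zip(states, actions)]

# Advance four independent games in lockstep for three rounds.
states = [init_state() for _ in range(4)]
for _ in range(3):
    states = batched_step(states, [1, 1, 0, 1])
```

    The payoff of this design is that the per-step cost is amortized over the batch: the host dispatches one call per round regardless of how many games are running.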

    BridgeHand2Vec Bridge Hand Representation

    Contract bridge is a game characterized by incomplete information, posing an exciting challenge for artificial intelligence methods. This paper proposes the BridgeHand2Vec approach, which leverages a neural network to embed a bridge player's hand (consisting of 13 cards) into a vector space. The resulting representation reflects the strength of the hand in the game and enables interpretable distances to be determined between different hands. This representation is derived by training a neural network to estimate the number of tricks that a pair of players can take. In the remainder of this paper, we analyze the properties of the resulting vector space and provide examples of its application in reinforcement learning and opening bid classification. Although this was not our main goal, the neural network used for the vectorization achieves SOTA results on the DDBP2 problem (estimating the number of tricks for two given hands).
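    A minimal sketch of the input side of such an approach: a 13-card hand encoded as a 52-dimensional binary vector, the natural input to an embedding network, with a distance function over encodings. This is an illustration of the setup only; BridgeHand2Vec measures distances in the learned embedding space, not in the raw one-hot space shown here.

```python
RANKS = "23456789TJQKA"
SUITS = "SHDC"
CARDS = [r + s for s in SUITS for r in RANKS]  # all 52 cards, e.g. "AS"

def encode_hand(hand):
    """Encode a 13-card hand as a 52-dim binary vector."""
    assert len(hand) == 13
    held = set(hand)
    return [1.0 if c in held else 0.0 for c in CARDS]

def hand_distance(h1, h2):
    """Euclidean distance between raw encodings; the paper's embedding
    network would map encode_hand(...) to a lower-dimensional vector in
    which such distances are claimed to track hand strength."""
    v1, v2 = encode_hand(h1), encode_hand(h2)
    return sum((a - b) ** 2 for a, b in zip(v1, v2)) ** 0.5
```

    In the raw space, any two hands differing by a single card swap are equidistant; the point of a learned embedding is that distances instead reflect how the hands play.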