    Assessing the Potential of Classical Q-learning in General Game Playing

    After the recent groundbreaking results of AlphaGo and AlphaZero, we have seen strong interest in deep reinforcement learning and artificial general intelligence (AGI) in game playing. However, deep learning is resource-intensive and the theory is not yet well developed. For small games, simple classical table-based Q-learning might still be the algorithm of choice. General Game Playing (GGP) provides a good testbed for reinforcement learning to research AGI. Q-learning is one of the canonical reinforcement learning methods, and has been used by (Banerjee & Stone, IJCAI 2007) in GGP. In this paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe, Connect Four, Hex; source code: https://github.com/wh1992v/ggp-rl), to allow comparison to Banerjee et al. We find that Q-learning converges to a high win rate in GGP. For the ϵ-greedy strategy, we propose a first enhancement, the dynamic ϵ algorithm. In addition, inspired by (Gelly & Silver, ICML 2007) we combine online search (Monte Carlo Search) to enhance offline learning, and propose QM-learning for GGP. Both enhancements improve the performance of classical Q-learning. In this work, GGP allows us to show that classical table-based Q-learning, if augmented by appropriate enhancements, can perform well in small games.
    Comment: arXiv admin note: substantial text overlap with arXiv:1802.0594
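    As a rough illustration of the table-based Q-learning with ϵ-greedy exploration that this abstract refers to, here is a minimal sketch. It is not the authors' implementation: the function names, the linear decay schedule in `decayed_epsilon`, and the default `alpha` and `gamma` values are assumptions chosen for illustration, and the abstract does not specify how the dynamic ϵ algorithm actually varies ϵ.

```python
import random
from collections import defaultdict


def decayed_epsilon(episode, total_episodes, start=1.0, end=0.0):
    """Hypothetical linear schedule: explore a lot early, little late.

    This is an assumed stand-in, not the paper's dynamic epsilon rule.
    """
    frac = min(max(episode / total_episodes, 0.0), 1.0)
    return start + (end - start) * frac


def choose_action(q, state, actions, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update for one observed transition.

    An empty next_actions list marks a terminal state (future value 0).
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Usage: one update on an otherwise empty table.
q_table = defaultdict(float)
q_update(q_table, "s0", "a0", reward=1.0, next_state="terminal", next_actions=[])
```

    The table is a plain dictionary keyed by (state, action), which is only practical for small state spaces such as the small-board games named in the abstract.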

    10 simple rules to create a serious game, illustrated with examples from structural biology

    Serious scientific games are games whose purpose is not only fun. In the field of science, the serious goals include crucial activities for scientists: outreach, teaching and research. The number of serious games is increasing rapidly, in particular citizen science games, games that allow people to produce and/or analyze scientific data. Interestingly, it is possible to build a set of rules providing a guideline to create or improve serious games. We present arguments gathered from our own experience (Phylo, DocMolecules, HiRE-RNA contest and Pangu) as well as examples from the growing literature on scientific serious games.

    Allocation in Practice

    How do we allocate scarce resources? How do we fairly allocate costs? These are two pressing challenges facing society today. I discuss two recent projects at NICTA concerning resource and cost allocation. In the first, we have been working with FoodBank Local, a social startup working in collaboration with food bank charities around the world to optimise the logistics of collecting and distributing donated food. Before we can distribute this food, we must decide how to allocate it to different charities and food kitchens. This gives rise to a fair division problem with several new dimensions, rarely considered in the literature. In the second, we have been looking at cost allocation within the distribution network of a large multinational company. This also has several new dimensions rarely considered in the literature.
    Comment: To appear in Proc. of 37th edition of the German Conference on Artificial Intelligence (KI 2014), Springer LNC

    Open Problems in the Emergence and Evolution of Linguistic Communication: A Road-Map for Research

    False-Name Manipulation in Weighted Voting Games is Hard for Probabilistic Polynomial Time

    False-name manipulation refers to the question of whether a player in a weighted voting game can increase her power by splitting into several players and distributing her weight among these false identities. Analogously to this splitting problem, the beneficial merging problem asks whether a coalition of players can increase their power in a weighted voting game by merging their weights. Aziz et al. [ABEP11] analyze the problem of whether merging or splitting players in weighted voting games is beneficial in terms of the Shapley-Shubik and the normalized Banzhaf index, and so do Rey and Rothe [RR10] for the probabilistic Banzhaf index. All these results provide merely NP-hardness lower bounds for these problems, leaving the question about their exact complexity open. For the Shapley-Shubik and the probabilistic Banzhaf index, we raise these lower bounds to hardness for PP, "probabilistic polynomial time", and provide matching upper bounds for beneficial merging and, whenever the number of false identities is fixed, also for beneficial splitting, thus resolving previous conjectures in the affirmative. It follows from our results that beneficial merging and splitting for these two power indices cannot be solved in NP, unless the polynomial hierarchy collapses, which is considered highly unlikely.
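    To make the splitting question concrete, the sketch below computes the Shapley-Shubik index of a small weighted voting game by enumerating all player orderings, then compares a player's power before and after she splits her weight. The function name `shapley_shubik`, the convention that a coalition wins when its weight reaches the quota, and the example game are assumptions for illustration and are not taken from the paper. Brute-force enumeration takes factorial time, so this is only usable for a handful of players.

```python
from fractions import Fraction
from itertools import permutations


def shapley_shubik(weights, quota):
    """Shapley-Shubik index of each player, by brute-force enumeration.

    A player is pivotal in an ordering if adding her weight is what first
    brings the running total up to the quota. Her index is the fraction of
    orderings in which she is pivotal.
    """
    n = len(weights)
    pivots = [0] * n
    total_orderings = 0
    for order in permutations(range(n)):
        total_orderings += 1
        running = 0
        for player in order:
            running += weights[player]
            if running >= quota:
                pivots[player] += 1
                break
    return [Fraction(p, total_orderings) for p in pivots]


# Assumed example: quota 4, player 0 has weight 2, the other two have weight 1.
before = shapley_shubik([2, 1, 1], quota=4)

# Player 0 splits into two false identities of weight 1 each.
after = shapley_shubik([1, 1, 1, 1], quota=4)

power_before = before[0]            # 1/3
power_after = after[0] + after[1]   # 1/4 + 1/4 = 1/2
```

    In this unanimity-style game every player is needed to reach the quota, so each identity is pivotal equally often, and splitting raises the manipulator's combined index from 1/3 to 1/2.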

    [Subject benchmark statement]: computing
