184 research outputs found

    Estimating the strength of poker hands by integer linear programming techniques

    We illustrate how Integer Linear Programming techniques can be applied to the popular poker game Texas Hold'em in order to evaluate the strength of a hand. In particular, we give models aimed at (i) minimizing the number of features that a player should look at when estimating his winning probability (called his equity); (ii) giving weights to such features so that the equity is approximated by the weighted sum of the selected features. We show that ten features or fewer are enough to estimate the equity of a hand with high precision.

    Opponent Modelling in Multi-Agent Systems

    Reinforcement Learning (RL) formalises a problem where an intelligent agent needs to learn and achieve certain goals by maximising a long-term return in an environment. Multi-agent reinforcement learning (MARL) extends traditional RL to multiple agents. Many RL algorithms lose their convergence guarantees in non-stationary environments because opponents adapt. Partial observation, caused by agents' different private observations, introduces high variance during training, which exacerbates data inefficiency. In MARL, training an agent to perform well against one set of opponents often leads to bad performance against another set of opponents. Non-stationarity, partial observation and an unclear learning objective are three critical problems in MARL that hinder agents' learning, and they all share one cause: the lack of knowledge of the other agents. Therefore, in this thesis, we propose to solve these problems with opponent modelling methods. We tailor our solutions by combining opponent modelling with other techniques according to the characteristics of the problems we face. Specifically, we first propose ROMMEO, an algorithm inspired by Bayesian inference, as a solution to alleviate non-stationarity in cooperative games. Then we study the partial observation problem caused by agents' private observations and design an implicit communication training method named PBL. Lastly, we investigate solutions to the non-stationarity and unclear learning objective problems in zero-sum games. We propose a solution named EPSOM, which aims to find safe exploitation strategies to play against non-stationary opponents. We verify our proposed methods with varied experiments and show that they can achieve the desired performance. Limitations and future work are discussed in the last chapter of this thesis.

    Rule based strategies for large extensive-form games: A specification language for No-Limit Texas Hold'em agents

    Poker is used to measure progress in extensive-form games research due to its unique characteristics: it is a game where playing agents have to deal with incomplete information, stochastic scenarios and a large number of decision points. The development of Poker agents has seen significant advances in one-on-one matches, but there are still no consistent results in multiplayer games or in games against human experts. To allow experts to help improve the agents' performance, we have created a high-level strategy specification language. To support strategy definition, we have also developed an intuitive graphical tool. Additionally, we have created a strategy inference system based on a dynamically weighted Euclidean distance. This approach was validated through the creation of simple agents and by successfully inferring strategies from 10 human players. The created agents were able to beat previously developed mid-level agents by a good profit margin.

    Extensible graphical game generator

    Thesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2000. Vita. Includes bibliographical references (leaves 162-167). An ontology of games was developed, and the similarities between games were analyzed and codified into reusable software components in a system called EGGG, the Extensible Graphical Game Generator. By exploiting the similarities between games, EGGG makes it possible for someone to create a fully functional computer game with a minimum of programming effort. The thesis behind the dissertation is that there exist sufficient commonalities between games that such a software system can be constructed. In plain English, the thesis is that games are really a lot more alike than most people imagine, and that these similarities can be used to create a generic game engine: you tell it the rules of your game, and the engine renders it into an actual computer game that everyone can play. By Jon Orwant. Ph.D.

    Fuzzy Operator Trees for Modeling Utility Functions

    In this thesis, we propose a method for modeling utility (rating) functions based on a novel concept called the Fuzzy Operator Tree (FOT for short). As the name suggests, this method makes use of techniques from fuzzy set theory and implements a fuzzy rating function, that is, a utility function that maps to the unit interval, where 0 corresponds to the lowest and 1 to the highest evaluation. Even though the original motivation comes from quality control, FOTs are completely general and widely applicable. Our approach allows a human expert to specify a model in the form of an FOT in a convenient and intuitive way. To this end, the expert simply has to split evaluation criteria into sub-criteria in a recursive manner and determine how these sub-criteria ought to be combined: conjunctively, disjunctively, or by means of an averaging operator. The result of this process is the qualitative structure of the model. The second step is to parameterize the model. To support or even free the expert from this step, we develop a method for calibrating the model on the basis of exemplary ratings, that is, in a purely data-driven way. This method, which makes use of optimization techniques from the field of evolutionary algorithms, constitutes the second major contribution of the thesis. The third contribution of the thesis is a method for evaluating an FOT in a cost-efficient way. Roughly speaking, an FOT can be seen as an aggregation function that combines the evaluations of a number of basic criteria into an overall rating of an object. Essentially, the cost of computing this rating is hence given by the sum of the evaluation costs of the basic criteria. In practice, however, the precise utility degree is often not needed. Instead, it is enough to know whether it lies above or below an important threshold value. In such cases, the evaluation process, understood as a sequential evaluation of basic criteria, can be stopped as soon as this question can be answered unambiguously. Of course, the (expected) number of basic criteria evaluated and, therefore, the (expected) evaluation cost will then strongly depend on the order of the evaluations, and this is what is optimized by the methods that we have developed.
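
The early-stopping idea in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `Leaf`/`Node` classes, the three operators (`min` for conjunctive, `max` for disjunctive, unweighted `avg` for averaging) and the function names are all assumptions made for illustration. Because these operators are monotone, substituting 0 or 1 for every criterion not yet evaluated yields a lower or upper bound on the final rating, and evaluation can stop once both bounds fall on the same side of the threshold.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple, Union


@dataclass
class Leaf:
    name: str  # a basic criterion; evaluating it is assumed to be costly


@dataclass
class Node:
    op: str  # "min" (conjunctive), "max" (disjunctive) or "avg" (averaging)
    children: List["Tree"]


Tree = Union[Leaf, Node]


def bound(tree: Tree, known: Dict[str, float], fill: float) -> float:
    """Evaluate the tree, substituting `fill` for every unevaluated leaf."""
    if isinstance(tree, Leaf):
        return known.get(tree.name, fill)
    values = [bound(child, known, fill) for child in tree.children]
    if tree.op == "min":
        return min(values)
    if tree.op == "max":
        return max(values)
    if tree.op == "avg":
        return sum(values) / len(values)
    raise ValueError(f"unknown operator: {tree.op}")


def exceeds_threshold(
    tree: Tree,
    order: Sequence[str],
    evaluate: Callable[[str], float],
    threshold: float,
) -> Tuple[bool, int]:
    """Return (rating >= threshold, number of leaves actually evaluated)."""
    known: Dict[str, float] = {}
    pending = iter(order)
    while True:
        lower = bound(tree, known, 0.0)  # pessimistic: unknown leaves rate 0
        upper = bound(tree, known, 1.0)  # optimistic: unknown leaves rate 1
        if lower >= threshold:
            return True, len(known)
        if upper < threshold:
            return False, len(known)
        name = next(pending, None)
        if name is None:
            raise ValueError("order does not cover enough leaves to decide")
        known[name] = evaluate(name)


# Hypothetical example: rating = min(a, avg(b, c)) with made-up leaf values.
values = {"a": 0.2, "b": 0.9, "c": 0.8}
tree = Node("min", [Leaf("a"), Node("avg", [Leaf("b"), Leaf("c")])])

good_order = exceeds_threshold(tree, ["a", "b", "c"], values.__getitem__, 0.5)
bad_order = exceeds_threshold(tree, ["b", "c", "a"], values.__getitem__, 0.5)
```

In this made-up example both orders reach the same decision (the rating is below 0.5), but evaluating `a` first settles it after one criterion, whereas starting with `b` and `c` requires all three. This difference is the dependence on evaluation order that the abstract describes.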

    Automatic continuous testing to speed software development

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 147-152). Continuous testing is a new feature for software development environments that uses excess cycles on a developer's workstation to continuously run regression tests in the background, providing rapid feedback about test failures as source code is edited. It is intended to reduce the time and energy required to keep code well-tested, and to prevent regression errors from persisting uncaught for long periods of time. The longer that regression errors are allowed to linger during development, the more time is wasted debugging and fixing them once they are discovered. By monitoring and measuring software projects, we estimate that the wasted time, consisting of this preventable extra fixing cost added to the time spent running tests and waiting for them to complete, accounts for 10-15% of total development time. We present a model of developer behavior that uses data from past projects to infer developer beliefs and predict behavior in new environments, in particular when changing testing methodologies or tools to reduce wasted time. This model predicts that continuous testing would reduce wasted time by 92-98%, a substantial improvement over other approaches we evaluated, such as automatic test prioritization and changing manual test frequencies. A controlled human experiment indicates that student developers using continuous testing were three times more likely to complete a task before the deadline than those without, with no significant effect on time worked. Most participants found continuous testing to be useful and believed that it helped them write better code faster. 90% would recommend the tool to others. We show the first empirical evidence of a benefit from continuous compilation, a popular related feature. Continuous testing has been integrated into Emacs and Eclipse. We detail the functional and technical design of the Eclipse plug-in, which is publicly beta-released. By David Saff. S.M.