7 research outputs found
Multi-agent Monte Carlo Go
In this paper we propose a multi-agent version of UCT Monte Carlo Go. We use the emergent behavior of a large number of simple agents to improve the quality of the Monte Carlo simulations, increasing the strength of the artificial player as a whole. Instead of one agent playing against itself, different agents play in the simulation phase of the algorithm, leading to better exploration of the search space. Our approach significantly outperformed Fuego, a top Computer Go program. Emergent behavior seems to be the next step in Computer Go development.
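The core idea — replacing a single uniform-random playout policy with a pool of distinct simple agents — can be sketched as follows. This is a hedged illustration, not the paper's implementation: the toy game (a race to 20 by steps of 1 to 3) and the three agent policies are invented for demonstration.

```python
import random

# Hedged sketch: in the playout (simulation) phase, each move is chosen by
# a randomly drawn agent from a pool of simple, differently biased policies,
# instead of one uniform-random policy. Game and agents are illustrative.

def cautious(pos, moves):
    # Always takes the smallest available step.
    return min(moves)

def greedy(pos, moves):
    # Always takes the largest available step.
    return max(moves)

def uniform(pos, moves):
    # Uniform-random choice, as in a plain Monte Carlo playout.
    return random.choice(moves)

AGENT_POOL = [cautious, greedy, uniform]

def multi_agent_playout(pos=0, target=20):
    """Play a race-to-target game to the end; each move is made by a
    randomly drawn agent, diversifying the simulated trajectories."""
    player = 0
    while pos < target:
        moves = [m for m in (1, 2, 3) if pos + m <= target]
        agent = random.choice(AGENT_POOL)
        pos += agent(pos, moves)
        player = 1 - player
    return 1 - player  # the player who reached the target
```

Averaging many such playouts gives the position evaluations that the tree policy then exploits; the diversity of the pool is what broadens the exploration.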
A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
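The four phases of the core algorithm surveyed here (selection, expansion, simulation, backpropagation) can be sketched in a minimal UCT loop. This is an illustrative toy, not code from the survey: the game (Nim, where the player taking the last stone wins) and all parameters are our own choices.

```python
import math
import random

# Minimal UCT sketch on Nim: take 1-3 stones; whoever takes the last
# stone wins. State: (stones_left, player_to_move).

def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones, self.player = stones, player
        self.parent, self.move = parent, move
        self.children = []
        self.untried = legal_moves(stones)
        self.visits = 0
        self.wins = 0.0  # from the perspective of the player who just moved

    def ucb1(self, c=1.4):
        # UCB1 index: exploitation term plus exploration bonus.
        return self.wins / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits)

def uct_search(stones, player, iters=2000):
    root = Node(stones, player)
    for _ in range(iters):
        node = root
        # 1. Selection: descend while fully expanded and non-terminal.
        while not node.untried and node.children:
            node = max(node.children, key=Node.ucb1)
        # 2. Expansion: add one untried child.
        if node.untried:
            m = node.untried.pop()
            child = Node(node.stones - m, 1 - node.player, node, m)
            node.children.append(child)
            node = child
        # 3. Simulation: uniform-random playout to the end.
        stones_left, to_move = node.stones, node.player
        while stones_left > 0:
            stones_left -= random.choice(legal_moves(stones_left))
            to_move = 1 - to_move
        winner = 1 - to_move  # the player who took the last stone
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            if winner != node.player:  # win for the player who moved here
                node.wins += 1
            node = node.parent
    # Recommend the most-visited move, a common final-selection rule.
    return max(root.children, key=lambda n: n.visits).move

print(uct_search(10, 0))  # recommended first move for 10 stones
```

Most of the variations the survey catalogues plug into exactly these four phases, e.g. different tree policies in step 1 or informed playout policies in step 3.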
Exploiting Opponent Modeling For Learning In Multi-agent Adversarial Games
An issue with learning effective policies in multi-agent adversarial games is that the size of the search space can be prohibitively large when the actions of both teammates and opponents are considered simultaneously. Opponent modeling, predicting an opponent’s actions in advance of execution, is one approach for selecting actions in adversarial settings, but it is often performed in an ad hoc way. In this dissertation, we introduce several methods for using opponent modeling, in the form of predictions about the players’ physical movements, to learn team policies. To explore the problem of decision-making in multi-agent adversarial scenarios, we use our approach for both offline play generation and real-time team response in the Rush 2008 American football simulator. Simultaneously predicting the movement trajectories, future reward, and play strategies of multiple players in real time is a daunting task, but we illustrate how it is possible to divide and conquer this problem with an assortment of data-driven models. By leveraging spatio-temporal traces of player movements, we learn discriminative models of defensive play for opponent modeling. With the reward information from previous play matchups, we use a modified version of UCT (Upper Confidence Bounds applied to Trees) to create new offensive plays and to learn play repairs to counter predicted opponent actions.
In team games, players must coordinate effectively to accomplish tasks while foiling their opponents, either in a preplanned or an emergent manner. An effective team policy must generate the necessary coordination, yet considering all possibilities for creating coordinating subgroups is computationally infeasible. Automatically identifying and preserving the coordination between key subgroups of teammates can make search more productive by pruning policies that disrupt these relationships.
We demonstrate that combining opponent modeling with automatic subgroup identification can be used to create team policies with a higher average yardage than either the baseline game or domain-specific heuristics.
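One way to picture the combination of a predictive opponent model with reward-driven play selection is the sketch below. It is a deliberate simplification of the dissertation's approach — a flat UCB1 bandit over two candidate plays rather than full UCT — and the play names, matchup table, and predicted-defense distribution are invented stand-ins for the learned models.

```python
import math
import random

# Hedged sketch: pick an offensive play with UCB1, where each trial
# samples the opponent's defensive play from a predicted distribution
# (the "opponent model") and draws yardage from a hypothetical matchup
# table. All names and numbers are illustrative, not from the thesis.

MATCHUP_YARDS = {            # mean yardage: (offense, defense) -> yards
    ("run", "blitz"): 7.0, ("run", "zone"): 3.0,
    ("pass", "blitz"): 2.0, ("pass", "zone"): 8.0,
}
PREDICTED_DEFENSE = {"blitz": 0.3, "zone": 0.7}  # opponent-model output

def sample_yards(offense):
    # Simulate one matchup: sample the defense, then noisy yardage.
    defense = random.choices(list(PREDICTED_DEFENSE),
                             weights=list(PREDICTED_DEFENSE.values()))[0]
    return random.gauss(MATCHUP_YARDS[(offense, defense)], 1.0)

def ucb1_select(trials=10000, c=2.0):
    plays = ["run", "pass"]
    n = {p: 0 for p in plays}      # pull counts
    total = {p: 0.0 for p in plays}  # cumulative yardage
    for t in range(1, trials + 1):
        untried = [p for p in plays if n[p] == 0]
        if untried:
            p = untried[0]  # try each play once before using the index
        else:
            p = max(plays, key=lambda p: total[p] / n[p]
                    + c * math.sqrt(math.log(t) / n[p]))
        n[p] += 1
        total[p] += sample_yards(p)
    # Recommend the play with the best empirical mean yardage.
    return max(plays, key=lambda p: total[p] / n[p])
```

With the numbers above, "pass" has the higher expected yardage against the predicted defense mix (0.3 × 2 + 0.7 × 8 = 6.2 versus 4.2 for "run"), so the bandit converges to it.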
Artificial intelligence and the game of Go
In this thesis I examine how the Chinese game of Go can be played by computer using various artificial intelligence techniques. Go has a large branching factor, meaning that in most game positions numerous different moves are possible. The problems this causes are one reason why Go-playing programs are still much weaker than the best human players. I present some of the techniques, such as Monte Carlo tree search, behind the current strongest Go-playing programs. In addition, I discuss techniques that have been added to Monte Carlo tree search to make it perform more effectively.
SpoookyJS. A multi-agent-based JavaScript framework for the flexible implementation of digital browser-based board games and game-independent artificial intelligence
Artificial intelligence in digital games is usually the domain of complex, game-specific software solutions that lack extensibility. This thesis addresses the design and implementation of the JavaScript framework SpoookyJS, which enables the simplified creation of browser-based digital board games. Developed as a multi-agent system, SpoookyJS provides artificial opponents in the implemented games and serves as a test and development environment for research on game-independent artificial decision-making.
Recommended from our members
Decentralized Multiagent Coordination for Connected and Autonomous Vehicle Routing in Congested Networks
In 2017, the cost of congestion in the United States was around 305 billion dollars, and city-dwellers, on average, lost 1400 dollars each while spending 42 hours in traffic jams. Aiming for better mobility and more efficient utilization of the transportation network, emerging connected and autonomous vehicle (CAV) technologies and their communication capabilities can produce well-coordinated and more efficient routing behavior that dissipates traffic rather uniformly throughout the network, resulting in shorter travel times. Vehicle routing is among the most critical and challenging, yet unsolved, tasks in CAV research. Current routing strategies either rely on a centralized control system, which can fail to scale, or employ decentralized schemes that yield sub-optimal coordination and poor system performance. In addition, it is of great importance for the deployment of CAV technologies to understand the behavior of transportation systems in a mixed environment with various levels of communication complexity, where CAVs and non-CAVs coexist and interoperate. The routing problem in a multiagent system resembles a competitive congestion game: the decisions of one agent (in this case, a CAV) directly impact the performance of the others. When the number of agents traversing the same transportation facility at the same time exceeds a certain threshold, bottlenecks may occur, leading to higher travel times. Therefore, coordination between CAVs is key to avoiding such circumstances. This dissertation answers how and to what extent different routing optimization algorithms, under various levels of autonomy and communication capability, can increase the mobility of the transportation system. This work designs the system in a decentralized manner that scales linearly while achieving a social, system-level optimum.
To realistically analyze this system, we investigated the coordination behavior of CAVs under (1) No Communication, (2) Minimal Communication, and (3) Extensive Communication. In the absence of connectivity between the CAVs, a learning-based approach was implemented where each CAV optimizes its own route using a reinforcement learning technique based on its prior experiences. This competitive game quickly overwhelms the system as the market penetration of CAVs surpasses the critical threshold range (50% to 75%), where the mobility improvements are the most significant and beyond which the system performance degrades. Under the minimal communication level, we assumed the CAVs share information regarding their location and speed with the rest of the CAVs in their communication cluster through a multi-hop network. A coordination scheme was then implemented where each CAV minimizes its travel time based on the limited information it receives. Results showed that this application can reduce system travel time by up to 20%. Additionally, the emergence of mobility benefits is shown to correlate with the CAV network characteristics through the lens of percolation theory. The results revealed that, for the mobility benefits to surface, at least 70% of the CAVs are required to form a communication cluster. Under an extensive communication capability, where the CAVs share not only their location and speed but also their preferred path to their destination, a reduction of up to 40% in system travel time was achieved for high levels of CAV market penetration and communication radius. Moreover, the improvement in mobility was shown to be strongly associated with the uniform dissipation of traffic onto the network. These findings provide solid support for creating evidence-driven frameworks to guide future CAV development and deployment in a decentralized and coordinated manner.
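The "No Communication" setting — each CAV learning its route solely from its own experienced travel times — can be illustrated with a toy congestion game. Everything here (two routes, a BPR-style delay curve, epsilon-greedy choice with exponentially smoothed estimates) is an invented minimal model, not the dissertation's simulation.

```python
import random

# Hedged sketch of decentralized, experience-driven route choice.
# Two routes share 100 agents; each agent keeps its own travel-time
# estimates and picks the route it currently believes is faster.

def travel_time(load, free_flow, capacity):
    # BPR-style volume/delay curve: delay grows with (load/capacity)^4.
    return free_flow * (1 + 0.15 * (load / capacity) ** 4)

def simulate(n_agents=100, days=400, eps=0.1, alpha=0.2, seed=0):
    rng = random.Random(seed)
    # Per-agent estimates for routes 0 and 1, randomly initialized so
    # the population does not act in lockstep on day one.
    est = [[rng.uniform(8, 14), rng.uniform(8, 14)]
           for _ in range(n_agents)]
    history = []
    for _ in range(days):
        # Epsilon-greedy: mostly exploit the believed-faster route.
        choices = [rng.randrange(2) if rng.random() < eps
                   else (0 if est[i][0] <= est[i][1] else 1)
                   for i in range(n_agents)]
        loads = (choices.count(0), choices.count(1))
        times = (travel_time(loads[0], 10.0, 60.0),  # shorter but congestible
                 travel_time(loads[1], 12.0, 60.0))  # longer free-flow time
        # Each agent updates only the route it actually experienced.
        for i, r in enumerate(choices):
            est[i][r] += alpha * (times[r] - est[i][r])
        history.append(loads[0])
    return sum(history[-100:]) / 100  # mean route-0 load after learning
```

The selfish dynamic settles near a user equilibrium in which the two routes' travel times roughly equalize, which is generally worse than the coordinated system optimum — the gap the communication-based schemes above aim to close.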