Communication in Turn Based Multiplayer Games Using Deep Reinforcement Learning
This work investigates communication in cooperative settings of multi-agent reinforcement learning. We examine which conditions make it easier or harder for meaningful communication to arise between agents. This includes introducing learning biases and showing their usefulness in a discrete or continuous setting. To do this, we extend the game of Negotiation to a continuous setting, introduce a new environment called Sequence Guess, and introduce a new learning bias that helps facilitate the emergence of communication in a continuous setting.
Master's thesis in informatics (INF399; MAMN-INF, MAMN-PRO)
An Agent-based Modelling Framework for Driving Policy Learning in Connected and Autonomous Vehicles
Due to the complexity of the natural world, a programmer cannot foresee all
possible situations a connected and autonomous vehicle (CAV) will face during
its operation, and hence, CAVs will need to learn to make decisions
autonomously. Due to the sensing of its surroundings and information exchanged
with other vehicles and road infrastructure, a CAV will have access to large
amounts of useful data. While different control algorithms have been proposed
for CAVs, the benefits brought about by the connectedness of autonomous vehicles to
other vehicles and to the infrastructure, and their implications for policy
learning, have not been investigated in the literature. This paper investigates a
data-driven driving policy learning framework through an agent-based modelling
approach. The contributions of the paper are two-fold. First, a dynamic programming
framework is proposed for in-vehicle policy learning with and without
connectivity to neighboring vehicles. The simulation results indicate that
while a CAV can learn to make autonomous decisions, vehicle-to-vehicle (V2V)
communication of information improves this capability. Furthermore, to overcome
the limitations of sensing in a CAV, the paper proposes a novel concept for
infrastructure-led policy learning and communication with autonomous vehicles.
In infrastructure-led policy learning, road-side infrastructure senses and
captures successful vehicle maneuvers and learns an optimal policy from those
temporal sequences, and when a vehicle approaches the road-side unit, the
policy is communicated to the CAV. A deep imitation learning methodology is
proposed to develop such an infrastructure-led policy learning framework.
No Press Diplomacy
This thesis presents an article on an agent that can play the "No-Press" version (without messages) of the Diplomacy board game. Diplomacy is a 7-player negotiation game where each player tries to conquer the majority of the supply centers in Europe at the beginning of the 20th century.
The article first presents a novel dataset of more than 150,000 games played by humans. This dataset was compiled following the signing of a partnership with an external site. The games, which were played on this platform, were all converted into a new standardized format and then replayed to ensure their quality. The article also presents a game engine, with a web interface, that allows humans to play against the models that have been trained.
Moreover, the article presents a supervised learning model in which an agent learns to reproduce the behavior of all players in the dataset by maximum likelihood. An agent that learns by reinforcement (by playing games against itself) has also been trained. The article concludes by analyzing these models and comparing their performance against complex rule-based agents.
- …