Communication in Turn Based Multiplayer Games Using Deep Reinforcement Learning

Author: Villanger John Isak Fjellvang
Publication venue: The University of Bergen
Publication date: 01/01/2022

This work investigates communication in cooperative multi-agent reinforcement learning settings. We examine which conditions make it easier or harder for meaningful communication to arise between agents. This includes introducing learning biases and showing their usefulness in a discrete or continuous setting. To do this, we extend the game of Negotiation to a continuous setting, introduce a new environment called Sequence Guess, and introduce a new learning bias that helps communication emerge in a continuous setting.

Master's thesis in informatics. INF399, MAMN-INF, MAMN-PRO

University of Bergen

NORA - Norwegian Open Research Archives

An Agent-based Modelling Framework for Driving Policy Learning in Connected and Autonomous Vehicles

Author: Dongyao Jia, G Pau, J Wang, N Lu, Riccardo Coppola, W Gao
Publication date: 23/08/2018

Due to the complexity of the natural world, a programmer cannot foresee all possible situations that a connected and autonomous vehicle (CAV) will face during its operation, and hence CAVs will need to learn to make decisions autonomously. By sensing its surroundings and exchanging information with other vehicles and road infrastructure, a CAV will have access to large amounts of useful data. While different control algorithms have been proposed for CAVs, the benefits brought about by the connectedness of autonomous vehicles to other vehicles and to the infrastructure, and its implications for policy learning, have not been investigated in the literature. This paper investigates a data-driven driving policy learning framework through an agent-based modelling approach. The contributions of the paper are two-fold. First, a dynamic programming framework is proposed for in-vehicle policy learning with and without connectivity to neighboring vehicles. The simulation results indicate that while a CAV can learn to make autonomous decisions, vehicle-to-vehicle (V2V) communication of information improves this capability. Second, to overcome the limitations of sensing in a CAV, the paper proposes a novel concept for infrastructure-led policy learning and communication with autonomous vehicles. In infrastructure-led policy learning, road-side infrastructure senses and captures successful vehicle maneuvers and learns an optimal policy from those temporal sequences; when a vehicle approaches the road-side unit, the policy is communicated to the CAV. A deep imitation learning methodology is proposed to develop such an infrastructure-led policy learning framework.

arXiv.org e-Print Archive

Loughborough University Institutional Repository

Crossref

No Press Diplomacy

Author: Paquette Philip
Publication date: 01/08/2019

This thesis presents an article on an agent that can play the "No-Press" version (without messages) of the Diplomacy board game. Diplomacy is a 7-player negotiation game in which each player tries to conquer the majority of the supply centers in Europe at the beginning of the 20th century. The article first presents a novel dataset of more than 150,000 games played by humans. This dataset was compiled following the signing of a partnership with an external site. The games, which were played on this platform, were all converted into a new standardized format and then replayed to ensure their quality. The article also presents a game engine, with a web interface, that allows humans to play against the models that have been trained. Moreover, the article presents a supervised learning model in which an agent learns to reproduce the behavior of all players in the dataset by maximum likelihood. An agent that learns by reinforcement (by playing games against itself) has also been trained. The article concludes with an analysis of these models and a comparison of their performance against complex rule-based agents.

Dépôt Institutionnel Numérique