
    Learning User Preferences via Reinforcement Learning with Spatial Interface Valuing

    Full text link
    Interactive machine learning is concerned with creating systems that operate alongside humans to achieve a task, typically extending or amplifying a human's cognitive or physical capabilities, which requires the machine to adapt to the user's intentions and preferences. Often this takes the form of a human operator providing feedback to the machine, which can be explicit, implicit, or a combination of both. Explicit feedback, such as a mouse click, carries a high cognitive load. This study extends the state of the art in interactive machine learning by demonstrating that agents can learn a human user's behavior and adapt to their preferences with a reduced amount of explicit human feedback in a mixed-feedback setting. The learning agent perceives a valuation of its own behavior from hand gestures given via a spatial interface, a feedback mechanism termed Spatial Interface Valuing. The method is evaluated experimentally in a simulated environment on a grasping task using a robotic arm with variable grip settings. Preliminary results indicate that learning agents using Spatial Interface Valuing learn a value function mapping spatial gestures to expected future rewards much more quickly than the same agents receiving only explicit feedback, demonstrating that perceiving feedback from a human user via a spatial interface can effectively complement existing approaches. Comment: Submitted to HCI International 2019 Parallel Session on Spatial Interaction for Universal Access.
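
    A minimal sketch of the kind of mixed-feedback value learning the abstract describes: a TD(0) learner whose reward signal blends sparse explicit feedback with a continuous valuation derived from a hand gesture. The class and the gesture-valuation input are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class MixedFeedbackValueLearner:
    """TD(0) learner over linear features whose reward blends explicit
    feedback (e.g. a click) with a gesture-derived valuation."""

    def __init__(self, n_features, alpha=0.1, gamma=0.95):
        self.w = np.zeros(n_features)  # value-function weights
        self.alpha = alpha             # learning rate
        self.gamma = gamma             # discount factor

    def value(self, features):
        return float(self.w @ features)

    def update(self, features, next_features, explicit_reward, gesture_value):
        # The implicit gesture signal complements sparse explicit feedback,
        # reducing how often the user must give an explicit click.
        reward = explicit_reward + gesture_value
        td_error = reward + self.gamma * self.value(next_features) - self.value(features)
        self.w += self.alpha * td_error * features
```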

    Deep Reinforcement Learning for General Video Game AI

    Full text link
    The General Video Game AI (GVGAI) competition and its associated software framework provide a way of benchmarking AI algorithms on a large number of games written in a domain-specific description language. While the competition has seen plenty of interest, it has so far focused on online planning, providing a forward model that allows the use of algorithms such as Monte Carlo Tree Search. In this paper, we describe how we interface GVGAI to the OpenAI Gym environment, a widely used way of connecting agents to reinforcement learning problems. Using this interface, we characterize how widely used implementations of several deep reinforcement learning algorithms fare on a number of GVGAI games. We further analyze the results to provide a first indication of the difficulty of these games relative to each other, and relative to those in the Arcade Learning Environment under similar conditions. Comment: 8 pages, 4 figures. Accepted at the IEEE Conference on Computational Intelligence and Games 2018.
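
    The interface described follows the standard Gym agent-environment loop; a random-agent sketch is below. The package name gym_gvgai and the environment id are assumptions based on the public GVGAI-Gym release and may differ from the exact distribution.

```python
import gym
import gym_gvgai  # assumed to register the gvgai-* environments on import

env = gym.make("gvgai-aliens-lvl0-v0")  # game and level encoded in the id
obs = env.reset()
done, episode_return = False, 0.0
while not done:
    action = env.action_space.sample()          # stand-in for a trained policy
    obs, reward, done, info = env.step(action)  # classic Gym step contract
    episode_return += reward
print("episode return:", episode_return)
```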

    StarCraft II: A New Challenge for Reinforcement Learning

    Full text link
    This paper introduces SC2LE (StarCraft II Learning Environment), a reinforcement learning environment based on the StarCraft II game. This domain poses a new grand challenge for reinforcement learning, representing a more difficult class of problems than considered in most prior work. It is a multi-agent problem with multiple players interacting; there is imperfect information due to a partially observed map; it has a large action space involving the selection and control of hundreds of units; it has a large state space that must be observed solely from raw input feature planes; and it has delayed credit assignment requiring long-term strategies over thousands of steps. We describe the observation, action, and reward specification for the StarCraft II domain and provide an open source Python-based interface for communicating with the game engine. In addition to the main game maps, we provide a suite of mini-games focusing on different elements of StarCraft II gameplay. For the main game maps, we also provide an accompanying dataset of game replay data from human expert players. We give initial baseline results for neural networks trained from this data to predict game outcomes and player actions. Finally, we present initial baseline results for canonical deep reinforcement learning agents applied to the StarCraft II domain. On the mini-games, these agents learn to achieve a level of play that is comparable to a novice player. However, when trained on the main game, these agents are unable to make significant progress. Thus, SC2LE offers a new and challenging environment for exploring deep reinforcement learning algorithms and architectures. Comment: Collaboration between DeepMind & Blizzard. 20 pages, 9 figures, 2 tables.
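
    The open-source Python interface released with SC2LE is PySC2; a minimal loop on one of the mini-games might look like the sketch below. The constructor arguments follow the PySC2 2.x release and may differ across versions, and StarCraft II itself must be installed.

```python
import sys
from absl import flags
from pysc2.env import sc2_env
from pysc2.lib import actions, features

flags.FLAGS(sys.argv)  # PySC2 expects absl flags to be parsed

env = sc2_env.SC2Env(
    map_name="MoveToBeacon",                       # one of the mini-games
    players=[sc2_env.Agent(sc2_env.Race.terran)],
    agent_interface_format=features.AgentInterfaceFormat(
        feature_dimensions=features.Dimensions(screen=84, minimap=64)),
    step_mul=8,                                    # game steps per agent step
)
timesteps = env.reset()
while not timesteps[0].last():
    timesteps = env.step([actions.FUNCTIONS.no_op()])  # stand-in for a policy
env.close()
```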

    CityFlow: A Multi-Agent Reinforcement Learning Environment for Large Scale City Traffic Scenario

    Full text link
    Traffic signal control is an emerging application scenario for reinforcement learning. Besides being an important problem that affects people's daily commutes, traffic signal control poses unique challenges for reinforcement learning in adapting to a dynamic traffic environment and coordinating thousands of agents, including vehicles and pedestrians. A key factor in the success of modern reinforcement learning is a good simulator that can generate a large number of data samples for learning. The most commonly used open-source traffic simulator, SUMO, is however not scalable to large road networks and heavy traffic flows, which hinders the study of reinforcement learning on traffic scenarios. This motivated us to create a new traffic simulator, CityFlow, with fundamentally optimized data structures and efficient algorithms. CityFlow supports flexible definitions of road networks and traffic flows based on synthetic and real-world data, and it provides a user-friendly interface for reinforcement learning. Most importantly, CityFlow is more than twenty times faster than SUMO and is capable of supporting city-wide traffic simulation with an interactive renderer for monitoring. Beyond traffic signal control, CityFlow can serve as the basis for other transportation studies and creates new possibilities for testing machine learning methods in the intelligent transportation domain. Comment: WWW 2019 Demo Paper.
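
    A sketch of the RL-facing interface, based on CityFlow's published Python API; the config path, intersection id, and the fixed-time phase schedule are placeholders for a concrete road network.

```python
import cityflow

# config.json points at a road network and traffic flow definition
eng = cityflow.Engine("config.json", thread_num=4)

for step in range(3600):
    # An RL controller would choose a phase from observations such as
    # per-lane queue lengths; here a fixed-time baseline cycles phases.
    waiting = eng.get_lane_waiting_vehicle_count()  # {lane_id: count}
    phase = (step // 30) % 4
    eng.set_tl_phase("intersection_1_1", phase)
    eng.next_step()
```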

    Agent-based cooperative learning system (SACA)

    Get PDF
    Over the last several years, there has been significant progress in techniques for creating autonomous agents, i.e. systems capable of performing tasks and achieving goals in complex, dynamic environments. These agents can interact and collaborate with other agents to achieve common goals. A promising application area for agents is education and training. In this paper we present the architecture and main features of SACA (Système d'Apprentissage Coopératif basé sur l'Agent), an agent-based cooperative learning system. In our system, agents are modelled in terms of their capabilities and their mental state, which is an explicit representation of an agent's commitments and beliefs. SACA is composed of four agents: a Tutor agent, which supervises each student's training session and provides learners with opportunities for cooperation; a Domain agent, which represents the subject matter to be taught, organized into educational objectives linked by prerequisite relations; a Student agent, which assists students in their learning tasks; and an Author agent, which maintains the material and follows the progress of the student. A student interface holds information about the student (initial level of knowledge, final objective, psychological attributes, etc.) and is used to adapt the teaching to the student. Another interface, the author interface, is used by the person responsible for the students' training to follow the training sessions.
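
    A rough sketch of the four-agent decomposition described above, with the mental state (beliefs and commitments) made explicit; the class and attribute names are illustrative assumptions, not SACA's actual code.

```python
class Agent:
    """Base agent modelled by capabilities and an explicit mental state."""
    def __init__(self, name):
        self.name = name
        self.beliefs = {}        # what the agent currently holds true
        self.commitments = []    # obligations taken on toward other agents

class TutorAgent(Agent):   pass  # supervises each learner's training session
class DomainAgent(Agent):  pass  # subject matter organized by objectives
class StudentAgent(Agent): pass  # assists the learner during tasks
class AuthorAgent(Agent):  pass  # maintains material, tracks progress
```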

    How intelligence can change the course of evolution

    Full text link
    The effect of phenotypic plasticity on evolution, the so-called Baldwin effect, has been studied extensively for more than 100 years. Plasticity is known to influence the speed of evolution towards a specific genetic configuration, but whether it also influences what that genetic configuration is remains an open question. This question is investigated both analytically and experimentally, by means of an agent-based model of a foraging task, in an environment where the distribution of resources follows seasonal cycles. Individuals can either specialize in foraging only one specific resource type or generalize to foraging all resource types at a low success rate. It is found that the introduction of learning, one instance of phenotypic plasticity, changes which genetic configuration evolves. Specifically, the genome of learning agents evolves a predisposition to adapt quickly to changes in the resource distribution, under the same conditions for which non-learners would evolve a predisposition to maximize foraging efficiency for a specific resource type. This paper expands the literature at the interface between biology and machine learning by identifying the Baldwin effect in cyclically changing environments and demonstrating that learning can change the outcome of evolution.
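
    A schematic of this kind of experiment: a population whose genome is a single foraging preference, evolved under a seasonally switching resource, with lifetime learning applied inside the fitness evaluation. All parameters and the fitness model are illustrative assumptions, not the paper's exact setup.

```python
import random

N, GENERATIONS, SEASON = 100, 200, 10  # population, horizon, season length

def fitness(pref, resource, learns):
    if learns:                           # lifetime plasticity: partially
        pref += 0.5 * (resource - pref)  # re-adapt to the current resource
    return 1.0 - abs(pref - resource)    # specialists match one resource type

population = [random.random() for _ in range(N)]
for g in range(GENERATIONS):
    resource = float((g // SEASON) % 2)  # seasonal switch between two types
    ranked = sorted(population,
                    key=lambda p: fitness(p, resource, learns=True),
                    reverse=True)
    parents = ranked[: N // 2]           # truncation selection
    population = [min(max(p + random.gauss(0, 0.05), 0.0), 1.0)  # mutation
                  for p in parents for _ in range(2)]
```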

    Development of user interface agent in multimedia courseware / Mohamad Shukri Abdurrahman Zuhair

    Get PDF
    Recent years have witnessed the birth of a new paradigm for learning environments: animated interface agents. These lifelike autonomous characters inhabit learning environments with students to create rich, face-to-face learning interactions. This opens up exciting new possibilities; for example, agents can demonstrate complex tasks, employ gestures to focus students' attention on the most significant aspect of the task at hand, and express emotional responses to the tutorial situation. Animated interface agents offer great promise for broadening the bandwidth of tutorial communication and increasing a learning environment's ability to engage and motivate students. This project develops an animated pedagogical interface agent for a multimedia courseware entitled KOMSAS. The introduction of a pedagogical interface agent to the KOMSAS courseware enables it to provide stronger motivational support to students and to enhance the quality of their learning.

    Towards Teachable Conversational Agents

    Full text link
    The traditional process of building interactive machine learning systems can be viewed as a teacher-learner interaction scenario in which the machine learners are trained by one or more human teachers. In this work, we explore the idea of using a conversational interface to investigate the interaction between human teachers and interactive machine learners. Specifically, we examine whether teachable AI agents can reliably learn from human teachers through conversational interactions, and how this learning compares with traditional supervised learning algorithms. Results validate the concept of teachable conversational agents and highlight the factors relevant to the development of machine learning systems that intend to learn from conversational interactions. Comment: 9 pages, 3 figures, 2 tables. Presented at the NeurIPS 2020 Human in the Loop Dialogue Systems Workshop.
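
    A toy sketch of the teacher-learner loop: a text classifier that acquires labeled examples one conversational turn at a time and is then queried. The prompts and the choice of classifier are illustrative, not the paper's system.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts, labels = [], []

def teach_turn(utterance, teacher_label):
    """One teaching turn: the human teacher labels an utterance."""
    texts.append(utterance)
    labels.append(teacher_label)

teach_turn("this movie was wonderful", "positive")
teach_turn("what a waste of time", "negative")
teach_turn("I loved every minute", "positive")

vectorizer = CountVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(texts), labels)
guess = model.predict(vectorizer.transform(["a truly wonderful film"]))[0]
print("learner guesses:", guess)  # the teacher would confirm or correct
```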

    The AI Arena: A Framework for Distributed Multi-Agent Reinforcement Learning

    Full text link
    Advances in reinforcement learning (RL) have resulted in recent breakthroughs in the application of artificial intelligence (AI) across many different domains. An emerging landscape of development environments is making powerful RL techniques more accessible for a growing community of researchers. However, most existing frameworks do not directly address the problem of learning in complex operating environments, such as dense urban settings or defense-related scenarios, that incorporate distributed, heterogeneous teams of agents. To help enable AI research for this important class of applications, we introduce the AI Arena: a scalable framework with flexible abstractions for distributed multi-agent reinforcement learning. The AI Arena extends the OpenAI Gym interface to allow greater flexibility in learning control policies across multiple agents with heterogeneous learning strategies and localized views of the environment. To illustrate the utility of our framework, we present experimental results that demonstrate performance gains due to a distributed multi-agent learning approach over commonly-used RL techniques in several different learning environments
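
    An illustrative multi-agent extension of the Gym reset/step contract, in the spirit of what the abstract describes: per-agent dictionaries give each learner a localized view. This interface is an assumption, not the AI Arena's actual API.

```python
from typing import Any, Dict, Tuple

class MultiAgentEnv:
    """Gym-like environment keyed by agent id, so heterogeneous learners
    each consume only their own observation/reward/done streams."""

    def reset(self) -> Dict[str, Any]:
        """Return {agent_id: initial observation}."""
        raise NotImplementedError

    def step(self, actions: Dict[str, Any]) -> Tuple[
            Dict[str, Any], Dict[str, float], Dict[str, bool], Dict[str, Any]]:
        """Take {agent_id: action}; return per-agent observations, rewards,
        dones, and infos, mirroring Gym's single-agent step."""
        raise NotImplementedError
```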

    marl-jax: Multi-Agent Reinforcement Learning Framework

    Full text link
    Recent advances in Reinforcement Learning (RL) have led to many exciting applications, driven by improvements in both algorithms and engineering that have resulted in faster training of RL agents. We present marl-jax, a multi-agent reinforcement learning software package for training and evaluating social generalization of agents. The package is designed for training a population of agents in multi-agent environments and evaluating their ability to generalize to diverse background agents. It is built on top of DeepMind's JAX ecosystem and leverages the RL ecosystem developed by DeepMind. Our framework marl-jax is capable of working in cooperative and competitive, simultaneous-acting environments with multiple agents. The package offers an intuitive and user-friendly command-line interface for training a population and evaluating its generalization capabilities. In conclusion, marl-jax provides a valuable resource for researchers interested in exploring social generalization in the context of MARL. The open-source code for marl-jax is available at https://github.com/kinalmehta/marl-jax. Comment: Accepted at the ECML-PKDD 2023 Demo Track.