Learning to Teach Reinforcement Learning Agents
In this article we study the transfer learning model of action advice under a
budget. We focus on reinforcement learning teachers providing action advice to
heterogeneous students playing the game of Pac-Man under a limited advice
budget. First, we examine several critical factors affecting advice quality in
this setting, such as the average performance of the teacher, its variance and
the importance of reward discounting in advising. The experiments show the
non-trivial importance of the coefficient of variation (CV) as a statistic for
choosing policies that generate advice. The CV statistic relates variance to
the corresponding mean. Second, the article studies policy learning for
distributing advice under a budget. Whereas most methods in the relevant
literature rely on heuristics for advice distribution, we formulate the problem
as a learning one and propose a novel RL algorithm capable of learning when to
advise, adapting to the student and the task at hand. Furthermore, we argue
that learning to advise under a budget is an instance of a more generic
learning problem: Constrained Exploitation Reinforcement Learning.
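As a rough illustration of the CV statistic mentioned in the abstract above, the coefficient of variation is the standard deviation divided by the mean. The sketch below is not taken from the article: the teacher names and episode scores are hypothetical, and ranking candidates by lowest CV is an assumption made only to show how the statistic could be compared across policies.

```python
import statistics


def coefficient_of_variation(scores):
    """Return the sample standard deviation divided by the mean."""
    mean = statistics.fmean(scores)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(scores) / mean


# Hypothetical per-episode scores for two candidate teacher policies.
# Both have the same mean, but very different spread.
teacher_scores = {
    "teacher_a": [3000.0, 3100.0, 2900.0],
    "teacher_b": [3500.0, 1500.0, 4000.0],
}

cv_by_teacher = {
    name: coefficient_of_variation(scores)
    for name, scores in teacher_scores.items()
}

# Assumption: prefer the policy whose performance is most consistent
# relative to its mean (lowest CV).
most_consistent = min(cv_by_teacher, key=cv_by_teacher.get)
```

Because the two hypothetical teachers share the same mean score, the mean alone cannot separate them, while the CV can.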
A Survey and Critique of Multiagent Deep Reinforcement Learning
Deep reinforcement learning (RL) has achieved outstanding results in recent
years. This has led to a dramatic increase in the number of applications and
methods. Recent works have explored learning beyond single-agent scenarios and
have considered multiagent learning (MAL) scenarios. Initial results report
successes in complex multiagent domains, although there are several challenges
to be addressed. The primary goal of this article is to provide a clear
overview of current multiagent deep reinforcement learning (MDRL) literature.
Additionally, we complement the overview with a broader analysis: (i) we
revisit previous key components, originally presented in MAL and RL, and
highlight how they have been adapted to multiagent deep reinforcement learning
settings. (ii) We provide general guidelines to new practitioners in the area:
describing lessons learned from MDRL works, pointing to recent benchmarks, and
outlining open avenues of research. (iii) We take a more critical tone raising
practical challenges of MDRL (e.g., implementation and computational demands).
We expect this article will help unify and motivate future research to take
advantage of the abundant literature that exists (e.g., RL and MAL) in a joint
effort to promote fruitful research in the multiagent community.
Comment: Under review since Oct 2018. Earlier versions of this work had the
title: "Is multiagent deep reinforcement learning the answer or the question?
A brief survey"
Microbial diversity in the thermal springs within Hot Springs National Park
The thermal water systems of Hot Springs National Park (HSNP) in Hot Springs, Arkansas exist in relative isolation from other North American thermal systems. The HSNP waters could therefore serve as a unique center of thermophilic microbial biodiversity. However, these springs remain largely unexplored using culture-independent next-generation sequencing techniques to classify species of thermophilic organisms. Additionally, HSNP has been the focus of anthropogenic development, with springs capped and diverted for use in recreational bathhouse facilities. Human modification of these springs may have impacted the structure of their bacterial communities compared to springs left in a relatively natural state. The goal of this study was to compare the community structure in two capped springs and two uncapped springs in HSNP, as well as to broadly survey the microbial diversity of the springs. We used Illumina 16S rRNA sequencing of water samples from each spring and the QIIME workflow for sequence analysis, and generated measures of genera and phyla richness, diversity, and evenness. In total, over 700 genera were detected, and most individual samples had more than 100 genera. There were also several uncharacterized sequences that could not be placed in known taxa, indicating the sampled springs contain undescribed bacteria. There was great variation both between sites and within samples, so no significant differences were detected in community structure between sites. Our results suggest that these springs, regardless of their human modification, contain a considerable amount of biodiversity, some of it potentially unique to the study site.
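The richness, diversity, and evenness measures named in the abstract above can be sketched from a vector of per-genus read counts. The abstract does not say which indices were used, so the Shannon index and Pielou's evenness below are assumptions chosen because they are common. This is plain Python, not the QIIME API, and the example counts are hypothetical.

```python
import math


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over observed taxa."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one read is required")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_evenness(counts):
    """Pielou's J = H / ln(S), where S is the observed richness."""
    observed = richness(counts)
    if observed < 2:
        raise ValueError("evenness needs at least two observed taxa")
    return shannon_diversity(counts) / math.log(observed)


# Hypothetical genus-level read counts for two water samples.
even_sample = [10, 10, 10, 10]
skewed_sample = [97, 1, 1, 1, 0]

even_j = pielou_evenness(even_sample)
skewed_j = pielou_evenness(skewed_sample)
```

A perfectly even sample gives J = 1, while a sample dominated by one genus gives a value close to 0, even when both samples have the same richness.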