A Deep Hierarchical Approach to Lifelong Learning in Minecraft
We propose a lifelong learning system that has the ability to reuse and
transfer knowledge from one task to another while efficiently retaining the
previously learned knowledge-base. Knowledge is transferred by learning
reusable skills to solve tasks in Minecraft, a popular video game which is an
unsolved and high-dimensional lifelong learning problem. These reusable skills,
which we refer to as Deep Skill Networks, are then incorporated into our novel
Hierarchical Deep Reinforcement Learning Network (H-DRLN) architecture using
two techniques: (1) a deep skill array and (2) skill distillation, our novel
variation of policy distillation (Rusu et al. 2015) for learning skills. Skill
distillation enables the H-DRLN to efficiently retain knowledge and therefore
scale in lifelong learning, by accumulating knowledge and encapsulating
multiple reusable skills into a single distilled network. The H-DRLN exhibits
superior performance and lower learning sample complexity compared to the
regular Deep Q Network (Mnih et al. 2015) in sub-domains of Minecraft.
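
To make the idea of distilling several skills into one network concrete, here is a minimal sketch of a distillation objective. It assumes the commonly used policy-distillation form: a KL divergence between a temperature-sharpened softmax over the teacher's Q-values and the student's softmax output, averaged over skills. The function names, the temperature value, and the one-output-head-per-skill layout are illustrative assumptions, not details taken from the abstract.

```python
import math

def softmax(values, temperature=1.0):
    """Numerically stable softmax of values / temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_q, student_logits, temperature=0.1):
    """KL(teacher || student) for one state.

    The teacher's Q-values are sharpened with a temperature (assumed value);
    the student's logits are used as-is.
    """
    p = softmax(teacher_q, temperature)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def skill_distillation_loss(teacher_outputs, student_outputs, temperature=0.1):
    """Average the per-skill loss over all skills.

    teacher_outputs[i] holds the Q-values of the i-th pre-trained skill network,
    student_outputs[i] the logits of the matching head of the single student
    (hypothetical layout: one head per skill).
    """
    losses = [
        distillation_loss(t, s, temperature)
        for t, s in zip(teacher_outputs, student_outputs)
    ]
    return sum(losses) / len(losses)
```

The loss is zero when every student head reproduces the sharpened teacher distribution and grows as the student's action preferences diverge from the teacher's.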
Progressive Label Distillation: Learning Input-Efficient Deep Neural Networks
Much of the focus in the area of knowledge distillation has been on
distilling knowledge from a larger teacher network to a smaller student
network. However, there has been little research on how the concept of
distillation can be leveraged to distill the knowledge encapsulated in the
training data itself into a reduced form. In this study, we explore the concept
of progressive label distillation, where we leverage a series of
teacher-student network pairs to progressively generate distilled training data
for learning deep neural networks with greatly reduced input dimensions. To
investigate the efficacy of the proposed progressive label distillation
approach, we experimented with learning a deep limited vocabulary speech
recognition network based on generated 500ms input utterances distilled
progressively from 1000ms source training data, and demonstrated a significant
increase in test accuracy of almost 78% compared to direct learning.
Comment: 9 page
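
The progressive scheme described above can be sketched as a loop over teacher-student stages. This is only an illustration under stated assumptions: the abstract does not say how inputs are shortened or how many stages are used, so the center crop, the list of intermediate lengths, and the `teacher` and `train` callables are all hypothetical placeholders.

```python
def center_crop(signal, new_len):
    """Keep the middle new_len samples of a 1-D signal (assumed shortening rule)."""
    start = (len(signal) - new_len) // 2
    return signal[start:start + new_len]

def distill_stage(inputs, teacher, new_len):
    """Label each input with the teacher's soft output, then shorten the input.

    Returns (shortened_input, soft_label) pairs for training the next student.
    """
    return [(center_crop(x, new_len), teacher(x)) for x in inputs]

def progressive_label_distillation(inputs, teacher, lengths, train):
    """Run one teacher-student stage per target length.

    `teacher` maps an input to soft labels; `train` (hypothetical) fits a
    student on the distilled pairs and returns it as a callable, which then
    acts as the teacher for the next, shorter stage.
    """
    data = []
    for new_len in lengths:
        data = distill_stage(inputs, teacher, new_len)
        teacher = train(data)
        inputs = [x for x, _ in data]
    return teacher, data
```

With inputs sampled at a fixed rate, going from 1000ms to 500ms corresponds to halving the number of samples, for example by passing lengths that step from the full sample count down to half of it.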