Distral: Robust Multitask Reinforcement Learning
Most deep reinforcement learning algorithms are data inefficient in complex
and rich environments, limiting their applicability to many scenarios. One
direction for improving data efficiency is multitask learning with shared
neural network parameters, where efficiency may be improved through transfer
across related tasks. In practice, however, this is not usually observed,
because gradients from different tasks can interfere negatively, making
learning unstable and sometimes even less data efficient. Another issue is the
different reward schemes between tasks, which can easily lead to one task
dominating the learning of a shared model. We propose a new approach for joint
training of multiple tasks, which we refer to as Distral (Distill & transfer
learning). Instead of sharing parameters between the different workers, we
propose to share a "distilled" policy that captures common behaviour across
tasks. Each worker is trained to solve its own task while constrained to stay
close to the shared policy, and the shared policy is trained by distillation
to be the centroid of all task policies. Both aspects of the learning process
are derived by optimizing a joint objective function. We show that our approach
supports efficient transfer on complex 3D environments, outperforming several
related methods. Moreover, the proposed learning process is more robust and
more stable, attributes that are critical in deep reinforcement learning.
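The "centroid" idea in the abstract above can be illustrated with a small sketch. The abstract does not say which divergence is used, so this example assumes the KL divergence KL(pi_i || pi_0) and a single state with three discrete actions; under that assumption, the shared policy pi_0 that minimises the summed divergence to the task policies is their arithmetic mean. The names `kl`, `centroid`, `total_kl`, and the example probabilities are hypothetical and are not taken from the paper.

```python
import math


def kl(p, q):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def centroid(policies):
    """Arithmetic mean of the task policies.

    Assumption: with KL(pi_i || pi_0) as the distance, this mean is the
    pi_0 that minimises the summed divergence to all task policies.
    """
    n = len(policies)
    return [sum(column) / n for column in zip(*policies)]


def total_kl(policies, shared):
    """Summed divergence from every task policy to a candidate shared policy."""
    return sum(kl(p, shared) for p in policies)


# Hypothetical action distributions of three task policies in one state.
task_policies = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]

shared = centroid(task_policies)
uniform = [1.0 / 3.0] * 3

print("shared policy:", shared)
print("total KL to shared :", total_kl(task_policies, shared))
print("total KL to uniform:", total_kl(task_policies, uniform))
```

In this toy setting the distilled policy is computed in closed form; in a deep reinforcement learning setup the shared policy would instead be a network trained by gradient steps toward the same kind of target.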
Contextual Motifs: Increasing the Utility of Motifs using Contextual Data
Motifs are a powerful tool for analyzing physiological waveform data.
Standard motif methods, however, ignore important contextual information (e.g.,
what the patient was doing at the time the data were collected). We hypothesize
that these additional contextual data could increase the utility of motifs.
Thus, we propose an extension to motifs, contextual motifs, that incorporates
context. Recognizing that, oftentimes, context may be unobserved or
unavailable, we focus on methods to jointly infer motifs and context. Applied
to both simulated and real physiological data, our proposed approach improves
upon existing motif methods in terms of the discriminative utility of the
discovered motifs. In particular, we discovered contextual motifs in continuous
glucose monitor (CGM) data collected from patients with type 1 diabetes.
Compared to their contextless counterparts, these contextual motifs led to
better predictions of hypo- and hyperglycemic events. Our results suggest that
even when inferred, context is useful over both long- and short-term prediction
horizons when processing and interpreting physiological waveform data.
Comment: 10 pages, 7 figures, accepted for oral presentation at KDD '1