42,076 research outputs found
CompILE: Compositional Imitation Learning and Execution
We introduce Compositional Imitation Learning and Execution (CompILE): a
framework for learning reusable, variable-length segments of
hierarchically-structured behavior from demonstration data. CompILE uses a
novel unsupervised, fully-differentiable sequence segmentation module to learn
latent encodings of sequential data that can be re-composed and executed to
perform new tasks. Once trained, our model generalizes to sequences of longer
length and from environment instances not seen during training. We evaluate
CompILE in a challenging 2D multi-task environment and a continuous control
task, and show that it can find correct task boundaries and event encodings in
an unsupervised manner. Latent codes and associated behavior policies
discovered by CompILE can be used by a hierarchical agent, where the high-level
policy selects actions in the latent code space, and the low-level,
task-specific policies are simply the learned decoders. We found that our
CompILE-based agent could learn given only sparse rewards, where agents without
task-specific policies struggle.Comment: ICML (2019
How hard is it to cross the room? -- Training (Recurrent) Neural Networks to steer a UAV
This work explores the feasibility of steering a drone with a (recurrent)
neural network, based on input from a forward looking camera, in the context of
a high-level navigation task. We set up a generic framework for training a
network to perform navigation tasks based on imitation learning. It can be
applied to both aerial and land vehicles. As a proof of concept we apply it to
a UAV (Unmanned Aerial Vehicle) in a simulated environment, learning to cross a
room containing a number of obstacles. So far only feedforward neural networks
(FNNs) have been used to train UAV control. To cope with more complex tasks, we
propose the use of recurrent neural networks (RNN) instead and successfully
train an LSTM (Long-Short Term Memory) network for controlling UAVs. Vision
based control is a sequential prediction problem, known for its highly
correlated input data. The correlation makes training a network hard,
especially an RNN. To overcome this issue, we investigate an alternative
sampling method during training, namely window-wise truncated backpropagation
through time (WW-TBPTT). Further, end-to-end training requires a lot of data
which often is not available. Therefore, we compare the performance of
retraining only the Fully Connected (FC) and LSTM control layers with networks
which are trained end-to-end. Performing the relatively simple task of crossing
a room already reveals important guidelines and good practices for training
neural control networks. Different visualizations help to explain the behavior
learned.Comment: 12 pages, 30 figure
Recovering from External Disturbances in Online Manipulation through State-Dependent Revertive Recovery Policies
Robots are increasingly entering uncertain and unstructured environments.
Within these, robots are bound to face unexpected external disturbances like
accidental human or tool collisions. Robots must develop the capacity to
respond to unexpected events. That is not only identifying the sudden anomaly,
but also deciding how to handle it. In this work, we contribute a recovery
policy that allows a robot to recovery from various anomalous scenarios across
different tasks and conditions in a consistent and robust fashion. The system
organizes tasks as a sequence of nodes composed of internal modules such as
motion generation and introspection. When an introspection module flags an
anomaly, the recovery strategy is triggered and reverts the task execution by
selecting a target node as a function of a state dependency chart. The new
skill allows the robot to overcome the effects of the external disturbance and
conclude the task. Our system recovers from accidental human and tool
collisions in a number of tasks. Of particular importance is the fact that we
test the robustness of the recovery system by triggering anomalies at each node
in the task graph showing robust recovery everywhere in the task. We also
trigger multiple and repeated anomalies at each of the nodes of the task
showing that the recovery system can consistently recover anywhere in the
presence of strong and pervasive anomalous conditions. Robust recovery systems
will be key enablers for long-term autonomy in robot systems. Supplemental info
including code, data, graphs, and result analysis can be found at [1].Comment: 8 pages, 8 figures, 1 tabl
An Overview on Application of Machine Learning Techniques in Optical Networks
Today's telecommunication networks have become sources of enormous amounts of
widely heterogeneous data. This information can be retrieved from network
traffic traces, network alarms, signal quality indicators, users' behavioral
data, etc. Advanced mathematical tools are required to extract meaningful
information from these data and take decisions pertaining to the proper
functioning of the networks from the network-generated data. Among these
mathematical tools, Machine Learning (ML) is regarded as one of the most
promising methodological approaches to perform network-data analysis and enable
automated network self-configuration and fault management. The adoption of ML
techniques in the field of optical communication networks is motivated by the
unprecedented growth of network complexity faced by optical networks in the
last few years. Such complexity increase is due to the introduction of a huge
number of adjustable and interdependent system parameters (e.g., routing
configurations, modulation format, symbol rate, coding schemes, etc.) that are
enabled by the usage of coherent transmission/reception technologies, advanced
digital signal processing and compensation of nonlinear effects in optical
fiber propagation. In this paper we provide an overview of the application of
ML to optical communications and networking. We classify and survey relevant
literature dealing with the topic, and we also provide an introductory tutorial
on ML for researchers and practitioners interested in this field. Although a
good number of research papers have recently appeared, the application of ML to
optical networks is still in its infancy: to stimulate further work in this
area, we conclude the paper proposing new possible research directions
Robot pain: a speculative review of its functions
Given the scarce bibliography dealing explicitly with robot pain, this chapter has enriched its review with related research works about robot behaviours and capacities in which pain could play a role. It is shown that all such roles ¿ranging from punishment to intrinsic motivation and planning knowledge¿ can be formulated within the unified framework of reinforcement learning.Peer ReviewedPostprint (author's final draft
Maximizing Competency Education and Blended Learning: Insights from Experts
In May 2014, CompetencyWorks brought together twenty-three technical assistance providers to examine their catalytic role in implementing next generation learning models, share each other's knowledge and expertise about blended learning and competency education, and discuss next steps to move the field forward with a focus on equity and quality. Our strategy maintains that by building the knowledge and networks of technical assistance providers, these groups can play an even more catalytic role in advancing the field. The objective of the convening was to help educate and level set the understanding of competency education and its design elements, as well as to build knowledge about using blended learning modalities within competency-based environments. This paper attempts to draw together the wide-ranging conversations from the convening to provide background knowledge for educators to understand what it will take to transform from traditional to personalized, competency-based systems that take full advantage of blended learning
Autonomous Fault Detection in Self-Healing Systems using Restricted Boltzmann Machines
Autonomously detecting and recovering from faults is one approach for
reducing the operational complexity and costs associated with managing
computing environments. We present a novel methodology for autonomously
generating investigation leads that help identify systems faults, and extends
our previous work in this area by leveraging Restricted Boltzmann Machines
(RBMs) and contrastive divergence learning to analyse changes in historical
feature data. This allows us to heuristically identify the root cause of a
fault, and demonstrate an improvement to the state of the art by showing
feature data can be predicted heuristically beyond a single instance to include
entire sequences of information.Comment: Published and presented in the 11th IEEE International Conference and
Workshops on Engineering of Autonomic and Autonomous Systems (EASe 2014
- …