Online Supervised Learning of Non-Understanding Recovery Policies

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

CompILE: Compositional Imitation Learning and Execution

Authors: Battaglia Peter; Dai Hanjun; Grefenstette Edward; Kipf Thomas; Kohli Pushmeet; Li Yujia; Sanchez-Gonzalez Alvaro; Zambaldi Vinicius
Publication date: 01/01/2019

We introduce Compositional Imitation Learning and Execution (CompILE): a framework for learning reusable, variable-length segments of hierarchically-structured behavior from demonstration data. CompILE uses a novel unsupervised, fully-differentiable sequence segmentation module to learn latent encodings of sequential data that can be re-composed and executed to perform new tasks. Once trained, our model generalizes to sequences of longer length and from environment instances not seen during training. We evaluate CompILE in a challenging 2D multi-task environment and a continuous control task, and show that it can find correct task boundaries and event encodings in an unsupervised manner. Latent codes and associated behavior policies discovered by CompILE can be used by a hierarchical agent, where the high-level policy selects actions in the latent code space, and the low-level, task-specific policies are simply the learned decoders. We found that our CompILE-based agent could learn given only sparse rewards, where agents without task-specific policies struggle. Comment: ICML (2019)

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

UvA-DARE

How hard is it to cross the room? -- Training (Recurrent) Neural Networks to steer a UAV

Authors: Kelchtermans Klaas; Tuytelaars Tinne
Publication date: 01/01/2017

This work explores the feasibility of steering a drone with a (recurrent) neural network, based on input from a forward-looking camera, in the context of a high-level navigation task. We set up a generic framework for training a network to perform navigation tasks based on imitation learning. It can be applied to both aerial and land vehicles. As a proof of concept we apply it to a UAV (Unmanned Aerial Vehicle) in a simulated environment, learning to cross a room containing a number of obstacles. So far only feedforward neural networks (FNNs) have been used to train UAV control. To cope with more complex tasks, we propose the use of recurrent neural networks (RNNs) instead and successfully train an LSTM (Long Short-Term Memory) network for controlling UAVs. Vision-based control is a sequential prediction problem, known for its highly correlated input data. The correlation makes training a network hard, especially an RNN. To overcome this issue, we investigate an alternative sampling method during training, namely window-wise truncated backpropagation through time (WW-TBPTT). Further, end-to-end training requires a lot of data, which is often not available. Therefore, we compare the performance of retraining only the Fully Connected (FC) and LSTM control layers with networks which are trained end-to-end. Performing the relatively simple task of crossing a room already reveals important guidelines and good practices for training neural control networks. Different visualizations help to explain the behavior learned. Comment: 12 pages, 30 figures

arXiv.org e-Print Archive

Lirias

Recovering from External Disturbances in Online Manipulation through State-Dependent Revertive Recovery Policies

Authors: Duan Shuangda; Guan Yisheng; Lin Hongbin; Luo Shuangqi; Rojas Juan; Wu Hongmin
Publication date: 02/04/2018

Robots are increasingly entering uncertain and unstructured environments. Within these, robots are bound to face unexpected external disturbances like accidental human or tool collisions. Robots must develop the capacity to respond to unexpected events. This means not only identifying the sudden anomaly, but also deciding how to handle it. In this work, we contribute a recovery policy that allows a robot to recover from various anomalous scenarios across different tasks and conditions in a consistent and robust fashion. The system organizes tasks as a sequence of nodes composed of internal modules such as motion generation and introspection. When an introspection module flags an anomaly, the recovery strategy is triggered and reverts the task execution by selecting a target node as a function of a state dependency chart. The new skill allows the robot to overcome the effects of the external disturbance and conclude the task. Our system recovers from accidental human and tool collisions in a number of tasks. Of particular importance is the fact that we test the robustness of the recovery system by triggering anomalies at each node in the task graph, showing robust recovery everywhere in the task. We also trigger multiple and repeated anomalies at each of the nodes of the task, showing that the recovery system can consistently recover anywhere in the presence of strong and pervasive anomalous conditions. Robust recovery systems will be key enablers for long-term autonomy in robot systems. Supplemental info including code, data, graphs, and result analysis can be found at [1]. Comment: 8 pages, 8 figures, 1 table

arXiv.org e-Print Archive

Crossref

An Overview on Application of Machine Learning Techniques in Optical Networks

Authors: Macaluso Irene; Musumeci Francesco; Nag Avishek; Rottondi Cristina; Ruffini Marco; Tornatore Massimo; Zibar Darko
Publication date: 01/12/2018

Today's telecommunication networks have become sources of enormous amounts of widely heterogeneous data. This information can be retrieved from network traffic traces, network alarms, signal quality indicators, users' behavioral data, etc. Advanced mathematical tools are required to extract meaningful information from these network-generated data and to make decisions pertaining to the proper functioning of the networks. Among these mathematical tools, Machine Learning (ML) is regarded as one of the most promising methodological approaches to perform network-data analysis and enable automated network self-configuration and fault management. The adoption of ML techniques in the field of optical communication networks is motivated by the unprecedented growth of network complexity faced by optical networks in the last few years. This increase in complexity is due to the introduction of a huge number of adjustable and interdependent system parameters (e.g., routing configurations, modulation format, symbol rate, coding schemes, etc.) that are enabled by the usage of coherent transmission/reception technologies, advanced digital signal processing and compensation of nonlinear effects in optical fiber propagation. In this paper we provide an overview of the application of ML to optical communications and networking. We classify and survey relevant literature dealing with the topic, and we also provide an introductory tutorial on ML for researchers and practitioners interested in this field. Although a good number of research papers have recently appeared, the application of ML to optical networks is still in its infancy: to stimulate further work in this area, we conclude the paper by proposing new possible research directions.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Online Research Database In Technology

Robot pain: a speculative review of its functions

Author: Torras Carme
Publication venue: 'Ovid Technologies (Wolters Kluwer Health)'
Publication date: 01/01/2016

Given the scarce bibliography dealing explicitly with robot pain, this chapter has enriched its review with related research works about robot behaviours and capacities in which pain could play a role. It is shown that all such roles, ranging from punishment to intrinsic motivation and planning knowledge, can be formulated within the unified framework of reinforcement learning. Peer Reviewed. Postprint (author's final draft)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Maximizing Competency Education and Blended Learning: Insights from Experts

Authors: Chris Sturgis; Susan Patrick
Publication venue: International Association for K-12 Online Learning
Publication date: 03/03/2015

In May 2014, CompetencyWorks brought together twenty-three technical assistance providers to examine their catalytic role in implementing next generation learning models, share each other's knowledge and expertise about blended learning and competency education, and discuss next steps to move the field forward with a focus on equity and quality. Our strategy maintains that by building the knowledge and networks of technical assistance providers, these groups can play an even more catalytic role in advancing the field. The objective of the convening was to help educate and level-set the understanding of competency education and its design elements, as well as to build knowledge about using blended learning modalities within competency-based environments. This paper attempts to draw together the wide-ranging conversations from the convening to provide background knowledge for educators to understand what it will take to transform from traditional to personalized, competency-based systems that take full advantage of blended learning.

IssueLab

Autonomous Fault Detection in Self-Healing Systems using Restricted Boltzmann Machines

Authors: Barker Adam; Dobson Simon; Schneider Chris
Publication date: 07/01/2015

Autonomously detecting and recovering from faults is one approach for reducing the operational complexity and costs associated with managing computing environments. We present a novel methodology for autonomously generating investigation leads that help identify system faults, and extend our previous work in this area by leveraging Restricted Boltzmann Machines (RBMs) and contrastive divergence learning to analyse changes in historical feature data. This allows us to heuristically identify the root cause of a fault, and demonstrate an improvement to the state of the art by showing feature data can be predicted heuristically beyond a single instance to include entire sequences of information. Comment: Published and presented in the 11th IEEE International Conference and Workshops on Engineering of Autonomic and Autonomous Systems (EASe 2014)

arXiv.org e-Print Archive

CiteSeerX

St Andrews Research Repository