19 research outputs found
Online quantum mixture regression for trajectory learning by demonstration
In this work, we present the online Quantum Mixture Model (oQMM), which combines the merits of quantum mechanics and stochastic optimization. More specifically, it allows for quantum effects on the mixture states, which in turn become a superposition of conventional mixture states. We propose an efficient stochastic online learning algorithm based on online Expectation Maximization (EM), as well as a generation and decay scheme for model components. Our method is suitable for complex robotic applications, where data is abundant or where we wish to iteratively refine our model and conduct predictions during the course of learning. With a synthetic example, we show that the algorithm can achieve higher numerical stability. We also empirically demonstrate the efficacy of our method on well-known regression benchmark datasets. Under a trajectory Learning by Demonstration setting, we employ a multi-shot learning application in joint angle space, where we observe higher quality of learning and reproduction. We compare against popular and well-established methods, widely adopted across the robotics community.
One-Shot Learning of Manipulation Skills with Online Dynamics Adaptation and Neural Network Priors
One of the key challenges in applying reinforcement learning to complex
robotic control tasks is the need to gather large amounts of experience in
order to find an effective policy for the task at hand. Model-based
reinforcement learning can achieve good sample efficiency, but requires the
ability to learn a model of the dynamics that is good enough to learn an
effective policy. In this work, we develop a model-based reinforcement learning
algorithm that combines prior knowledge from previous tasks with online
adaptation of the dynamics model. These two ingredients enable highly
sample-efficient learning even in regimes where estimating the true dynamics is
very difficult, since the online model adaptation allows the method to locally
compensate for unmodeled variation in the dynamics. We encode the prior
experience into a neural network dynamics model, adapt it online by
progressively refitting a local linear model of the dynamics, and use model
predictive control to plan under these dynamics. Our experimental results show
that this approach can be used to solve a variety of complex robotic
manipulation tasks in just a single attempt, using prior data from other
manipulation behaviors.
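The online adaptation step described in this abstract (progressively refitting a local linear model of the dynamics) can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in plain recursive least squares with a forgetting factor, uses a scalar state and control, and omits the neural network prior and the model predictive controller entirely. The class name `LocalLinearModel`, the function `demo`, and the ground-truth values 0.9 and 0.5 are all hypothetical, chosen only for illustration.

```python
import random


class LocalLinearModel:
    """Recursive least squares fit of x_next ~= a*x + b*u (scalar x and u).

    Illustrative assumption: a forgetting factor below 1 down-weights old
    samples so that the fit stays local to recent data.
    """

    def __init__(self, forgetting=0.95, prior_var=1e3):
        self.theta = [0.0, 0.0]  # current estimate of [a, b]
        self.P = [[prior_var, 0.0], [0.0, prior_var]]  # parameter covariance
        self.lam = forgetting

    def predict(self, x, u):
        return self.theta[0] * x + self.theta[1] * u

    def update(self, x, u, x_next):
        phi = [x, u]
        # P @ phi (P is symmetric, so this is also (phi^T P)^T)
        p_phi = [
            self.P[0][0] * phi[0] + self.P[0][1] * phi[1],
            self.P[1][0] * phi[0] + self.P[1][1] * phi[1],
        ]
        denom = self.lam + phi[0] * p_phi[0] + phi[1] * p_phi[1]
        gain = [p_phi[0] / denom, p_phi[1] / denom]
        err = x_next - self.predict(x, u)  # one-step prediction error
        self.theta = [
            self.theta[0] + gain[0] * err,
            self.theta[1] + gain[1] * err,
        ]
        # Covariance update: P <- (P - gain * phi^T P) / lambda
        self.P = [
            [(self.P[i][j] - gain[i] * p_phi[j]) / self.lam for j in range(2)]
            for i in range(2)
        ]


def demo(steps=50, seed=0):
    """Fit the model to noise-free data from hypothetical dynamics."""
    rng = random.Random(seed)
    model = LocalLinearModel()
    a_true, b_true = 0.9, 0.5  # hypothetical ground truth, not from the paper
    x = 1.0
    for _ in range(steps):
        u = rng.uniform(-1.0, 1.0)  # random control keeps the data informative
        x_next = a_true * x + b_true * u
        model.update(x, u, x_next)
        x = x_next
    return model.theta


if __name__ == "__main__":
    print(demo())  # estimates of [a, b], which should approach the true values
```

In a fuller system, the refitted model would be handed to a planner at every step; here the sketch stops at the estimation stage to keep the assumptions few.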
Improving human-robot interactivity for tele-operated industrial and service robot applications
In industrial robotics applications, the teach pendant has been widely used by human operators to pre-define action trajectories for robot manipulators to execute as primitives. This hard-coding approach is only good for low-mix-high-volume jobs with sparse trajectory way-points. In this paper, we present a novel industrial robotic system designed for applications where human-robot interaction is key for efficient execution of actions, such as high-mix-low-volume jobs. The proposed system comprises a robot manipulator that controls a tool (such as a soldering iron) to interact with the required workpiece, a networking server for remote tele-operation, and an integrated user interface that allows the human operator to better perceive the remote operation and to execute actions with greater ease. A user study is conducted to understand the merits of the proposed system. Results indicate that humans can operate the system with ease and complete tasks more quickly, and that the system can improve application efficiency.
A system of intelligent algorithms for a module of onboard equipment of mobile vehicles
The area of intelligent robotics is moving from the single robot control problem to that of controlling multiple robots operating together and even collaborating in dynamic and unstructured intelligent environments. In such conditions, an intelligent robot control system is only part of a general intelligent system. In this paper, we consider a model of such a system. © 2013 Anna Gorbenko
Towards Anthropomorphic Robot Thereminist
The Theremin is an electronic musical instrument considered to be the most difficult to play. It requires the player's hands to have high precision and stability, as any position change within proximity of the instrument's antennae can make a difference to the pitch or volume. In a different direction to previous developments of Theremin playing robots, we propose a Humanoid Thereminist System that goes beyond using only one degree of freedom, which opens up the possibility for the robot to acquire more complex skills, such as aerial fingering, and to include musical expressions in playing the Theremin. The proposed system consists of two phases, namely a calibration phase and a playing phase, which can be executed independently. During the playing phase, the system takes input from a MIDI file and performs path planning using a combination of a minimum energy strategy in joint space and feedback error correction for the next playing note. Three experiments have been conducted to evaluate the developed system quantitatively and qualitatively by playing a selection of music files. The experiments have demonstrated that the proposed system can effectively utilise multiple degrees of freedom while maintaining minimum pitch error margins.
Neural Task Programming: Learning to Generalize Across Hierarchical Tasks
In this work, we propose a novel robot learning framework called Neural Task
Programming (NTP), which bridges the idea of few-shot learning from
demonstration and neural program induction. NTP takes as input a task
specification (e.g., video demonstration of a task) and recursively decomposes
it into finer sub-task specifications. These specifications are fed to a
hierarchical neural program, where bottom-level programs are callable
subroutines that interact with the environment. We validate our method in three
robot manipulation tasks. NTP achieves strong generalization across sequential
tasks that exhibit hierarchical and compositional structures. The experimental
results show that NTP learns to generalize well towards unseen tasks with
increasing lengths, variable topologies, and changing objectives. Comment: ICRA 201