Spatio-temporal learning with the online finite and infinite echo-state Gaussian processes
Successful biological systems adapt to change. In this paper, we are principally concerned with adaptive systems that operate in environments where data arrives sequentially and is multivariate in nature, for example, sensory streams in robotic systems. We contribute two reservoir-inspired methods: 1) the online echo-state Gaussian process (OESGP) and 2) its infinite variant, the online infinite echo-state Gaussian process (OIESGP). Both algorithms are iterative fixed-budget methods that learn from noisy time series. In particular, the OESGP combines the echo-state network with Bayesian online learning for Gaussian processes. Extending this to infinite reservoirs yields the OIESGP, which uses a novel recursive kernel with automatic relevance determination that enables spatial and temporal feature weighting. When fused with stochastic natural gradient descent, the kernel hyperparameters are iteratively adapted to better model the target system. Furthermore, insights into the underlying system can be gleaned from inspection of the resulting hyperparameters. Experiments on noisy benchmark problems (one-step prediction and system identification) demonstrate that our methods yield high accuracies relative to state-of-the-art methods and to standard kernels with sliding windows, particularly on problems with irrelevant dimensions. In addition, we describe two case studies in robotic learning-by-demonstration involving the Nao humanoid robot and the Assistive Robot Transport for Youngsters (ARTY) smart wheelchair.
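The echo-state construction underlying OESGP can be sketched in a few lines: a fixed random reservoir with spectral radius below one supplies recurrent features, and a regularized linear readout (a stand-in here for the Bayesian GP readout the paper uses) performs one-step prediction. All dimensions, scalings, and the toy sine-wave task below are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration.
n_in, n_res = 1, 50

# Random reservoir weights, rescaled so the spectral radius is < 1
# (a common sufficient condition for the echo-state property).
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.normal(size=(n_res, n_in))

def reservoir_states(u):
    """Run an input sequence u (T x n_in) through the reservoir."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W @ x + W_in @ u_t)
        states.append(x)
    return np.array(states)

# One-step prediction on a noisy sine wave using a ridge readout
# (substituted here for the GP readout of the actual method).
t = np.linspace(0, 8 * np.pi, 400)
y = np.sin(t) + 0.05 * rng.normal(size=t.size)
X = reservoir_states(y[:-1, None])          # states from inputs y_0..y_{T-2}
target = y[1:]                              # predict y_1..y_{T-1}
ridge = 1e-6
w_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ target)
mse = np.mean((X @ w_out - target) ** 2)
```

A ridge readout has no predictive variance; the appeal of the GP readout in the paper is that it also quantifies uncertainty, which the fixed-budget online updates then maintain over time.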
Imitation and Mirror Systems in Robots through Deep Modality Blending Networks
Learning to interact with the environment not only empowers the agent with
manipulation capability but also generates information to facilitate building
of action understanding and imitation capabilities. This seems to be a strategy
adopted by biological systems, in particular primates, as evidenced by the
existence of mirror neurons that seem to be involved in multi-modal action
understanding. How to benefit from the interaction experience of the robots to
enable understanding actions and goals of other agents is still a challenging
question. In this study, we propose a novel method, deep modality blending
networks (DMBN), that creates a common latent space from multi-modal experience
of a robot by blending multi-modal signals with a stochastic weighting
mechanism. We show for the first time that deep learning, when combined with a
novel modality blending scheme, can facilitate action recognition and produce
structures to sustain anatomical and effect-based imitation capabilities. Our
proposed system can be conditioned on any desired sensory/motor value at any
time-step and can generate a complete multi-modal trajectory consistent with
the desired conditioning in parallel, avoiding accumulation of prediction
errors. We further show that, given desired images from different
perspectives, i.e. images generated by the observation of other robots placed
on different sides of the table, our system can generate image and joint
angle sequences that correspond to either anatomical or effect-based imitation
behavior. Overall, the proposed DMBN architecture not only serves as a
computational model for sustaining mirror neuron-like capabilities, but also
stands as a powerful machine learning architecture for high-dimensional
multi-modal temporal data with robust retrieval capabilities operating with
partial information in one or multiple modalities.
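The core blending idea can be illustrated minimally: encodings of two modalities are mixed with a random convex weight per example, so a decoder trained on the blend must cope with either modality alone or any mixture. The encoding shapes and the `blend` helper below are hypothetical, chosen only to show the stochastic weighting mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

def blend(z_a, z_b, rng):
    """Blend two modality representations with a stochastic convex weight.

    Each example gets its own weight w in [0, 1], so over training the
    shared latent space sees everything from pure z_a to pure z_b.
    """
    w = rng.uniform(size=(z_a.shape[0], 1))  # one weight per example
    return w * z_a + (1.0 - w) * z_b

# Hypothetical encodings of an image stream and a joint-angle stream.
z_image = rng.normal(size=(8, 16))
z_joint = rng.normal(size=(8, 16))
z_blend = blend(z_image, z_joint, rng)
```

Because the weight is convex, every blended element stays between the corresponding elements of the two modality encodings; setting w to 0 or 1 recovers conditioning on a single modality, which is what enables retrieval from partial information.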
Robot Learning from Human Demonstration: Interpretation, Adaptation, and Interaction
Robot Learning from Demonstration (LfD) is a research area that focuses on how robots can learn new skills by observing how people perform various activities. As humans, we have a remarkable ability to imitate other humans’ behaviors and adapt to new situations. Endowing robots with these critical capabilities is a significant but very challenging problem considering the complexity and variation of human activities in highly dynamic environments.
This research focuses on how robots can learn new skills by interpreting human activities, adapting the learned skills to new situations, and naturally interacting with humans. This dissertation begins with a discussion of challenges in each of these three problems. A new unified representation approach is introduced to enable robots to simultaneously interpret the high-level semantic meanings and generalize the low-level trajectories of a broad range of human activities. An adaptive framework based on feature space decomposition is then presented for robots to not only reproduce skills, but also autonomously and efficiently adjust the learned skills to new environments that are significantly different from demonstrations. To achieve natural Human Robot Interaction (HRI), this dissertation presents a Recurrent Neural Network based deep perceptual control approach, which is capable of integrating multi-modal perception sequences with actions for robots to interact with humans in long-term tasks.
Overall, by combining the above approaches, an autonomous system is created for robots to acquire important skills that can be applied to human-centered applications. Finally, this dissertation concludes with a discussion of future directions that could accelerate the upcoming technological revolution of robot learning from human demonstration.
Online Ensemble Learning of Sensorimotor Contingencies
Forward models play a key role in cognitive agents by providing predictions of the sensory consequences of motor commands, also known as sensorimotor contingencies (SMCs). In continuously evolving environments, the ability to anticipate is fundamental in distinguishing cognitive from reactive agents, and it is particularly relevant for autonomous robots, which must be able to adapt their models in an online manner. Online learning skills, highly accurate forward models, and multiple-step-ahead predictions are needed to enhance robots’ anticipation capabilities. We propose an online heterogeneous ensemble learning method for building accurate forward models of SMCs relating motor commands to effects in the robot’s sensorimotor system, in particular considering proprioception and vision. Our method achieves up to 98% higher accuracy in both short- and long-term predictions, compared to single predictors and other online and offline homogeneous ensembles. The method is validated on two different humanoid robots, namely the iCub and the Baxter.
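The online ensemble idea can be sketched with a multiplicative-weights (Hedge-style) update, which the paper's heterogeneous ensemble generalizes: each base forward model predicts the next sensory value, and weights shift toward whichever predictor is currently most accurate. The base models, the learning rate `eta`, and the toy linear system below are illustrative assumptions.

```python
import math

class OnlineEnsemble:
    """Minimal sketch of an online weighted ensemble of forward models."""

    def __init__(self, models, eta=1.0):
        self.models = models
        self.weights = [1.0 / len(models)] * len(models)
        self.eta = eta

    def predict(self, command):
        """Weighted ensemble prediction plus the individual predictions."""
        preds = [m(command) for m in self.models]
        return sum(w * p for w, p in zip(self.weights, preds)), preds

    def update(self, preds, observed):
        """Exponentially reweight models by their squared error."""
        losses = [(p - observed) ** 2 for p in preds]
        new = [w * math.exp(-self.eta * l)
               for w, l in zip(self.weights, losses)]
        total = sum(new)
        self.weights = [w / total for w in new]

# Two toy predictors: one matches the true system y = 2u, one is biased.
good = lambda u: 2.0 * u
bad = lambda u: 2.0 * u + 1.0
ens = OnlineEnsemble([good, bad])
for step in range(20):
    u = step * 0.1
    _, preds = ens.predict(u)
    ens.update(preds, 2.0 * u)  # observe the true effect online
```

After a few observations the weight mass concentrates on the accurate model, which is the mechanism that lets a heterogeneous ensemble outperform any single predictor as conditions change.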
Unsupervised human-to-robot motion retargeting via expressive latent space
This paper introduces a novel approach for human-to-robot motion retargeting,
enabling robots to mimic human motion with precision while preserving the
semantics of the motion. For that, we propose a deep learning method for direct
translation from human to robot motion. Our method does not require annotated
paired human-to-robot motion data, which reduces the effort when adopting new
robots. To this end, we first propose a cross-domain similarity metric to
compare the poses from different domains (i.e., human and robot). Then, our
method constructs a shared latent space via contrastive learning and decodes
latent representations into robot motion control commands. The learned latent
space is expressive: it captures motions precisely and allows direct motion
control in the latent space. We showcase how to generate in-between motion
through simple linear interpolation in the latent space between two projected
human poses. Additionally, we conduct a comprehensive evaluation of robot
control using diverse modality inputs, such as texts, RGB videos, and
key-poses, which makes robot control accessible to users of all backgrounds.
Finally, we compare our model with existing works and demonstrate,
quantitatively and qualitatively, the effectiveness of our approach,
enhancing natural human-robot communication and fostering trust in
integrating robots into daily life.
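Two ingredients of this pipeline are easy to sketch: a similarity metric over latent codes (cosine similarity stands in here for the paper's learned cross-domain metric) and in-between motion via linear interpolation between two projected poses. The latent dimension and the pose vectors below are hypothetical placeholders.

```python
import numpy as np

def cosine_sim(a, b):
    """Stand-in for a cross-domain similarity metric on latent codes."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def interpolate(z0, z1, steps):
    """In-between motion: linear interpolation in the latent space."""
    alphas = np.linspace(0.0, 1.0, steps)
    return np.stack([(1.0 - a) * z0 + a * z1 for a in alphas])

rng = np.random.default_rng(0)
z_start = rng.normal(size=8)  # hypothetical projected human pose
z_end = rng.normal(size=8)    # hypothetical target pose
path = interpolate(z_start, z_end, steps=5)
```

Each latent code on `path` would then be decoded into robot control commands; interpolation only yields plausible in-between motion when the latent space is smooth, which is what the contrastive training is meant to ensure.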