Symbol Emergence in Robotics: A Survey
Humans learn to use language through physical interaction with their
environment and semiotic communication with other people. It is therefore
important to obtain a computational understanding of how humans form a symbol
system and acquire semiotic skills through autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
smoothly communicate with human users over the long term both require an
understanding of the dynamics of symbol systems. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe state-of-the-art research
topics concerning SER, e.g., multimodal categorization, word discovery, and
double articulation analysis, which enable a robot to obtain words and their
embodied meanings from raw sensorimotor information, including visual, haptic,
and auditory information as well as acoustic speech signals, in a totally
unsupervised manner. Finally, we suggest future directions for research in SER.
Comment: submitted to Advanced Robotics
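As a concrete illustration of the multimodal categorization component, the sketch below fits a toy mixture of multinomials over per-modality bag-of-features histograms with EM, so that object categories emerge from visual and haptic counts without labels. It is a simplified stand-in for the multimodal LDA-style models used in this line of work; all dimensions, names, and data are illustrative assumptions, not the survey's implementation.

```python
# Minimal sketch: unsupervised multimodal categorization as a mixture of
# multinomials over per-modality count histograms (toy stand-in for
# multimodal LDA; all sizes and data are illustrative).
import numpy as np

def fit_multimodal_mixture(modalities, n_categories, n_iter=50, seed=0):
    """modalities: list of (n_objects, vocab_m) count matrices, one per modality."""
    rng = np.random.default_rng(seed)
    n = modalities[0].shape[0]
    resp = rng.dirichlet(np.ones(n_categories), size=n)   # soft assignments
    for _ in range(n_iter):
        # M-step: mixing weights and per-modality word distributions.
        pi = (resp.sum(0) + 1e-3) / (n + 1e-3 * n_categories)
        thetas = []
        for X in modalities:
            th = resp.T @ X + 1e-3                        # smoothed counts
            thetas.append(th / th.sum(1, keepdims=True))
        # E-step: combine log-likelihoods across all modalities.
        log_r = np.log(pi)[None, :]
        for X, th in zip(modalities, thetas):
            log_r = log_r + X @ np.log(th).T
        log_r -= log_r.max(1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(1, keepdims=True)
    return resp.argmax(1)                                 # category per object

# Toy usage: 10 objects seen through two modalities with different vocabularies.
rng = np.random.default_rng(1)
visual, haptic = rng.integers(0, 5, (10, 20)), rng.integers(0, 5, (10, 8))
print(fit_multimodal_mixture([visual, haptic], n_categories=3))
```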
Learning Human-Robot Collaboration Insights through the Integration of Muscle Activity in Interaction Motion Models
Recent progress in human-robot collaboration makes fast and fluid
interactions possible, even when human observations are partial and occluded.
Methods like Interaction Probabilistic Movement Primitives (ProMP) model human
trajectories through motion capture systems. However, such representations do
not properly model tasks in which similar motions handle different objects. Under
current approaches, a robot would not adapt its pose and dynamics for proper
handling. We integrate the use of Electromyography (EMG) into the Interaction
ProMP framework and utilize muscular signals to augment the human observation
representation. The contribution of our paper is increased task discernment
when trajectories are similar but tools are different and require the robot to
adjust its pose for proper handling. Interaction ProMPs are used with an
augmented vector that integrates muscle activity. Augmented time-normalized
trajectories are used in training to learn correlation parameters and robot
motions are predicted by finding the best weight combination and temporal
scaling for a task. Collaborative single task scenarios with similar motions
but different objects were used and compared. For one experiment only joint
angles were recorded, for the other EMG signals were additionally integrated.
Task recognition was computed for both tasks. Observation state vectors with
augmented EMG signals were able to completely identify differences across
tasks, while the baseline method failed every time. Integrating EMG signals
into collaborative tasks significantly increases the ability of the system to
recognize nuances in the tasks that are otherwise imperceptible, up to 74.6% in
our studies. Furthermore, the integration of EMG signals for collaboration also
opens the door to a wide class of human-robot physical interactions based on
haptic communication that has been largely unexploited in the field.
Comment: 7 pages, 2 figures, 2 tables. As submitted to Humanoids 201
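The following sketch illustrates the core mechanics described above under simplifying assumptions: human joint angles, EMG channels, and robot joints are stacked per timestep, each demonstration is compressed into radial-basis-function weights, and a Gaussian over those weights is conditioned on the observed human-plus-EMG block to predict the robot motion. Temporal scaling is omitted, and all dimensions and data are hypothetical, not the authors' setup.

```python
# Hedged sketch of an EMG-augmented Interaction-ProMP-style predictor.
import numpy as np

T, B = 100, 10                          # timesteps, basis functions
t = np.linspace(0, 1, T)
centers = np.linspace(0, 1, B)
Phi = np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2 * 0.05 ** 2))
Phi /= Phi.sum(1, keepdims=True)        # (T, B) normalized RBF time basis

def demo_weights(Y):
    """Least-squares basis weights per dimension; Y is (T, D) -> (D*B,)."""
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)    # (B, D)
    return W.T.ravel()                             # dimension-major weights

# Hypothetical demos: 6 human joints + 4 EMG channels + 7 robot joints.
rng = np.random.default_rng(0)
demos = [rng.standard_normal((T, 17)).cumsum(0) for _ in range(20)]
Ws = np.stack([demo_weights(Y) for Y in demos])
mu, Sigma = Ws.mean(0), np.cov(Ws.T) + 1e-6 * np.eye(Ws.shape[1])

# Predict: condition the Gaussian over weights on the observed human block
# (joints + EMG, the first 10 dimensions) to recover the robot weights.
n_obs = 10 * B
w_obs = demo_weights(rng.standard_normal((T, 17)).cumsum(0))[:n_obs]
w_robot = mu[n_obs:] + Sigma[n_obs:, :n_obs] @ np.linalg.solve(
    Sigma[:n_obs, :n_obs], w_obs - mu[:n_obs])
robot_traj = Phi @ w_robot.reshape(7, B).T         # predicted (T, 7) motion
print(robot_traj.shape)
```

Including the EMG channels in the observed block is what lets conditioning separate tasks whose joint-angle trajectories are nearly identical.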
MILD: Multimodal Interactive Latent Dynamics for Learning Human-Robot Interaction
Modeling interaction dynamics to generate robot trajectories that enable a
robot to adapt and react to a human's actions and intentions is critical for
efficient and effective collaborative Human-Robot Interactions (HRI). Learning
from Demonstration (LfD) methods from Human-Human Interactions (HHI) have shown
promising results, especially when coupled with representation learning
techniques. However, such methods for learning HRI either do not scale well to
high-dimensional data or cannot accurately adapt to changing via-poses of the
interacting partner. We propose Multimodal Interactive Latent Dynamics (MILD),
a method that couples deep representation learning and probabilistic machine
learning to address the problem of two-party physical HRIs. We learn the
interaction dynamics from demonstrations, using Hidden Semi-Markov Models
(HSMMs) to model the joint distribution of the interacting agents in the latent
space of a Variational Autoencoder (VAE). Our experimental evaluations for
learning HRI from HHI demonstrations show that MILD effectively captures the
multimodality in the latent representations of HRI tasks, allowing us to decode
the varying dynamics occurring in such tasks. Compared to related work, MILD
generates more accurate trajectories for the controlled agent (robot) when
conditioned on the observed agent's (human) trajectory. Notably, MILD can learn
directly from camera-based pose estimations to generate trajectories, which we
then map to a humanoid robot without the need for any additional training.
Comment: Accepted at the IEEE-RAS International Conference on Humanoid Robots (Humanoids) 202
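A heavily simplified sketch of this pipeline is given below: the trained VAE encoders are stubbed out as fixed linear maps, and the HSMM over the joint latent space is collapsed to a single joint Gaussian, which keeps the key step, conditioning the robot's latent state on the human's observed latent state, easy to see. Every name and dimension is an illustrative assumption rather than the MILD implementation.

```python
# Toy stand-in for MILD: joint Gaussian over paired latents instead of an
# HSMM, fixed linear maps instead of trained VAE encoders.
import numpy as np

rng = np.random.default_rng(0)
d_obs, d_z = 30, 4                      # per-agent observation / latent size
Eh = rng.standard_normal((d_z, d_obs))  # stub "encoder" for the human
Er = rng.standard_normal((d_z, d_obs))  # stub "encoder" for the robot

def encode(E, x):                       # stand-in for a trained VAE encoder
    return E @ x

# "Training": embed paired human/robot frames and fit one joint Gaussian
# (a single-segment stand-in for the HSMM over the latent space).
pairs = [(rng.standard_normal(d_obs), rng.standard_normal(d_obs))
         for _ in range(500)]
Z = np.stack([np.concatenate([encode(Eh, h), encode(Er, r)]) for h, r in pairs])
mu, Sigma = Z.mean(0), np.cov(Z.T)

# "Test": observe only the human, infer the robot latent by Gaussian
# conditioning; a trained decoder (omitted here) would map it to joint space.
z_h = encode(Eh, rng.standard_normal(d_obs))
S_hh, S_rh = Sigma[:d_z, :d_z], Sigma[d_z:, :d_z]
z_r = mu[d_z:] + S_rh @ np.linalg.solve(S_hh, z_h - mu[:d_z])
print(z_r)                              # conditioned robot latent state
```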
SERKET: An Architecture for Connecting Stochastic Models to Realize a Large-Scale Cognitive Model
To realize human-like robot intelligence, a large-scale cognitive
architecture is required for robots to understand the environment through a
variety of sensors with which they are equipped. In this paper, we propose a
novel framework named Serket that makes it easy to construct a large-scale
generative model and to perform inference in it by connecting sub-modules,
allowing robots to acquire various capabilities through interaction with their
environments and with others. We consider that large-scale cognitive models can be
constructed by connecting smaller fundamental models hierarchically while
maintaining their programmatic independence. Because connected modules depend
on each other, however, their parameters must be optimized as a whole.
Conventionally, the equations for parameter estimation have to be derived and
implemented separately for each model, and this becomes harder as the scale of
the model grows. To solve this problem, we propose a method for parameter
estimation that communicates only the minimal parameters between modules while
maintaining their programmatic independence. Serket thus makes it easy to
construct large-scale models and estimate their parameters by connecting
modules. Experimental results demonstrated that models can be constructed by
connecting modules, that their parameters can be optimized as a whole, and that
their performance is comparable with that of the original models we previously
proposed.
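The sketch below illustrates the module-connection idea in a toy form: a clustering module and a word module remain programmatically independent and jointly optimize their parameters by exchanging only minimal messages (soft category assignments forward, per-category likelihoods backward). The module internals are invented stand-ins, not Serket's actual models or API.

```python
# Toy sketch of Serket-style module connection via minimal message passing.
import numpy as np

class Categorizer:
    """Head module: soft-clusters its own modality, forwards assignments."""
    def __init__(self, n_cat, dim, seed=0):
        self.means = np.random.default_rng(seed).standard_normal((n_cat, dim))
    def forward(self, X, back_msg=None):
        d = ((X[:, None, :] - self.means[None]) ** 2).sum(-1)
        resp = np.exp(-0.5 * d)                    # unnormalized responsibilities
        if back_msg is not None:                   # fold in downstream evidence
            resp = resp * back_msg
        resp /= resp.sum(1, keepdims=True)
        self.means = (resp.T @ X) / (resp.sum(0)[:, None] + 1e-9)  # local M-step
        return resp                                # minimal forward message

class WordModel:
    """Tail module: ties categories to word histograms, replies with likelihoods."""
    def __init__(self):
        self.theta = None
    def backward(self, words, resp):
        th = resp.T @ words + 1e-3                 # smoothed per-category counts
        self.theta = th / th.sum(1, keepdims=True)
        return np.exp(words @ np.log(self.theta).T)  # minimal backward message

# Joint optimization purely by message passing between independent modules.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 6))                   # e.g., visual features
words = rng.integers(0, 3, (50, 12))               # e.g., word histograms
head, tail, msg = Categorizer(n_cat=4, dim=6), WordModel(), None
for _ in range(10):
    resp = head.forward(X, back_msg=msg)
    msg = tail.backward(words, resp)
print(resp.argmax(1))
```

Neither class reads the other's internals; swapping in a different tail module only requires that it consume and produce messages of the same shape.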
Show, Attend and Interact: Perceivable Human-Robot Social Interaction through Neural Attention Q-Network
For safe, natural, and effective human-robot social interaction, it is
essential to develop a system that allows a robot to demonstrate perceivable
responsive behaviors to complex human behaviors. We introduce the Multimodal
Deep Attention Recurrent Q-Network (MDARQN), with which the robot exhibits
human-like social interaction skills after 14 days of interacting with people
in an uncontrolled real-world environment. Each day during this period, the
system gathered the robot's interaction experiences with people through trial
and error and then trained the MDARQN on these experiences using an end-to-end
reinforcement learning approach. The results of this interaction-based learning
indicate that the robot learned to respond to complex human behaviors in a
perceivable and socially acceptable manner.
Comment: 7 pages, 5 figures, accepted by IEEE-RAS ICRA'1
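As a rough illustration of such an architecture, the PyTorch sketch below combines two convolutional streams for two visual modalities, a soft attention layer conditioned on an LSTM state, and a linear Q-value head. Layer sizes, the choice of modalities, and the action count are placeholder assumptions, not the network from the paper, and the reinforcement learning training loop is omitted.

```python
# Speculative sketch of a multimodal attention recurrent Q-network;
# all dimensions and modalities are placeholders, not the paper's model.
import torch
import torch.nn as nn

class MDARQNSketch(nn.Module):
    def __init__(self, n_actions=4, d=64):
        super().__init__()
        self.vision = nn.Sequential(nn.Conv2d(3, 16, 8, stride=4), nn.ReLU(),
                                    nn.Conv2d(16, 32, 4, stride=2), nn.ReLU())
        self.depth = nn.Sequential(nn.Conv2d(1, 16, 8, stride=4), nn.ReLU(),
                                   nn.Conv2d(16, 32, 4, stride=2), nn.ReLU())
        self.attn = nn.Linear(32 + d, 1)   # scores image regions given memory
        self.rnn = nn.LSTMCell(32, d)      # recurrent state over the episode
        self.q = nn.Linear(d, n_actions)   # Q-values for social actions

    def forward(self, rgb, depth, state):
        h, c = state
        # Region features from both modalities: (batch, regions, channels).
        feats = torch.cat([self.vision(rgb).flatten(2),
                           self.depth(depth).flatten(2)], dim=2).transpose(1, 2)
        # Soft attention conditioned on the recurrent state.
        scores = self.attn(torch.cat(
            [feats, h.unsqueeze(1).expand(-1, feats.size(1), -1)], dim=2))
        ctx = (torch.softmax(scores, dim=1) * feats).sum(1)
        h, c = self.rnn(ctx, (h, c))
        return self.q(h), (h, c)

# Toy forward pass with 84x84 inputs and a zero-initialized recurrent state.
net = MDARQNSketch()
h, c = torch.zeros(1, 64), torch.zeros(1, 64)
q, state = net(torch.zeros(1, 3, 84, 84), torch.zeros(1, 1, 84, 84), (h, c))
print(q.shape)  # torch.Size([1, 4])
```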