Early Turn-taking Prediction with Spiking Neural Networks for Human Robot Collaboration
Turn-taking is essential to the structure of human teamwork. Humans are
typically aware of team members' intention to keep or relinquish their turn
before a turn switch, where the responsibility of working on a shared task is
shifted. Future co-robots are also expected to provide such competence. To that
end, this paper proposes the Cognitive Turn-taking Model (CTTM), which
leverages cognitive models (i.e., Spiking Neural Networks) to achieve early
turn-taking prediction. The CTTM framework can process multimodal human
communication cues (both implicit and explicit) and predict human turn-taking
intentions in an early stage. The proposed framework is tested on a simulated
surgical procedure, where a robotic scrub nurse predicts the surgeon's
turn-taking intention. It was found that the proposed CTTM framework
outperforms the state-of-the-art turn-taking prediction algorithms by a large
margin. It also outperforms humans when presented with partial observations of
communication cues (i.e., less than 40% of full actions). This early prediction
capability enables robots to initiate turn-taking actions at an early stage,
which facilitates collaboration and increases overall efficiency.
Comment: Submitted to IEEE International Conference on Robotics and Automation (ICRA) 201
Explorations in engagement for humans and robots
This paper explores the concept of engagement, the process by which
individuals in an interaction start, maintain and end their perceived
connection to one another. The paper reports on one aspect of engagement among
human interactors--the effect of tracking faces during an interaction. It also
describes the architecture of a robot that can participate in conversational,
collaborative interactions with engagement gestures. Finally, the paper reports
on findings of experiments with human participants who interacted with a robot
when it either performed or did not perform engagement gestures. Results of the
human-robot studies indicate that people become engaged with robots: they
direct their attention to the robot more often in interactions where engagement
gestures are present, and they find interactions more appropriate when
engagement gestures are present than when they are not.
Comment: 31 pages, 5 figures, 3 tables
Analyzing Input and Output Representations for Speech-Driven Gesture Generation
This paper presents a novel framework for automatic speech-driven gesture
generation, applicable to human-agent interaction including both virtual agents
and robots. Specifically, we extend recent deep-learning-based, data-driven
methods for speech-driven gesture generation by incorporating representation
learning. Our model takes speech as input and produces gestures as output, in
the form of a sequence of 3D coordinates. Our approach consists of two steps.
First, we learn a lower-dimensional representation of human motion using a
denoising autoencoder neural network, consisting of a motion encoder MotionE
and a motion decoder MotionD. The learned representation preserves the most
important aspects of the human pose variation while removing less relevant
variation. Second, we train a novel encoder network SpeechE to map from speech
to a corresponding motion representation with reduced dimensionality. At test
time, the speech encoder and the motion decoder networks are combined: SpeechE
predicts motion representations based on a given speech signal and MotionD then
decodes these representations to produce motion sequences. We evaluate
different representation sizes in order to find the most effective
dimensionality for the representation. We also evaluate the effects of using
different speech features as input to the model. We find that mel-frequency
cepstral coefficients (MFCCs), alone or combined with prosodic features,
perform the best. The results of a subsequent user study confirm the benefits
of the representation learning.
Comment: Accepted at IVA '19. Shorter version published at AAMAS '19. The code is available at
https://github.com/GestureGeneration/Speech_driven_gesture_generation_with_autoencode
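The two-step pipeline described in this abstract can be illustrated with a minimal sketch of how the networks are composed at test time. This is not the authors' implementation: the dimensions, the single-layer networks, and the random weights standing in for trained parameters are all assumptions made for illustration. Only the names MotionE, MotionD and SpeechE, and the test-time composition of SpeechE followed by MotionD, come from the abstract.

```python
import math
import random

random.seed(0)

# Hypothetical sizes, chosen only to keep the example small.
POSE_DIM = 6     # flattened 3D coordinates per frame
LATENT_DIM = 3   # reduced-dimensionality motion representation
SPEECH_DIM = 4   # speech features per frame (e.g. MFCCs)


def random_matrix(rows, cols):
    """Random weights standing in for trained parameters (assumption)."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(frames, weights):
    """Multiply each frame (a row vector) by a (input_dim x output_dim) matrix."""
    columns = list(zip(*weights))
    return [[sum(x * w for x, w in zip(frame, col)) for col in columns]
            for frame in frames]


def tanh_all(frames):
    return [[math.tanh(x) for x in frame] for frame in frames]


W_motion_e = random_matrix(POSE_DIM, LATENT_DIM)
W_motion_d = random_matrix(LATENT_DIM, POSE_DIM)
W_speech_e = random_matrix(SPEECH_DIM, LATENT_DIM)


def motion_e(poses):
    """Step 1 encoder: pose frames -> lower-dimensional representation."""
    return tanh_all(linear(poses, W_motion_e))


def motion_d(latents):
    """Step 1 decoder: representation -> pose frames."""
    return linear(latents, W_motion_d)


def speech_e(speech_frames):
    """Step 2 encoder: speech features -> motion representation."""
    return tanh_all(linear(speech_frames, W_speech_e))


def generate_gestures(speech_frames):
    """Test time: SpeechE predicts representations, MotionD decodes them."""
    return motion_d(speech_e(speech_frames))


speech = [[random.gauss(0.0, 1.0) for _ in range(SPEECH_DIM)] for _ in range(5)]
gestures = generate_gestures(speech)
```

Note that MotionE is needed only during training, where it defines the representation that SpeechE learns to predict; at test time it drops out of the pipeline.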
Flexible human-robot cooperation models for assisted shop-floor tasks
The Industry 4.0 paradigm emphasizes the crucial benefits that collaborative
robots, i.e., robots able to work alongside and together with humans, could
bring to the whole production process. In this context, an enabling technology
that has not yet been achieved is the design of flexible robots able to deal at all levels with
humans' intrinsic variability, which is not only a necessary element for a
comfortable working experience for the person but also a precious capability
for efficiently dealing with unexpected events. In this paper, a sensing,
representation, planning and control architecture for flexible human-robot
cooperation, referred to as FlexHRC, is proposed. FlexHRC relies on wearable
sensors for human action recognition, AND/OR graphs for the representation of
and reasoning upon cooperation models, and a Task Priority framework to
decouple action planning from robot motion planning and control.
Comment: Submitted to Mechatronics (Elsevier)
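As a rough illustration of how an AND/OR graph can represent a cooperation model, the sketch below checks whether a goal state is reached given the actions completed so far. It is a generic AND/OR graph traversal, not FlexHRC's actual data structure or reasoning algorithm: the task names, the dictionary encoding, and the `is_solved` helper are all hypothetical, and the graph is assumed to be acyclic.

```python
# Each node maps to a list of alternatives (OR); each alternative is a list
# of child nodes that must all be solved (AND). Leaves have no entry.
# The task and action names below are hypothetical.
cooperation_model = {
    "table_assembled": [["legs_attached", "top_placed"]],
    "legs_attached": [["human_screws_legs"], ["robot_screws_legs"]],
}


def is_solved(node, graph, done):
    """Return True if `node` is reached given the set of completed actions.

    A node is solved if it has been completed directly, or if at least one
    of its alternatives has every child solved. Assumes an acyclic graph.
    """
    if node in done:
        return True
    return any(
        all(is_solved(child, graph, done) for child in alternative)
        for alternative in graph.get(node, [])
    )


# Either agent may attach the legs, which models flexibility in who acts.
robot_path = is_solved("table_assembled", cooperation_model,
                       {"robot_screws_legs", "top_placed"})
human_path = is_solved("table_assembled", cooperation_model,
                       {"human_screws_legs", "top_placed"})
incomplete = is_solved("table_assembled", cooperation_model, {"top_placed"})
```

The OR branches are what let such a model tolerate human variability: the same goal remains reachable whichever agent performs a step.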