2,435 research outputs found
Choreographic and Somatic Approaches for the Development of Expressive Robotic Systems
As robotic systems are moved out of factory work cells into human-facing
environments questions of choreography become central to their design,
placement, and application. With a human viewer or counterpart present, a
system will automatically be interpreted within context, style of movement, and
form factor by human beings as animate elements of their environment. The
interpretation by this human counterpart is critical to the success of the
system's integration: knobs on the system need to make sense to a human
counterpart; an artificial agent should have a way of notifying a human
counterpart of a change in system state, possibly through motion profiles; and
the motion of a human counterpart may have important contextual clues for task
completion. Thus, professional choreographers, dance practitioners, and
movement analysts are critical to research in robotics. They have design
methods for movement that align with human audience perception, can identify
simplified features of movement for human-robot interaction goals, and have
detailed knowledge of the capacity of human movement. This article provides
approaches employed by one research lab, specific impacts on technical and
artistic projects within, and principles that may guide future such work. The
background section reports on choreography, somatic perspectives,
improvisation, the Laban/Bartenieff Movement System, and robotics. From this
context methods including embodied exercises, writing prompts, and community
building activities have been developed to facilitate interdisciplinary
research. The results of this work is presented as an overview of a smattering
of projects in areas like high-level motion planning, software development for
rapid prototyping of movement, artistic output, and user studies that help
understand how people interpret movement. Finally, guiding principles for other
groups to adopt are posited.Comment: Under review at MDPI Arts Special Issue "The Machine as Artist (for
the 21st Century)"
http://www.mdpi.com/journal/arts/special_issues/Machine_Artis
The perception of emotion in artificial agents
Given recent technological developments in robotics, artificial intelligence and virtual reality, it is perhaps unsurprising that the arrival of emotionally expressive and reactive artificial agents is imminent. However, if such agents are to become integrated into our social milieu, it is imperative to establish an understanding of whether and how humans perceive emotion in artificial agents. In this review, we incorporate recent findings from social robotics, virtual reality, psychology, and neuroscience to examine how people recognize and respond to emotions displayed by artificial agents. First, we review how people perceive emotions expressed by an artificial agent, such as facial and bodily expressions and vocal tone. Second, we evaluate the similarities and differences in the consequences of perceived emotions in artificial compared to human agents. Besides accurately recognizing the emotional state of an artificial agent, it is critical to understand how humans respond to those emotions. Does interacting with an angry robot induce the same responses in people as interacting with an angry person? Similarly, does watching a robot rejoice when it wins a game elicit similar feelings of elation in the human observer? Here we provide an overview of the current state of emotion expression and perception in social robotics, as well as a clear articulation of the challenges and guiding principles to be addressed as we move ever closer to truly emotional artificial agents
Teaching Robots to Span the Space of Functional Expressive Motion
Our goal is to enable robots to perform functional tasks in emotive ways, be
it in response to their users' emotional states, or expressive of their
confidence levels. Prior work has proposed learning independent cost functions
from user feedback for each target emotion, so that the robot may optimize it
alongside task and environment specific objectives for any situation it
encounters. However, this approach is inefficient when modeling multiple
emotions and unable to generalize to new ones. In this work, we leverage the
fact that emotions are not independent of each other: they are related through
a latent space of Valence-Arousal-Dominance (VAD). Our key idea is to learn a
model for how trajectories map onto VAD with user labels. Considering the
distance between a trajectory's mapping and a target VAD allows this single
model to represent cost functions for all emotions. As a result 1) all user
feedback can contribute to learning about every emotion; 2) the robot can
generate trajectories for any emotion in the space instead of only a few
predefined ones; and 3) the robot can respond emotively to user-generated
natural language by mapping it to a target VAD. We introduce a method that
interactively learns to map trajectories to this latent space and test it in
simulation and in a user study. In experiments, we use a simple vacuum robot as
well as the Cassie biped
A Comprehensive Review of Data-Driven Co-Speech Gesture Generation
Gestures that accompany speech are an essential part of natural and efficient
embodied human communication. The automatic generation of such co-speech
gestures is a long-standing problem in computer animation and is considered an
enabling technology in film, games, virtual social spaces, and for interaction
with social robots. The problem is made challenging by the idiosyncratic and
non-periodic nature of human co-speech gesture motion, and by the great
diversity of communicative functions that gestures encompass. Gesture
generation has seen surging interest recently, owing to the emergence of more
and larger datasets of human gesture motion, combined with strides in
deep-learning-based generative models, that benefit from the growing
availability of data. This review article summarizes co-speech gesture
generation research, with a particular focus on deep generative models. First,
we articulate the theory describing human gesticulation and how it complements
speech. Next, we briefly discuss rule-based and classical statistical gesture
synthesis, before delving into deep learning approaches. We employ the choice
of input modalities as an organizing principle, examining systems that generate
gestures from audio, text, and non-linguistic input. We also chronicle the
evolution of the related training data sets in terms of size, diversity, motion
quality, and collection method. Finally, we identify key research challenges in
gesture generation, including data availability and quality; producing
human-like motion; grounding the gesture in the co-occurring speech in
interaction with other speakers, and in the environment; performing gesture
evaluation; and integration of gesture synthesis into applications. We
highlight recent approaches to tackling the various key challenges, as well as
the limitations of these approaches, and point toward areas of future
development.Comment: Accepted for EUROGRAPHICS 202
Advanced Content and Interface Personalization through Conversational Behavior and Affective Embodied Conversational Agents
Conversation is becoming one of the key interaction modes in HMI. As a result, the conversational agents (CAs) have become an important tool in various everyday scenarios. From Apple and Microsoft to Amazon, Google, and Facebook, all have adapted their own variations of CAs. The CAs range from chatbots and 2D, carton-like implementations of talking heads to fully articulated embodied conversational agents performing interaction in various concepts. Recent studies in the field of face-to-face conversation show that the most natural way to implement interaction is through synchronized verbal and co-verbal signals (gestures and expressions). Namely, co-verbal behavior represents a major source of discourse cohesion. It regulates communicative relationships and may support or even replace verbal counterparts. It effectively retains semantics of the information and gives a certain degree of clarity in the discourse. In this chapter, we will represent a model of generation and realization of more natural machine-generated output
- …