Integration of Action and Language Knowledge: A Roadmap for Developmental Robotics
This position paper proposes that the study of embodied cognitive agents, such as humanoid robots, can advance our understanding of the cognitive development of complex sensorimotor, linguistic, and social learning skills. This in turn will benefit the design of cognitive robots capable of learning to handle and manipulate objects and tools autonomously, to cooperate and communicate with other robots and humans, and to adapt their abilities to changing internal, environmental, and social conditions. Four key areas of research challenges are discussed, specifically the issues related to understanding: 1) how agents learn and represent compositional actions; 2) how agents learn and represent compositional lexica; 3) the dynamics of social interaction and learning; and 4) how compositional action and language representations are integrated to bootstrap the cognitive system. The review of specific issues and progress in these areas is then translated into a practical roadmap based on a series of milestones. These milestones provide a possible set of cognitive robotics goals and test scenarios, thus acting as a research roadmap for future work on cognitive developmental robotics. Peer reviewed.
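To make the notion of compositionality in challenges 1, 2, and 4 concrete, here is a minimal, hypothetical sketch of a lexicon that grounds verbs in composable motor primitives. The primitive names, the LEXICON mapping, and the interpret function are illustrative assumptions, not the representation proposed in the paper.

```python
# Minimal, hypothetical sketch of a compositional action-language
# mapping. Primitive names and the lexicon are illustrative assumptions,
# not the representation proposed in the paper.

# Motor primitives: small, reusable units of behaviour.
def reach(obj):        return f"reach({obj})"
def grasp(obj):        return f"grasp({obj})"
def lift(obj):         return f"lift({obj})"
def place(obj, goal):  return f"place({obj},{goal})"

# Compositional lexicon: each verb names a plan template built from
# primitives, so novel commands are handled by recombination rather
# than rote lookup.
LEXICON = {
    "take": lambda obj, goal=None: [reach(obj), grasp(obj), lift(obj)],
    "put":  lambda obj, goal=None: [reach(obj), grasp(obj), place(obj, goal)],
}

def interpret(verb, obj, goal=None):
    """Ground a verb-object command into a motor-primitive sequence."""
    return LEXICON[verb](obj, goal)

print(interpret("take", "cup"))          # ['reach(cup)', 'grasp(cup)', 'lift(cup)']
print(interpret("put", "cup", "table"))  # ['reach(cup)', 'grasp(cup)', 'place(cup,table)']
```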
Emerging Linguistic Functions in Early Infancy
This paper presents results from experimental studies on early language acquisition in infants and attempts to interpret the experimental results within the framework of the Ecological Theory of Language Acquisition (ETLA) recently proposed by Lacerda et al. (2004a). From this perspective, the infant's first steps in the acquisition of the ambient language are seen as a consequence of the infant's general capacity to represent sensory input and the infant's interaction with other actors in its immediate ecological environment. On the basis of available experimental evidence, it will be argued that ETLA offers a productive alternative to traditional descriptive views of the language acquisition process by presenting an operative model of how early linguistic function may emerge through interaction.
Audiovisual integration of emotional signals from others' social interactions
Audiovisual perception of emotions has typically been examined using displays of a solitary character (e.g., the face-voice and/or body-sound of one actor). However, in real life humans often face more complex multisensory social situations, involving more than one person. Here we ask whether the audiovisual facilitation of emotion recognition previously found in simpler social situations extends to more complex and ecological situations. Stimuli consisting of the biological motion and voices of two interacting agents were used in two experiments. In Experiment 1, participants were presented with visual, auditory, auditory filtered/noisy, and audiovisual congruent and incongruent clips. We asked participants to judge whether the two agents were interacting happily or angrily. In Experiment 2, another group of participants repeated the same task as in Experiment 1 while trying to ignore either the visual or the auditory information. The findings from both experiments indicate that when the reliability of the auditory cue was decreased, participants weighted the visual cue more heavily in their emotional judgments. This in turn translated into increased emotion recognition accuracy for the multisensory condition. Our findings thus point to a common mechanism of multisensory integration of emotional signals irrespective of social stimulus complexity.
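The reliability-driven re-weighting described here matches the textbook maximum-likelihood model of cue combination, in which each cue is weighted by its inverse variance. A minimal sketch of that standard model follows; the numeric values are illustrative, not data or analysis code from these experiments.

```python
# Sketch of standard reliability-weighted (maximum-likelihood) cue
# integration, often used to model multisensory judgments.
# The example values are illustrative, not data from the experiments.

def integrate(estimates, variances):
    """Combine cue estimates weighted by reliability (inverse variance)."""
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    fused = sum(w * x for w, x in zip(weights, estimates))
    fused_var = 1.0 / total  # fused estimate is more reliable than either cue
    return fused, weights, fused_var

# "Happiness" evidence from vision and audition (arbitrary units).
visual, auditory = 0.8, 0.4

# Clear audio: both cues count equally.
print(integrate([visual, auditory], variances=[1.0, 1.0]))
# Degraded (filtered/noisy) audio: its variance rises, so the visual
# weight grows and the fused judgment shifts toward the visual cue.
print(integrate([visual, auditory], variances=[1.0, 4.0]))
```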
Augmented Kinesthetic Teaching: Enhancing Task Execution Efficiency through Intuitive Human Instructions
In this paper, we present a complete and efficient implementation of a knowledge-sharing augmented kinesthetic teaching approach for task execution in robotics. Our augmented kinesthetic teaching method integrates intuitive human feedback, including verbal, gesture, gaze, and physical guidance, to facilitate the extraction of multiple layers of task information, including control type, attention direction, input and output type, action state change trigger, etc., enhancing the adaptability and autonomy of robots during task execution. We propose an efficient Programming by Demonstration (PbD) framework that allows users with limited technical experience to teach the robot in an intuitive manner. The proposed framework provides an interface for such users to teach customized tasks using high-level commands, with the goal of achieving a smoother teaching experience and task execution. This is demonstrated with the sample task of pouring water.
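The layers of task information the abstract enumerates suggest a structured per-segment task record. A hypothetical sketch of such a record follows; the field names and the pouring-water values are our illustrative reading of the abstract, not the authors' implementation.

```python
# Hypothetical sketch of the task-information layers named in the
# abstract (control type, attention direction, I/O type, state-change
# trigger). Field names and values are illustrative, not the authors' code.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskSegment:
    control_type: str          # e.g. "position" vs. "force" control
    attention_direction: str   # where the user's gaze/gesture pointed
    input_type: str            # cue that starts the segment
    output_type: str           # effect the segment should produce
    state_change_trigger: str  # condition that ends the segment
    demo_waypoints: List[tuple] = field(default_factory=list)

# A pouring-water task, as in the paper's sample scenario, might be
# segmented roughly like this:
pour = [
    TaskSegment("position", "bottle", "verbal: 'grab the bottle'",
                "bottle grasped", "grip force threshold reached"),
    TaskSegment("force", "cup rim", "gaze at cup",
                "water poured", "target fill level reached"),
]
for seg in pour:
    print(seg.control_type, "->", seg.state_change_trigger)
```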
Aerospace medicine and biology: A continuing bibliography with indexes, supplement 125
This special bibliography lists 323 reports, articles, and other documents introduced into the NASA scientific and technical information system in January 1974.
Acting rehearsal in collaborative multimodal mixed reality environments
This paper presents the use of our multimodal mixed reality telecommunication system to support remote acting rehearsal. The rehearsals involved two actors, located in London and Barcelona, and a director in another location in London. This triadic audiovisual telecommunication was performed in a spatial and multimodal collaborative mixed reality environment based on the 'destination-visitor' paradigm, which we define and put into use. We detail our heterogeneous system architecture, which spans the three distributed and technologically asymmetric sites and features a range of capture, display, and transmission technologies. The actors' and director's experiences of rehearsing a scene via the system are then discussed, exploring the successes and failures of this heterogeneous form of telecollaboration. Overall, the common spatial frame of reference the system presented to all parties was highly conducive to theatrical acting and directing, allowing blocking, gross gesture, and unambiguous instruction to be issued. The relative inexpressivity of the actors' embodiments was identified as the central limitation of the telecommunication, meaning that moments relying on performing and reacting to consequential facial expression and subtle gesture were less successful.
UR-FUNNY: A Multimodal Language Dataset for Understanding Humor
Humor is a unique and creative communicative behavior displayed during social interactions. It is produced in a multimodal manner, through the use of words (text), gestures (vision), and prosodic cues (acoustic). Understanding humor from these three modalities falls within the boundaries of multimodal language, a recent research trend in natural language processing that models natural language as it happens in face-to-face communication. Although humor detection is an established research area in NLP, in a multimodal context it is understudied. This paper presents a diverse multimodal dataset, called UR-FUNNY, to open the door to understanding the multimodal language used in expressing humor. The dataset and accompanying studies present a framework for multimodal humor detection for the natural language processing community. UR-FUNNY is publicly available for research.
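To make the three-modality setup concrete, here is a minimal feature-level fusion sketch for binary humor detection. The encoders, feature dimensions, and linear probe are illustrative assumptions, not the dataset's actual feature pipeline or baseline model.

```python
# Minimal feature-level fusion sketch over the three modalities
# UR-FUNNY covers: text, vision (gestures), and acoustics (prosody).
# Encoders, dimensions, and the probe are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for real per-modality encoders (e.g., word embeddings,
# facial/gesture descriptors, prosodic features).
def encode_text(utterance):  return rng.standard_normal(300)
def encode_vision(clip):     return rng.standard_normal(75)
def encode_acoustic(audio):  return rng.standard_normal(81)

def fuse(utterance, clip, audio):
    """Concatenate per-modality features (simple early fusion)."""
    return np.concatenate([encode_text(utterance),
                           encode_vision(clip),
                           encode_acoustic(audio)])

x = fuse("punchline text", "clip.mp4", "clip.wav")

# A linear probe over the fused vector scores humor vs. non-humor.
w, b = rng.standard_normal(x.shape[0]), 0.0
p_humor = 1.0 / (1.0 + np.exp(-(w @ x + b)))  # sigmoid
print(x.shape, float(p_humor))
```

Real models for this task typically fuse the modalities with sequence models rather than a single concatenation; the sketch only illustrates the three-stream input structure.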