88,054 research outputs found
Multi-Modal Human-Machine Communication for Instructing Robot Grasping Tasks
A major challenge for the realization of intelligent robots is to supply them
with cognitive abilities in order to allow ordinary users to program them
easily and intuitively. One way of such programming is teaching work tasks by
interactive demonstration. To make this effective and convenient for the user,
the machine must be capable to establish a common focus of attention and be
able to use and integrate spoken instructions, visual perceptions, and
non-verbal clues like gestural commands. We report progress in building a
hybrid architecture that combines statistical methods, neural networks, and
finite state machines into an integrated system for instructing grasping tasks
by man-machine interaction. The system combines the GRAVIS-robot for visual
attention and gestural instruction with an intelligent interface for speech
recognition and linguistic interpretation, and an modality fusion module to
allow multi-modal task-oriented man-machine communication with respect to
dextrous robot manipulation of objects.Comment: 7 pages, 8 figure
Presenting the networked home: a content analysis of promotion material of Ambient Intelligence applications
Ambient Intelligence (AmI) for the home uses information and communication technologies to make users’ everyday life more comfortable. AmI is still in its developmental phase and is headed towards the first stages of diffusion. \ud
Characteristics of AmI design can be observed, among others, in the promotion material of initial producers. A literature study revealed that AmI originally envisioned a central role for the user, convenience that AmI offers them and that attention should be paid to critical policy issues such as privacy and a potential loss of freedom. A content analysis of current promotion material of several high-tech companies revealed that these original ideas are not all reflected in the material. Attributes which were used most in the promotion material were ‘connectedness’, ‘control’, ‘easiness’ and ‘personalization’. An analysis of the pictures in the promotion material showed that almost half of the pictures contained no humans but appliances. These results only partly correspond to the original vision on AmI, since the emphasis is now on technology. The results represent a serious problem, since both users, as well as critical policy issues are underexposed in the current promotion material
Speech-Gesture Mapping and Engagement Evaluation in Human Robot Interaction
A robot needs contextual awareness, effective speech production and
complementing non-verbal gestures for successful communication in society. In
this paper, we present our end-to-end system that tries to enhance the
effectiveness of non-verbal gestures. For achieving this, we identified
prominently used gestures in performances by TED speakers and mapped them to
their corresponding speech context and modulated speech based upon the
attention of the listener. The proposed method utilized Convolutional Pose
Machine [4] to detect the human gesture. Dominant gestures of TED speakers were
used for learning the gesture-to-speech mapping. The speeches by them were used
for training the model. We also evaluated the engagement of the robot with
people by conducting a social survey. The effectiveness of the performance was
monitored by the robot and it self-improvised its speech pattern on the basis
of the attention level of the audience, which was calculated using visual
feedback from the camera. The effectiveness of interaction as well as the
decisions made during improvisation was further evaluated based on the
head-pose detection and interaction survey.Comment: 8 pages, 9 figures, Under review in IRC 201
Hyperpresent avatars
This paper will discuss two student projects, which were developed during a hybrid course between art/design and computer sciences at Sabancı University; both of which involve the creation of two avatars whose visual attributes are determined by data feeds from ‘Real Life’ sources by following up from Biocca's concept of the Cyborg’s Dilemma, we will describe the creative and technological processes which went into the materialization of these two avatars
Semi-Supervised Speech Emotion Recognition with Ladder Networks
Speech emotion recognition (SER) systems find applications in various fields
such as healthcare, education, and security and defense. A major drawback of
these systems is their lack of generalization across different conditions. This
problem can be solved by training models on large amounts of labeled data from
the target domain, which is expensive and time-consuming. Another approach is
to increase the generalization of the models. An effective way to achieve this
goal is by regularizing the models through multitask learning (MTL), where
auxiliary tasks are learned along with the primary task. These methods often
require the use of labeled data which is computationally expensive to collect
for emotion recognition (gender, speaker identity, age or other emotional
descriptors). This study proposes the use of ladder networks for emotion
recognition, which utilizes an unsupervised auxiliary task. The primary task is
a regression problem to predict emotional attributes. The auxiliary task is the
reconstruction of intermediate feature representations using a denoising
autoencoder. This auxiliary task does not require labels so it is possible to
train the framework in a semi-supervised fashion with abundant unlabeled data
from the target domain. This study shows that the proposed approach creates a
powerful framework for SER, achieving superior performance than fully
supervised single-task learning (STL) and MTL baselines. The approach is
implemented with several acoustic features, showing that ladder networks
generalize significantly better in cross-corpus settings. Compared to the STL
baselines, the proposed approach achieves relative gains in concordance
correlation coefficient (CCC) between 3.0% and 3.5% for within corpus
evaluations, and between 16.1% and 74.1% for cross corpus evaluations,
highlighting the power of the architecture
- …