Audience - Performer Engagement in Live Dance
PhD thesis. In live performances, seated audiences have restricted opportunities for response, most commonly cheering and applause at the end. However, audiences make other apparently incidental movements during a performance, such as fixing their hair, adjusting glasses, scratching ears, supporting their chin or shifting in the chair to change posture. The question we address here is whether these apparently incidental movements provide systematic clues about people's level of engagement with a performance. Our programmatic hypothesis is that audiences' ongoing responses are part of a bi-directional system of audience-performer communication that distinguishes live from recorded performance. What could performers be detecting in these situations that informs their dynamic sense of how well a performance is going? Existing audience research has mostly focused on non-visible or self-reported responses, while little is known about overt audience responses. The main aim of this research is to uncover these audience responses and examine whether they may indicate audience engagement and thereby form part of a feedback cycle between the performers and their audience. This thesis investigates this in the hardest case, contemporary dance, where the production and setting should make audience responses hard to detect. A series of live performance studies is conducted in real theatrical settings in the UK. This requires developing methods capable of capturing continuous responses from the audience and the dancers and of making sense of the resulting multi-modal data. Video recordings of performers and audience are analysed using computer vision techniques to extract face and body movement data, while audience hand movement is captured using specialised wearable devices.
The results show that while there is no systematic relationship between the responses of audience and dancers, audience members' body movements do signal their levels of engagement to the dancers. The empirical findings of this thesis provide evidence that stillness and blank expressions are characteristic markers of cognitive engagement during performance, whereas movement and hand-to-face gestures typically signal restlessness or boredom. This work argues that the audience's overt responses matter and are an important characteristic of the live experience. The audience responses disclosed in this thesis can provide a systematic basis for designing for audiences and suggest new forms of live experience more focused on the audience. Funded by the EPSRC as part of the Doctoral Training Centre in Media and Arts Technology at Queen Mary University of London (ref: EP/G03723X/1).
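The engagement markers above rest on quantifying how much an audience member moves. As a minimal sketch (not the thesis's actual pipeline; function names and the threshold are illustrative), frame-to-frame displacement of tracked landmarks can be averaged into a motion-energy score, with low energy read as stillness:

```python
# Hypothetical sketch: a coarse "stillness" proxy from tracked landmark
# trajectories, in the spirit of the finding that stillness can mark
# engagement. All names and thresholds here are illustrative only.

def motion_energy(frames):
    """Mean per-frame landmark displacement across a clip.

    frames: list of landmark lists, each landmark an (x, y) tuple.
    """
    if len(frames) < 2:
        return 0.0
    total = 0.0
    count = 0
    for prev, curr in zip(frames, frames[1:]):
        for (x0, y0), (x1, y1) in zip(prev, curr):
            total += ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
            count += 1
    return total / count

def is_still(frames, threshold=1.0):
    """Label a clip 'still' when average displacement stays below threshold."""
    return motion_energy(frames) < threshold

still_clip = [[(10.0, 10.0)], [(10.1, 10.0)], [(10.1, 10.1)]]
restless_clip = [[(10.0, 10.0)], [(15.0, 12.0)], [(20.0, 14.0)]]
print(is_still(still_clip), is_still(restless_clip))  # True False
```

In practice the landmarks would come from a face/body tracker and the threshold would be calibrated per recording.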
A contribution to robust person perception on real-world assistance robots in domestic scenarios
The increased life expectancy of the population and declining birth rates are leading to a growing proportion of elderly people in modern society, and hence a growing need for aged care. Mobile robots can assist users in their homes by means of services and companionship. To provide useful functionalities, the robot must be able to observe the user in the environment. The domestic scenario poses a challenge for people detection and tracking algorithms through its complexity, caused, among other factors, by varied furnishings, difficult lighting conditions and variable user poses. This thesis presents an architecture for people detection and tracking for mobile robots in domestic environments. The modular architecture describes the components used and their communication with each other; owing to this modularity, individual components can be easily integrated or exchanged. The work evaluates a variety of multi-modal detection methods based on laser, camera and 3D depth data. Suitable algorithms are applied, adapted and enhanced. The detections are processed by a person tracker for spatio-temporal filtering. Important features of the person tracker include support for multiple filters and system models, the integration of coupled observations and out-of-sequence measurements, the estimation of existence probability, and the integration of environmental knowledge.
The thesis proposes various methods to search for and locate users in the apartment who have left the robot's limited field of view. The most advanced method uses an exploratory search to examine the environment effectively; it handles false-positive detections, dynamic obstacles and inaccessible rooms in a reasonable manner. Furthermore, this work presents a method to detect people who have fallen to the ground, even under occlusion. The method uses the depth data of a Kinect sensor mounted on the mobile robot: point clouds are segmented, layered and classified to distinguish fallen people from furniture, household objects and animals. The developed algorithms were evaluated in a real-world scenario by letting the robot stay in seniors' homes for unrestricted use for up to three days. The experiments showed that the presented architecture for people detection and tracking is robust enough for the robot's services to provide added value to the senior citizens.
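The fallen-person classification step can be illustrated with a deliberately simplified sketch: summarise a floor-level point-cloud segment by its bounding box and separate a lying person from small objects or animals by extent and height. The thresholds and function names are invented for illustration, not taken from the thesis:

```python
# Hypothetical sketch of the layered-classification idea: a floor-level
# point-cloud segment is summarised by its bounding box, and a lying
# person is distinguished from smaller objects/animals by its footprint
# length and low height. Thresholds are illustrative only.

def bounding_box(points):
    """Axis-aligned bounding box extents (dx, dy, dz) of a 3D point list."""
    xs, ys, zs = zip(*points)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))

def looks_like_fallen_person(points,
                             min_length=1.0,   # metres along the floor
                             max_height=0.5):  # metres above the floor
    """A lying person is long in one floor axis but low overall."""
    dx, dy, dz = bounding_box(points)
    return max(dx, dy) >= min_length and dz <= max_height

# A ~1.7 m long, 0.3 m high segment vs. a small box-shaped object.
person_like = [(0.0, 0.0, 0.0), (1.7, 0.4, 0.0), (0.8, 0.2, 0.3)]
object_like = [(0.0, 0.0, 0.0), (0.3, 0.3, 0.0), (0.15, 0.15, 0.25)]
print(looks_like_fallen_person(person_like))  # True
print(looks_like_fallen_person(object_like))  # False
```

A real system would first segment the Kinect point cloud into clusters and would add richer per-layer features, as the abstract describes.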
Affective Signals as Implicit Indicators of Information Relevancy and Information Processing Strategies
Search engines have become better at providing information to users; however, they still face major challenges such as determining how searchers process information, how they make relevance judgments, and how their cognitive or emotional states affect their search progress. We address these challenges by exploring searchers' affective dimension. In particular, we investigate how feelings, facial expressions, and electrodermal activity (EDA) could help us understand information relevancy, search progress, and information processing strategies (IPSs). To meet this goal, we designed an experiment with 45 participants exposed to affective stimuli prior to solving a precision-oriented search task. Results indicate that initial affective states are linked to IPSs. In addition, we found that smiles act as implicit indicators of information relevancy and IPSs. Moreover, results convey that both smiles and EDA may serve as implicit indicators of progress and completion of search tasks. Findings from this work have practical implications in areas such as personalization and relevance feedback.
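One common way to turn a raw EDA trace into a usable signal is to count skin-conductance response peaks. The sketch below is purely illustrative (the study's actual EDA processing is not specified here): a peak is a sample that rises above both neighbours by a minimum amplitude.

```python
# Hypothetical sketch: counting skin-conductance response peaks in an EDA
# trace as a coarse arousal signal. The minimum-rise criterion and values
# are invented for illustration.

def count_eda_peaks(signal, min_rise=0.05):
    """Count local maxima exceeding both neighbours by min_rise (microsiemens)."""
    peaks = 0
    for i in range(1, len(signal) - 1):
        if (signal[i] - signal[i - 1] >= min_rise and
                signal[i] - signal[i + 1] >= min_rise):
            peaks += 1
    return peaks

trace = [0.10, 0.12, 0.30, 0.15, 0.14, 0.40, 0.20, 0.19]
print(count_eda_peaks(trace))  # 2
```

Production pipelines would typically low-pass filter the signal and separate tonic from phasic components before peak detection.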
Emotions in context: examining pervasive affective sensing systems, applications, and analyses
Pervasive sensing has opened up new opportunities for measuring our feelings and understanding our behavior by monitoring our affective states while mobile. This review paper surveys pervasive affect sensing by examining three major elements of affective pervasive systems: "sensing", "analysis", and "application". Sensing investigates the different sensing modalities used in existing real-time affective applications; Analysis explores different approaches to emotion recognition and visualization based on different types of collected data; and Application investigates the leading areas of affective applications. For each of the three aspects, the paper includes an extensive survey of the literature and finally outlines some of the challenges and future research opportunities of affective sensing in the context of pervasive computing.
Who's afraid of job interviews? Definitely a question for user modelling
We define job interviews as a domain of interaction that can be modelled automatically in a serious game for job interview skills training. We present four types of studies: (1) field-based human-to-human job interviews, (2) field-based computer-mediated human-to-human interviews, (3) lab-based wizard-of-oz studies, and (4) field-based human-to-agent studies. Together, these highlight pertinent questions for the user modelling field as it expands its scope to applications for social inclusion. The results of the studies show that interviewees suppress their emotional behaviours, and although our system automatically recognises a subset of those behaviours, the modelling of complex mental states in real-world contexts poses a challenge for state-of-the-art user modelling technologies. This calls for a re-examination of both how such models are implemented and how they are used in the target contexts.
Design and Experimental Evaluation of a Context-aware Social Gaze Control System for a Humanlike Robot
Nowadays, social robots are increasingly being developed for a variety of human-centered scenarios in which they interact with people. For this reason, they should possess the ability to perceive and interpret human verbal and non-verbal communicative cues in a humanlike way. In addition, they should be able to autonomously identify the most important interactional target at the proper time by exploiting the perceptual information, and to exhibit a believable behavior accordingly. Employing a social robot with such capabilities has several positive outcomes for human society.
This thesis presents a multilayer context-aware gaze control system implemented as part of a humanlike social robot. Using this system, the robot is able to mimic human perception, attention, and gaze behavior in a dynamic multiparty social interaction.
The system enables the robot to direct its gaze appropriately, and at the right time, toward environmental targets and toward the humans who are interacting with each other and with the robot. To this end, the attention mechanism of the gaze control system is based on features that have been proven to guide human attention: verbal and non-verbal cues, proxemics, the effective field of view, the habituation effect, and low-level visual features. The gaze control system uses skeleton tracking, speech recognition, facial expression recognition, and salience detection to implement these features.
As part of a pilot evaluation, the gaze behavior of 11 participants was collected with a professional eye-tracking device while they watched a video of two-person interactions. By analyzing the participants' average gaze behavior, the importance of human-relevant features in triggering human attention was determined. Based on this finding, the parameters of the gaze control system were tuned to imitate human behavior in selecting features of the environment.
The comparison between the human gaze behavior and the gaze behavior of the developed system running on the same videos shows that the proposed approach is promising, as it replicated human gaze behavior 89% of the time.
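The core selection step of such a multilayer system can be sketched as a weighted scoring of candidate targets over cues of the kind listed above. The feature names and weights below are invented for illustration; they are not the thesis's tuned parameters:

```python
# Hypothetical sketch: each candidate gaze target gets a weighted attention
# score from cues such as speech activity, gesturing, proximity, and
# low-level salience; the robot gazes at the highest-scoring target.
# Weights and cue names are illustrative only.

WEIGHTS = {"speaking": 3.0, "gesturing": 1.5, "proximity": 1.0, "salience": 0.5}

def attention_score(cues):
    """Weighted sum of a target's cue values (each in [0, 1])."""
    return sum(WEIGHTS[name] * value for name, value in cues.items())

def select_gaze_target(targets):
    """Pick the target with the highest attention score."""
    return max(targets, key=lambda name: attention_score(targets[name]))

scene = {
    "person_A": {"speaking": 1.0, "gesturing": 0.0, "proximity": 0.6, "salience": 0.2},
    "person_B": {"speaking": 0.0, "gesturing": 1.0, "proximity": 0.8, "salience": 0.4},
    "object":   {"speaking": 0.0, "gesturing": 0.0, "proximity": 0.3, "salience": 0.9},
}
print(select_gaze_target(scene))  # person_A
```

A habituation effect, as mentioned above, could be added by decaying a target's score while it is being fixated.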
Ein Computermodell für die Simulation von emotionalen Angleichungsprozessen in der Mensch-Roboter Interaktion (A Computational Model for Simulating Emotional Alignment Processes in Human-Robot Interaction)
Damm O. Ein Computermodell für die Simulation von emotionalen Angleichungsprozessen in der Mensch-Roboter Interaktion. Bielefeld: Universität Bielefeld; 2014. For more than 20 years there have been various approaches to making virtual agents and humanoid robots appear more social and more human, and research toward this goal proceeds in several directions. Artificial interaction partners have learned to hear and to speak in order to make interaction more pleasant and easier; models have been developed that make complex dialogues with them possible; and, not least, various emotion models have been implemented. Many models of artificial emotions attempt to compute an internal emotional state from different inputs and to have the robot display it. These models range from the discrete OCC model, in which emotions are an evaluative reaction to the consequences of events, the actions of agents, or aspects of objects, to multidimensional models that try to simulate natural emotions. For the OCC model this means, for example, that an action leads more or less strongly to an emotion. In the dimensional models, the emotional state is modelled as a point in a three-dimensional space; perceived actions or events move this point through the space, and emotions are assigned to different regions of it. The model presented here is based on findings previously obtained in several empirical studies. It simulates the emotional alignment processes that can be observed in human-human interaction. The aim is thus not to "give" the robot an emotional state; rather, the focus is on the interaction and on the effect a displayed emotion has on it. To this end, a layer model was implemented that enables the robot to react in an emotionally appropriate way in different situations. It uses (facial) mimicry so that the robot is perceived more positively and in order to establish social bonding. Furthermore, emotions are used to influence the interaction in a targeted way and to react to events that are not directly part of the interaction.
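The dimensional account described above (a point in a three-dimensional emotion space, moved by perceived events, with emotions as regions of the space) can be sketched minimally. The event deltas, the pleasure/arousal/dominance axis convention, and the region labels below are illustrative assumptions, not the thesis's model:

```python
# Hypothetical sketch of a dimensional emotion model: the state is a point
# in a 3-D pleasure/arousal/dominance (PAD) cube, events shift the point,
# and coarse emotion labels are attached to regions. All deltas and labels
# are invented for illustration.

def move_state(state, delta, lo=-1.0, hi=1.0):
    """Shift the PAD point by an event's delta, clamped to the cube."""
    return tuple(min(hi, max(lo, s + d)) for s, d in zip(state, delta))

def label_region(state):
    """Map an octant of PAD space to a coarse emotion label."""
    p, a, _ = state
    if p >= 0:
        return "joy" if a >= 0 else "contentment"
    return "anger" if a >= 0 else "sadness"

state = (0.0, 0.0, 0.0)
state = move_state(state, (0.6, 0.4, 0.1))    # a pleasant, arousing event
print(label_region(state))  # joy
state = move_state(state, (-1.2, 0.3, -0.2))  # a strongly negative event
print(label_region(state))  # anger
```

An alignment mechanism of the kind the thesis describes would additionally nudge this point toward the state the robot perceives in its human partner.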
Human-robot interaction system based on multimodal and adaptive dialogs
Mención Internacional en el título de doctor.
In recent years, in the Human-Robot Interaction (HRI) area, there has been growing interest in interactions involving users who are not technologically trained in robotic systems. For these users, interaction techniques that require no specific prior knowledge are necessary: no technological skill should be assumed of the user; the only interactive skill that can be assumed is the one used to interact with other humans. The techniques developed and presented in this work aim, on the one hand, for the system/robot to express itself in a way that these users can understand without extra effort compared with interacting with people, and, on the other, for the system/robot to interpret what these users express without their having to communicate differently than they would with another person. In short, the goal is to imitate how humans interact.
In this thesis a natural interaction system, called the Robotics Dialog System (RDS), has been developed and tested. It allows robot-user interaction through the various communication channels available. The complete system consists of several modules that work in a coordinated, complementary way to achieve the desired natural interaction. RDS lives inside a robotic control architecture and communicates with the other systems that compose it: decision making, sequencing, communication, games, sensory perception, expression, etc. This thesis contributes to the state of the art at two levels. At the higher level, it presents the human-robot interaction system (RDS) based on multimodal dialogs. At the lower level, each chapter describes the components developed specifically for RDS, each of which contributes to the state of the art in its own field; before each contribution, it was necessary to integrate and/or implement the prior state of the art in that field. Most of these contributions are backed by publications in scientific journals.
The first field of work, which evolved throughout the research, was Natural Language Processing. The most important automatic speech recognition (ASR) systems were analysed and tested in real situations; some of them were then integrated into RDS through a subsystem that runs several ASR engines concurrently, with the twin goals of improving recognition accuracy and providing several complementary input methods. The research continued by adapting the interaction to different types of microphones and acoustic environments, and the system was extended to recognise speech in multiple languages and to identify the user by voice.
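One simple way to combine concurrent ASR engines, sketched below with invented engine outputs and confidences (the thesis's actual fusion strategy is not detailed here), is confidence-weighted voting over the hypotheses:

```python
# Hypothetical sketch: several ASR engines transcribe the same utterance,
# and their hypotheses are merged by confidence-weighted voting. Engine
# names, transcripts, and confidences are invented for illustration.

from collections import defaultdict

def merge_hypotheses(hypotheses):
    """hypotheses: list of (transcript, confidence) pairs from the engines.

    Returns the normalised transcript with the highest summed confidence.
    """
    scores = defaultdict(float)
    for text, confidence in hypotheses:
        scores[text.strip().lower()] += confidence
    return max(scores, key=scores.get)

engine_outputs = [
    ("turn on the light", 0.80),   # engine A
    ("turn on the light", 0.70),   # engine B
    ("turn off the light", 0.90),  # engine C
]
print(merge_hypotheses(engine_outputs))  # turn on the light
```

Agreement between two weaker engines outweighs a single confident outlier here, which is the usual argument for running engines concurrently.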
The next field addressed was natural language generation. The objective was a speech synthesis system with a degree of naturalness and intelligibility, support for several languages and voice timbres, and the ability to express emotions. A modular system capable of integrating several speech synthesis engines was built. To give the output some naturalness and expressive variability, a template mechanism was incorporated that synthesises speech with a degree of lexical variability.
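The template mechanism can be illustrated with a minimal sketch: a response template with alternative word lists is expanded into varied surface forms before synthesis. The template syntax and slot names below are invented; the thesis's actual format is not specified here:

```python
# Hypothetical sketch of template-based lexical variability: each {slot}
# in a response template is replaced by a randomly chosen alternative,
# so repeated responses do not sound identical. Syntax is illustrative.

import random

def expand(template, alternatives, rng):
    """Replace each {slot} with a randomly chosen alternative."""
    out = template
    for slot, options in alternatives.items():
        out = out.replace("{" + slot + "}", rng.choice(options))
    return out

template = "{greeting}, I {verb} your request."
alternatives = {
    "greeting": ["Hello", "Hi", "Good morning"],
    "verb": ["understood", "got", "registered"],
}
rng = random.Random(0)  # fixed seed so the example is reproducible
print(expand(template, alternatives, rng))
```

Passing the expanded string to whichever synthesis engine is active keeps the variability mechanism independent of the engine, matching the modular design described above.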
Dialog management was the next challenge. The existing paradigms were analysed, and a manager based on information slots was chosen; it was then extended and modified to adapt better to the user (through profiles) and to hold some world knowledge (through predefined files). In parallel, the multimodal fusion module was developed, which abstracts multimodality away from the dialog manager, i.e. hides from it the channels through which a communicative message arrives. This module results from adapting the theory of communicative acts in human-human interaction to our interaction system: it packages the sensory information emitted by the RDS sensory modules (following a communicative-act detection algorithm developed for this work) and delivers it to the dialog manager at each turn of the dialog. To strengthen multimodality, new input modes were added: a user localisation system that identifies and locates the users around the robot by analysing several information sources, including sound; management of the robot's and the user's emotions, where the robot's emotion is generated by an external decision-making module while the user's emotion is perceived through acoustic analysis of the voice and of facial expressions; a radio-frequency tag reader; and a written-text reader. Finally, new expressive output modes were developed, notably the real-time generation of non-verbal sounds, the ability to sing, and certain "engagement" gestures that help make the interaction more natural: looking at the user, nodding and shaking the head, etc.
Programa Oficial de Doctorado en Ingeniería Eléctrica, Electrónica y Automática. Presidente: Carlos Balaguer Bernaldo de Quirós.- Vocal: Antonio Barrientos Cru
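The slot-based ("information gap") dialog management paradigm described above can be sketched minimally: the dialog goal defines required slots, each (possibly multimodal) communicative act fills some of them, and the manager asks for the first gap still open. Slot names and the action strings are invented for illustration:

```python
# Hypothetical sketch of a slot-filling dialog manager: required slots are
# filled from incoming communicative acts, and the manager requests the
# first missing slot until all are filled. Names are illustrative only.

class SlotDialogManager:
    def __init__(self, required_slots):
        self.slots = {name: None for name in required_slots}

    def update(self, turn_info):
        """Fill slots from one (possibly multimodal) communicative act."""
        for name, value in turn_info.items():
            if name in self.slots:
                self.slots[name] = value

    def next_action(self):
        """Ask for the first missing slot, or finish when all are filled."""
        for name, value in self.slots.items():
            if value is None:
                return "ask:" + name
        return "done"

dm = SlotDialogManager(["game", "player_name"])
print(dm.next_action())        # ask:game
dm.update({"game": "bingo"})
print(dm.next_action())        # ask:player_name
dm.update({"player_name": "Ana"})
print(dm.next_action())        # done
```

Because `update` accepts any key-value dictionary, a fusion module like the one described above can feed it acts detected on any input channel, keeping the manager channel-agnostic.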