16 research outputs found

    Perception of Blended Emotions: From Video Corpus to Expressive Agent

    Abstract. Real-life emotions are often blended and involve several simultaneous, superposed, or masked emotions. This paper reports on a study of the perception of multimodal emotional behaviors in Embodied Conversational Agents. The experimental study aims to evaluate whether people properly detect the signs of emotions in different modalities (speech, facial expressions, gestures) when they appear superposed or masked. We compared the perception of emotional behaviors annotated in a corpus of TV interviews and replayed by an expressive agent at different levels of abstraction. The results provide insights into the use of such protocols for studying the effect of various models and modalities on the perception of complex emotions.

    Synthesis of listener vocalizations: towards interactive speech synthesis

    Spoken and multi-modal dialogue systems are starting to use listener vocalizations, such as uh-huh and mm-hm, for natural interaction. Generation of listener vocalizations is one of the major objectives of emotionally colored conversational speech synthesis. Success in this endeavor depends on the answers to three questions: Where to synthesize a listener vocalization? What meaning should be conveyed through the synthesized vocalization? And how to realize an appropriate listener vocalization with the intended meaning? This thesis addresses the last question. The investigation starts by proposing a three-stage approach: (i) data collection, (ii) annotation, and (iii) realization. The first stage presents a method to collect natural listener vocalizations from German and British English professional actors in a recording studio. In the second stage, we explore a methodology for annotating listener vocalizations in terms of both meaning and behavior (form). The third stage proposes a realization strategy that uses unit selection and signal modification techniques to generate appropriate listener vocalizations upon user requests. Finally, we evaluate the naturalness and appropriateness of synthesized vocalizations using perception studies. The work is implemented in the open-source MARY text-to-speech framework and is integrated into the SEMAINE project's Sensitive Artificial Listener (SAL) demonstrator.
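
The selection step of the realization stage can be sketched roughly as follows; the corpus entries, meaning labels, and cost function are illustrative assumptions, not the actual MARY TTS implementation:

```python
# Rough sketch of meaning-based unit selection for listener
# vocalizations. Corpus entries, meaning labels, and the cost
# function are invented for illustration (not the MARY TTS API).
from dataclasses import dataclass


@dataclass
class VocalizationUnit:
    form: str            # annotated behaviour, e.g. "mm-hm"
    meanings: frozenset  # annotated meanings of this recording


def select_unit(corpus, requested_meanings):
    """Pick the unit whose annotated meanings best cover the request."""
    def target_cost(unit):
        # Number of requested meanings the candidate fails to convey.
        return len(requested_meanings - unit.meanings)
    return min(corpus, key=target_cost)


corpus = [
    VocalizationUnit("uh-huh", frozenset({"acknowledgement"})),
    VocalizationUnit("mm-hm", frozenset({"acknowledgement", "interest"})),
    VocalizationUnit("oh", frozenset({"surprise"})),
]
best = select_unit(corpus, {"acknowledgement", "interest"})
print(best.form)  # mm-hm
```

In the thesis, the selected unit would additionally pass through signal modification to better fit the intended meaning; that step is omitted in this sketch.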

    An architecture for emotional facial expressions as social signals


    Collaborative learning with affective artificial study companions in a virtual learning environment

    This research has been carried out in conjunction with Chapeltown and Harehills Assisted Learning Computer School (CHALCS) and local schools. CHALCS is an 'out-of-hours' school in a deprived inner-city community where unemployment is high and many children are failing to meet their educational potential. As the name implies CHALCS provides students with access to computers to support their learning. CHALCS relies on many volunteer tutors and specialist tutors are in short supply. This is especially true for subjects such as Advanced Level Physics with low numbers of students. This research aimed to investigate the feasibility of providing online study skills support to pupils at CHALCS and a local school. Research suggests that collaborative learning that prompts students to explain and justify their understanding can encourage deeper learning. As a potentially effective way of motivating deeper learning from hypertext course notes in a Virtual Learning Environment (VLE), this research investigates the feasibility of designing an artificial Agent capable of collaborating with the learner to jointly construct summary notes. Hypertext course notes covering a portion of the Advanced Level Physics curriculum were designed and uploaded into a WebCT based VLE. A specialist tutor validated the content of the course notes before the ease of use of the VLE was tested with target students. A study was then conducted to develop a model of the kinds of help students required in writing summary notes from the course-notes. Based on the derived process model of summarisation and an analysis of the content structure of the course notes, strategies for summarising the text were devised. An Animated Pedagogical Agent was designed incorporating these strategies. Two versions of the agent with opposing 'Affectations' (giving the appearance of different characters) were evaluated with users. It was therefore possible to test which artificial 'character' students preferred. 
From the evaluation study, conclusions are drawn concerning the effect of the two opposing characterisations on student perceptions of the agent and the degree to which it was helpful as a learning companion. Some recommendations for future work are then made.

    Ubiquitous User Modeling

    More and more interactions take place between humans and mobile or connected IT systems in daily life. This offers a great opportunity, especially for user modeling, to achieve better adaptation through ongoing evaluation of user behavior. This work develops a complete framework to realize the newly defined concept of ubiquitous user modeling. The developed tools cover methods for the uniform exchange and the semantic integration of partial user models. They also account for the extended needs for privacy and the right of every human to introspection and control of their collected data. The SITUATIONALSTATEMENTS and the exchange language USERML have been developed on the syntactic level, while the general user model ontology GUMO and the UBISWORLD ontology have been developed on the semantic level. A multilevel conflict resolution method, which handles the problem of contradictory statements, has been implemented together with a web-based user model service, such that the practical applicability and the scalability of this approach can be demonstrated.
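
A multilevel resolution of contradictory statements might operate along the following lines; the statement fields and the two resolution levels here are assumptions for this sketch, not the actual UserML/GUMO schema:

```python
# Illustrative sketch of multilevel conflict resolution over
# contradictory user-model statements. The statement fields and
# resolution levels are assumptions, not the UserML/GUMO schema.
from dataclasses import dataclass


@dataclass
class Statement:
    predicate: str      # e.g. "hasInterest"
    value: str
    confidence: float   # producer's certainty in [0, 1]
    timestamp: int      # creation time; larger means newer


def resolve(statements):
    """Resolve conflicting statements about the same predicate.

    Level 1: keep only the most recent statements.
    Level 2: among those, keep the highest-confidence one.
    """
    newest = max(s.timestamp for s in statements)
    recent = [s for s in statements if s.timestamp == newest]
    return max(recent, key=lambda s: s.confidence)


conflicting = [
    Statement("hasInterest", "jazz", confidence=0.9, timestamp=1),
    Statement("hasInterest", "rock", confidence=0.6, timestamp=2),
    Statement("hasInterest", "pop", confidence=0.8, timestamp=2),
]
print(resolve(conflicting).value)  # pop
```

Recency before confidence is just one plausible ordering of the levels; a real deployment could add further levels, e.g. trust in the producing application.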

    Affective Computing

    This book provides an overview of state-of-the-art research in Affective Computing. It presents new ideas, original results, and practical experiences in this increasingly important research field. The book consists of 23 chapters categorized into four sections. Since one of the most important means of human communication is facial expression, the first section of this book (Chapters 1 to 7) presents research on the synthesis and recognition of facial expressions. Given that we use not only the face but also body movements to express ourselves, the second section (Chapters 8 to 11) presents research on the perception and generation of emotional expressions using full-body motions. The third section of the book (Chapters 12 to 16) presents computational models of emotion, as well as findings from neuroscience research. The last section of the book (Chapters 17 to 22) presents applications related to affective computing.

    Synthesising prosody with insufficient context

    Prosody is a key component in human spoken communication, signalling emotion, attitude, information structure, intention, and other communicative functions through perceived variation in intonation, loudness, timing, and voice quality. However, the prosody in text-to-speech (TTS) systems is often monotonous and adds no additional meaning to the text. Synthesising prosody is difficult for several reasons: I focus on three challenges. First, prosody is embedded in the speech signal, making it hard to model with machine learning. Second, there is no clear orthography for prosody, meaning it is underspecified in the input text and difficult to control directly. Third, and most importantly, prosody is determined by the context of a speech act, which TTS systems do not, and never will, have complete access to. Without the context, we cannot say whether prosody is appropriate or inappropriate. Context is wide-ranging, but state-of-the-art TTS acoustic models only have access to phonetic information and limited structural information. Unfortunately, most context is either difficult, expensive, or impossible to collect. Thus, fully specified prosodic context will never exist. Given that there is insufficient context, prosody synthesis is a one-to-many generative task: it necessitates the ability to produce multiple renditions. To provide this ability, I propose methods for prosody control in TTS, using either explicit prosody features, such as F0 and duration, or learnt prosody representations disentangled from the acoustics. I demonstrate that without control of the prosodic variability in speech, TTS will produce average prosody, i.e. flat and monotonous prosody. This thesis explores different options for operating these control mechanisms. Random sampling of a learnt distribution of prosody produces more varied and realistic prosody. Alternatively, a human-in-the-loop can operate the control mechanism, using their intuition to choose appropriate prosody. To improve the effectiveness of human-driven control, I design two novel approaches to make control mechanisms more human-interpretable. Finally, it is important to take advantage of additional context as it becomes available. I present a novel framework that can incorporate arbitrary additional context, and demonstrate my state-of-the-art context-aware model of prosody using a pre-trained and fine-tuned language model. This thesis demonstrates empirically that appropriate prosody can be synthesised with insufficient context by accounting for unexplained prosodic variation.
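
The contrast between average prosody and sampled renditions can be illustrated with a toy example; the per-syllable F0 distributions below are invented numbers, not from the thesis:

```python
# Toy illustration of the "average prosody" effect: predicting only
# the mean of a learnt F0 distribution yields a fixed, flattened
# contour, while sampling restores variation. All numbers are
# invented for illustration.
import random

# Hypothetical per-syllable F0 distributions (mean, std dev in Hz)
# that a model might have learnt for one utterance.
learnt_f0 = [(120, 25), (180, 40), (140, 20), (200, 35)]


def average_prosody(dists):
    """Deterministic prediction: always the mean -> monotonous contour."""
    return [mu for mu, _ in dists]


def sampled_prosody(dists, rng):
    """One-to-many prediction: draw one rendition from each distribution."""
    return [rng.gauss(mu, sigma) for mu, sigma in dists]


rng = random.Random(0)
flat = average_prosody(learnt_f0)
varied = sampled_prosody(learnt_f0, rng)
print(flat)            # the same contour on every call
print(varied != flat)  # each sample is a distinct rendition
```

Sampling is only one way to operate the control mechanism; as the abstract notes, a human-in-the-loop could instead pick values directly.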

    Model driven design and data integration in semantic web information systems

    The Web is quickly evolving in many ways. It has evolved from a Web of documents into a Web of applications, in which a growing number of designers offer new and interactive Web applications to people all over the world. However, application design and implementation remain complex, error-prone, and laborious. In parallel, there is also an evolution from a Web of documents into a Web of 'knowledge', as a growing number of data owners are sharing their data sources with a growing audience. This brings potential for new applications of these data sources, including scenarios in which these datasets are reused and integrated with other existing and new data sources. However, the heterogeneity of these data sources in syntax, semantics, and structure represents a great challenge for application designers. The Semantic Web is a collection of standards and technologies that offer solutions for at least the syntactic and some of the structural issues. It offers semantic freedom and flexibility, but this leaves open the issue of semantic interoperability. In this thesis we present Hera-S, an evolution of the Model Driven Web Engineering (MDWE) method Hera. MDWE methods allow designers to create data-centric applications using models instead of programming. Hera-S especially targets Semantic Web sources and provides a flexible method for designing personalized adaptive Web applications. Hera-S defines several models that together define the target Web application. Moreover, we implemented a framework called Hydragen, which is able to execute the Hera-S models to run the desired Web application. Hera-S' core is the Application Model (AM), in which the main logic of the application is defined, i.e. the groups of data elements that form logical units or subunits, the personalization conditions, and the relationships between the units. Hera-S also uses a so-called Domain Model (DM) that describes the content and its structure. However, this DM is not Hera-S specific; instead, any Semantic Web source representation can serve as the DM, as long as its content can be queried with the standardized Semantic Web query language SPARQL. The same holds for the User Model (UM). The UM can be used for personalization conditions, but also as a source of user-related content if necessary. In fact, the difference between DM and UM is conceptual, as their implementation within Hydragen is the same. Hera-S also defines a Presentation Model (PM), which defines presentation details of elements such as order and style. In order to help designers build their Web applications, we have introduced a toolset, Hera Studio, which allows the different models to be built graphically. Hera Studio also provides additional functionality such as model checking and deployment of the models in Hydragen. Both Hera-S and its implementation Hydragen are designed to be flexible regarding the use of models. To achieve this, Hydragen is a stateless engine that queries the models for relevant information at every page request. This allows the models and data to be changed in the datastore at runtime. We show that one way to exploit this flexibility is by applying aspect-orientation to the AM. Aspect-orientation allows us to dynamically inject functionality that pervades the entire application. Another way to exploit Hera-S' flexibility is in reusing specialized components, e.g. for presentation generation. We present a configuration of Hydragen in which we replace our native presentation generation functionality with the AMACONT engine. AMACONT provides more extensive multi-level presentation generation and adaptation capabilities, as well as aspect-orientation and a form of semantics-based adaptation. Hera-S was designed to allow the (re-)use of any (Semantic) Web data source. It even opens up the possibility of data integration at the back end, by using an extensible storage layer in our database of choice, Sesame. However, even though this is theoretically possible, it still leaves much of the actual data integration issue unaddressed. As this is a recurring issue in many domains, and a broader challenge than Hera-S design alone, we decided to look at this issue in isolation. We present a framework called Relco, which provides a language to express data transformation operations as well as a collection of techniques that can be used to (semi-)automatically find relationships between concepts in different ontologies. This is done with a combination of syntactic, semantic, and collaboration techniques, which together provide strong clues as to which concepts are most likely related. To prove the applicability of Relco, we explore five application scenarios in different domains for which data integration is a central aspect. These include a cultural heritage portal, Explorer, for which data from several data sources was integrated and made available via a map view, a timeline, and a graph view. Explorer also allows users to provide metadata for objects via a tagging mechanism. Another application is SenSee: an electronic TV guide and recommender. TV-guide data was integrated and enriched with semantically structured data from several sources, and recommendations are computed by exploiting the underlying semantic structure. ViTa was a project in which several techniques for tagging and searching educational videos were evaluated, including scenarios in which user tags are related to an ontology, or to other tags, using the Relco framework. The MobiLife project targeted the facilitation of a new generation of mobile applications that would use context-based personalization. This can be done using a context-based user profiling platform that can also be used for user model data exchange between mobile applications using technologies like Relco. The final application scenario is from the GRAPPLE project, which targeted the integration of adaptive technology into current learning management systems. A large part of this integration is achieved by using a user modeling component framework in which any application can store user model information, but which can also be used for the exchange of user model data.
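
Combining a syntactic clue with a semantic one for concept matching could be sketched as follows; the labels, synonym table, similarity measure, and threshold are illustrative assumptions, not part of the actual Relco framework:

```python
# Sketch of (semi-)automatic concept matching between two ontologies,
# combining a syntactic string-similarity clue with a simple synonym
# lookup. Labels, synonym table, and threshold are illustrative
# assumptions, not the actual Relco implementation.
from difflib import SequenceMatcher

SYNONYMS = {("movie", "film")}  # assumed semantic clue


def similarity(a, b):
    a, b = a.lower(), b.lower()
    if a == b or (a, b) in SYNONYMS or (b, a) in SYNONYMS:
        return 1.0
    # Syntactic clue: normalized longest-common-subsequence ratio.
    return SequenceMatcher(None, a, b).ratio()


def match_concepts(source, target, threshold=0.7):
    """Suggest likely related concept pairs for human confirmation."""
    return [
        (s, t, round(similarity(s, t), 2))
        for s in source
        for t in target
        if similarity(s, t) >= threshold
    ]


source = ["Movie", "Director", "Actor"]
target = ["Film", "FilmDirector", "Genre"]
print(match_concepts(source, target))
```

Because the clues are heuristic, a real matcher would present the scored pairs to a user for confirmation rather than commit to them automatically, which is in line with the semi-automatic setting described above.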

    Proceedings from NordiCHI 2008 Workshop Sunday October 19, 2008

    This paper raises themes that are seen as some of the challenges facing the emerging practice and research field of Human Work Interaction Design. The paper takes its point of departure from the discussions and writings that have been dominant within the IFIP Working Group on Human Work Interaction Design (HWID) over the last two and a half years since the commencement of this Working Group. The paper thus provides an introduction to the theory and empirical evidence that lie behind the combination of empirical work studies and interaction design. It also recommends key topics for future research in Human Work Interaction Design.