
    Neural Conversation Generation with Auxiliary Emotional Supervised Models

    An important aspect of developing dialogue agents is endowing a conversation system with emotion perception and interaction. Most existing emotion dialogue models adapt and extend poorly to different scenarios because they either require a specified emotion category or rely on a fixed emotional dictionary. To overcome these limitations, we propose neural conversation generation with an auxiliary emotional supervised model (nCG-ESM), comprising a sequence-to-sequence (Seq2Seq) generation model and an emotional classifier used as an auxiliary model. The emotional classifier was trained to predict the emotion distributions of the dialogues, which were then used as emotion supervision signals to guide the generation model toward diverse emotional responses. The proposed nCG-ESM is flexible enough to generate emotionally diverse responses, with specified or unspecified emotions, and can be adapted and extended to different scenarios. We conducted extensive experiments on a popular dataset of Weibo post-response pairs. Experimental results showed that the proposed model produced more diverse, appropriate, and emotionally rich responses, yielding substantial gains in diversity scores and human evaluations. Peer reviewed

    Answering questions about archived, annotated meetings

    Retrieving information from archived meetings is a new domain of information retrieval that has received increasing attention in the past few years. Search in spontaneous spoken conversations has been recognized as more difficult than text-based document retrieval because meeting discussions contain two levels of information: the content itself, i.e. what topics are discussed, and the argumentation process, i.e. what conflicts are resolved and what decisions are made. To capture the richness of information in meetings, current research focuses on recording meetings in Smart-Rooms, transcribing meeting discussions into text, and annotating them with higher-level semantic structures to allow efficient access to the data. However, it is not yet clear what type of user interface is best suited for searching and browsing such archived, annotated meetings. Content-based retrieval with keyword search is too naive and does not take into account the semantic annotations on the data. The objective of this thesis is to assess the feasibility and usefulness of a natural language interface to meeting archives that allows users to ask complex questions about meetings and retrieve episodes of meeting discussions based on semantic annotations. The particular issues that we address are: the need for argumentative annotation to answer questions about meetings; the linguistic and domain-specific natural language understanding techniques required to interpret such questions; and the use of visual overviews of meeting annotations to guide users in formulating questions. To meet these objectives, we have annotated meetings with argumentative structure and built a prototype of a natural language understanding (NLU) engine that interprets questions based on those annotations. Further, we have performed two sets of user experiments to study what questions users ask when faced with a natural language interface to annotated meeting archives.
For this, we used a simulation method called Wizard of Oz, which lets users express questions in their own terms without being influenced by limitations in speech recognition technology. Our experimental results show that it is technically feasible to annotate meetings and implement a deep-linguistic NLU engine for questions about meetings, but in practice users do not consistently take advantage of these features. Instead, they often search for keywords in meetings. When visual overviews of the available annotations are provided, users refer to those annotations in their questions, but the questions themselves remain simple. Users search with a breadth-first approach, asking several questions in sequence instead of a single complex question. We conclude that natural language interfaces to meeting archives are useful, but that more experimental work is needed to find ways to encourage users to take advantage of the expressive power of natural language when asking questions about meetings.

    Physiological Sensing for Affective Computing

    This thesis addresses two aspects related to enabling systems to recognize the affective state of people and respond sensibly to it. First, the issue of representing affective states and unambiguously assigning physiological measurements to them is addressed by suggesting a new approach based on the dimensional emotion model of valence and arousal. Second, the issue of sensing affect-related physiological data is addressed by suggesting a concept for physiological sensor systems that meet the requirements of adaptive, user-centred systems. This thesis develops a concept for the unambiguous assignment of physiological measurement data to emotional states, avoiding the problems of classical approaches. The thesis also addresses the acquisition of emotion-related physiological parameters. It presents a concept for sensor systems that allows relevant physiological parameters to be captured reliably without greatly impairing the user. The focus is on designing the system to be suitable for everyday use.

    Designing Embodied Interactive Software Agents for E-Learning: Principles, Components, and Roles

    Embodied interactive software agents are complex autonomous, adaptive, and social software systems with a digital embodiment that enables them to act on and react to other entities (users, objects, and other agents) in their environment through bodily actions, which include the use of verbal and non-verbal communicative behaviors in face-to-face interactions with the user. These agents have been developed for various roles in different application domains, in which they perform tasks that have been assigned to them by their developers or delegated to them by their users or by other agents. In computer-assisted learning, embodied interactive pedagogical software agents have the general task of promoting human learning by working with students (and other agents) in computer-based learning environments, among them e-learning platforms based on Internet technologies, such as the Virtual Linguistics Campus (www.linguistics-online.com). In these environments, pedagogical agents provide contextualized, qualified, personalized, and timely assistance, cooperation, instruction, motivation, and services for both individual learners and groups of learners. This thesis develops a comprehensive, multidisciplinary, and user-oriented view of the design of embodied interactive pedagogical software agents, which integrates theoretical and practical insights from various academic and other fields. The research intends to contribute to the scientific understanding of the issues, methods, theories, and technologies that are involved in the design, implementation, and evaluation of embodied interactive software agents for different roles in e-learning and other areas.
For developers, the thesis provides sixteen basic principles (Added Value, Perceptible Qualities, Balanced Design, Coherence, Consistency, Completeness, Comprehensibility, Individuality, Variability, Communicative Ability, Modularity, Teamwork, Participatory Design, Role Awareness, Cultural Awareness, and Relationship Building) plus a large number of specific guidelines for the design of embodied interactive software agents and their components. Furthermore, it offers critical reviews of theories, concepts, approaches, and technologies from different areas and disciplines that are relevant to agent design. Finally, it discusses three pedagogical agent roles (virtual native speaker, coach, and peer) in the scenario of the linguistic fieldwork classes on the Virtual Linguistics Campus and presents detailed considerations for the design of an agent for one of these roles (the virtual native speaker).

    Processus cérébraux adaptés aux systèmes tutoriels intelligents

    The learner module is one of the most important components of an Intelligent Tutoring System (ITS). The learner model continues to be extended, but despite the definition of a cognitive profile and the integration of an emotional profile, it is not yet exhaustive. Many physiological sensors have been used to refine the recognition of the learner's cognitive and emotional states, but using all of these sensors at once is cumbersome for the learner. In addition, they are not always suited to learners with reduced capacities. Moreover, most of the pedagogical strategies executed by the tutor module are not designed around dynamic, real-time data collection, which reduces their effectiveness. The objective of our research is to explore the brain's electrical activity and use it as a new communication channel between the ITS and the learner. To this end, we propose to design, implement, and evaluate the multi-agent system NORA. NORA's agents interpret the learner's brain electrical activity and react to it, making it possible to influence that activity for better learning. NORA thus enriches the learner module with a brain profile and the tutor module with several new, effective neuropedagogical strategies. Integrating NORA into an ITS gives rise to a new generation of tutoring systems: Brain-Sensitive Intelligent Tutoring Systems (BS-ITS; STICS in French). These systems are intended to help a larger number of learners interact with the computer in order to manage their emotions, including stress and anxiety, to maintain concentration, attention, and interest, and to maximize the conditions favorable to learning.

    The Perception of Emotion from Acoustic Cues in Natural Speech

    Knowledge of human perception of emotional speech is imperative for the development of emotion-in-speech recognition systems and emotional speech synthesis. Given the growing trend towards research on spontaneous, real-life data, the aim of the present thesis is to examine human perception of emotion in naturalistic speech. Although there are many available emotional speech corpora, most contain simulated expressions. Therefore, there remains a compelling need to obtain naturalistic speech corpora that are appropriate and freely available for research. In that regard, our initial aim was to acquire suitable naturalistic material and examine its emotional content based on listener perceptions. A web-based listening tool was developed to accumulate ratings from large-scale listening groups. The emotional content present in the speech material was demonstrated by performing perception tests on conveyed levels of Activation and Evaluation. As a result, labels were determined that signified the emotional content and thus contribute to the construction of a naturalistic emotional speech corpus. In line with the literature, the ratings obtained from the perception tests suggested that Evaluation (or hedonic valence) is not identified as reliably as Activation is. Emotional valence can be conveyed through both semantic and prosodic information, for which the meaning of one may serve to facilitate, modify, or conflict with the meaning of the other, particularly with naturalistic speech. The subsequent experiments aimed to investigate this concept by comparing ratings from perception tests of non-verbal speech with those of verbal speech. The method used to render non-verbal speech was low-pass filtering, and for this, suitable filtering conditions were determined by carrying out preliminary perception tests. The results suggested that non-verbal naturalistic speech provides sufficiently discernible levels of Activation and Evaluation.
It appears that the perception of Activation and Evaluation is affected by low-pass filtering, but that the effect is relatively small. Moreover, the results suggest that there is a similar trend in agreement levels between verbal and non-verbal speech. To date, it remains difficult to determine unique acoustical patterns for the hedonic valence of emotion, which may be due to inadequate labels or the incorrect selection of acoustic parameters. This study has implications for the labelling of emotional speech data and the determination of salient acoustic correlates of emotion.