Designing Embodied Interactive Software Agents for E-Learning: Principles, Components, and Roles
Embodied interactive software agents are complex autonomous, adaptive, and social software systems with a digital embodiment that enables them to act on and react to other entities (users, objects, and other agents) in their environment through bodily actions, which include the use of verbal and non-verbal communicative behaviors in face-to-face interactions with the user. These agents have been developed for various roles in different application domains, in which they perform tasks that have been assigned to them by their developers or delegated to them by their users or by other agents. In computer-assisted learning, embodied interactive pedagogical software agents have the general task of promoting human learning by working with students (and other agents) in computer-based learning environments, among them e-learning platforms based on Internet technologies, such as the Virtual Linguistics Campus (www.linguistics-online.com). In these environments, pedagogical agents provide contextualized, qualified, personalized, and timely assistance, cooperation, instruction, motivation, and services for both individual learners and groups of learners.
This thesis develops a comprehensive, multidisciplinary, and user-oriented view of the design of embodied interactive pedagogical software agents, which integrates theoretical and practical insights from various academic and other fields. The research intends to contribute to the scientific understanding of issues, methods, theories, and technologies that are involved in the design, implementation, and evaluation of embodied interactive software agents for different roles in e-learning and other areas. For developers, the thesis provides sixteen basic principles (Added Value, Perceptible Qualities, Balanced Design, Coherence, Consistency, Completeness, Comprehensibility, Individuality, Variability, Communicative Ability, Modularity, Teamwork, Participatory Design, Role Awareness, Cultural Awareness, and Relationship Building) plus a large number of specific guidelines for the design of embodied interactive software agents and their components. Furthermore, it offers critical reviews of theories, concepts, approaches, and technologies from different areas and disciplines that are relevant to agent design. Finally, it discusses three pedagogical agent roles (virtual native speaker, coach, and peer) in the scenario of the linguistic fieldwork classes on the Virtual Linguistics Campus and presents detailed considerations for the design of an agent for one of these roles (the virtual native speaker).
Producing Acoustic-Prosodic Entrainment in a Robotic Learning Companion to Build Learner Rapport
With advances in automatic speech recognition, spoken dialogue systems are assuming increasingly social roles. There is a growing need for these systems to be socially responsive and capable of building rapport with users. In human-human interactions, rapport is critical to patient-doctor communication, conflict resolution, educational interactions, and social engagement. Rapport between people promotes successful collaboration, motivation, and task success. Dialogue systems that can build rapport with their users may produce similar effects, personalizing interactions to create better outcomes.
This dissertation focuses on how dialogue systems can build rapport using acoustic-prosodic entrainment. Acoustic-prosodic entrainment occurs when individuals adapt the acoustic-prosodic features of their speech, such as tone of voice or loudness, to one another over the course of a conversation. Because entrainment is correlated with liking and task success, a dialogue system that entrains may enhance rapport. Entrainment, however, is very challenging to model. People entrain on different features in many ways, and it is unclear how to design entrainment so that it builds rapport. The first goal of this dissertation is to explore how acoustic-prosodic entrainment can be modeled to build rapport.
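As a rough illustration of the idea of adapting a speech feature toward a conversational partner, the following minimal sketch moves a system's feature value (for example, mean pitch in Hz) a fixed fraction of the way toward the user's value on each turn. The function names, the `weight` parameter, and the linear update rule are assumptions made for illustration; the abstract does not specify the entrainment models used in the dissertation.

```python
from typing import List


def entrain(system_value: float, user_value: float, weight: float = 0.3) -> float:
    """Move the system's feature value a fraction `weight` toward the user's value.

    Hypothetical linear update: weight=0 means no adaptation,
    weight=1 means the system copies the user's value outright.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return system_value + weight * (user_value - system_value)


def entrain_over_turns(
    system_start: float, user_values: List[float], weight: float = 0.3
) -> List[float]:
    """Apply the update once per user turn and return the system's trajectory."""
    trajectory = []
    current = system_start
    for user_value in user_values:
        current = entrain(current, user_value, weight)
        trajectory.append(current)
    return trajectory


if __name__ == "__main__":
    # System starts at 200 Hz; the user speaks at a steady 100 Hz.
    print(entrain_over_turns(200.0, [100.0, 100.0, 100.0], weight=0.5))
```

With a steady user value, the system's value converges toward it turn by turn; a real system would apply this kind of adaptation per feature and would have to decide which features to entrain on, which is the open design question the abstract raises.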
Towards this goal, this work presents a series of studies comparing, evaluating, and iterating on the design of entrainment, motivated and informed by human-human dialogue. These models of entrainment are implemented in the dialogue system of a robotic learning companion. Learning companions are educational agents that engage students socially to increase motivation and facilitate learning. As a learning companion’s ability to be socially responsive increases, so do vital learning outcomes. A second goal of this dissertation is to explore the effects of entrainment on concrete outcomes such as learning in interactions with robotic learning companions.
This dissertation results in contributions both technical and theoretical. Technical contributions include a robust and modular dialogue system capable of producing prosodic entrainment and other socially responsive behavior. The system is one of the first of its kind, and the results demonstrate that an entraining, social learning companion can build rapport and increase learning. This dissertation provides support for exploring phenomena like entrainment to enhance factors such as rapport and learning, and provides a platform with which to explore these phenomena in future work.
Dissertation/Thesis: Doctoral Dissertation, Computer Science, 201
Exploring Speech Technologies for Language Learning
The teaching of the pronunciation of any foreign language must encompass both segmental and suprasegmental aspects of speech. In computational terms, the two levels of language learning activities can be decomposed at least into phonemic aspects, which include the correct pronunciation of single phonemes and the co-articulation of phonemes into higher phonological units, and prosodic aspects, which include:
• the correct position of stress at word level;
• the alternation of stressed and unstressed syllables in terms of compensation and vowel reduction;
• the correct position of sentence accent;
• the generation of adequate rhythm from the interleaving of stress, accent, and phonological rules;
• the generation of an adequate intonational pattern for each utterance related to communicative functions.
As the list above suggests, prosody is very important for a student to communicate intelligibly and as close as possible to a native speaker's pronunciation [3]. We also assume that incorrect prosody may prevent communication from taking place, which may be regarded as a strong motivation for making the teaching of prosody an integral part of any language course. From our point of view, it is much more important to stress the achievement of successful communication as the main objective of a second language learner than the overcoming of what has been termed “foreign accent”, which can be deemed a secondary goal. In any case, the two goals are certainly not coincident, even though they may overlap in some cases. We will discuss these matters in the following sections.
All prosodic questions related to “rhythm” will be discussed in the first section of this chapter. In [4] the author argues in favour of prosodic aids, in particular because a strong placement of word stress may impair the listener’s understanding of the word being pronounced. He also argues in favour of acquiring correct timing of phonological units to overcome the impression of “foreign accent” which may ensue from an incorrect distribution of stressed vs. unstressed stretches of linguistic units such as syllables or metric feet. Timing is not to be confused with speaking rate, which need not be forcibly increased to give the impression of good fluency: trying to increase speaking rate may result in lower intelligibility. The question of “foreign accent” is also discussed at length in (Jilka M., 1999). This work is particularly relevant to the intonational features of a second language learner, which we will address in the second section of this chapter. Correcting the Intonational Foreign Accent (henceforth IFA) is an important component of a Prosodic Module for self-learning activities, as categorical aspects of the intonation of the two languages in contact, L1 and L2, are far apart and thus neatly distinguishable. The choice of the two languages in contact is determined mainly by the fact that the distance in prosodic terms between English and Italian is maximal, according to (Ramus, F. and J. Mehler, 1999; Ramus F., et al., 1999).
Promoting Learning by Inducing and Scaffolding Cognitive Disequilibrium and Confusion through System Feedback
Learners frequently experience uncertainty about how to proceed during learning. These experiences cause learners to enter a state of cognitive disequilibrium and its affiliated affective state of confusion. Cognitive disequilibrium and confusion have been found to occur frequently during complex learning and to provide opportunities for deeper learning. In the current thesis, a learning environment that induces confusion was investigated. In the environment, learners engaged in a dialogue on scientific reasoning with an animated pedagogical agent. Confusion was induced through false feedback provided by the tutor agent (e.g., when learners responded correctly and were told their response was incorrect). Self-reports of confusion during the training session indicated that false feedback was an effective method for inducing confusion. False feedback was also found to increase learners’ ability to apply this knowledge to novel situations, under certain conditions. Implications for the design of learning environments are also discussed.
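The false-feedback manipulation described above can be pictured with a very small sketch: the tutor reports the opposite of the learner's actual correctness whenever confusion is to be induced. The function name, its parameters, and the returned strings are hypothetical; the abstract does not describe how the learning environment decides when to give false feedback.

```python
def tutor_feedback(response_is_correct: bool, give_false_feedback: bool) -> str:
    """Return the feedback label the tutor agent reports to the learner.

    Hypothetical rule: when `give_false_feedback` is True, the reported
    correctness is the opposite of the actual correctness.
    """
    # Inequality on booleans acts as XOR: the flag flips the reported value.
    reported_correct = response_is_correct != give_false_feedback
    return "correct" if reported_correct else "incorrect"


if __name__ == "__main__":
    # The case named in the abstract: a correct response reported as incorrect.
    print(tutor_feedback(response_is_correct=True, give_false_feedback=True))
```

Keeping the flag separate from the correctness check makes it easy to log, for each turn, whether the feedback was accurate, which matters when self-reported confusion is later compared across conditions.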
Cognitive architecture of multimodal multidimensional dialogue management
Numerous studies show that participants in real-life dialogues become involved in rather dynamic, non-sequential interactions. This challenges dialogue system designs based on a reactive interlocutor paradigm and calls for dialogue systems that can be characterised as proactive learners, accomplished multitasking planners and adaptive decision makers. Addressing this call, the thesis brings an innovative integration of cognitive models into human-computer dialogue systems. This work utilises recent advances in Instance-Based Learning of Theory of Mind skills and the established Cognitive Task Analysis and ACT-R models. Cognitive Task Agents, producing detailed simulation of human learning, prediction, adaptation and decision making, are integrated in the multi-agent Dialogue Manager. The manager operates on the multidimensional information state enriched with representations based on domain- and modality-specific semantics, and performs context-driven interpretation and generation of dialogue acts. A flexible technical framework for modular distributed dialogue system integration is designed and tested. The implemented multitasking Interactive Cognitive Tutor is evaluated as showing human-like proactive and adaptive behaviour in setting goals, choosing appropriate strategies and monitoring processes across contexts, and as encouraging the user to exhibit similar metacognitive competences.
A Satisfaction-based Model for Affect Recognition from Conversational Features in Spoken Dialog Systems
Detecting user affect automatically during real-time conversation is the main challenge towards our greater aim of infusing social intelligence into a natural-language mixed-initiative High-Fidelity (Hi-Fi) audio control spoken dialog agent. In recent years, studies on affect detection from voice have moved on to using realistic, non-acted data, which is subtler. However, subtler emotions are more challenging to perceive, as demonstrated in tasks such as labelling and machine prediction. This paper attempts to address part of this challenge by considering the role of user satisfaction ratings and conversational/dialog features in discriminating contentment and frustration, two types of emotions that are known to be prevalent within spoken human-computer interaction. However, given the laboratory constraints, users might be positively biased when rating the system, which makes the reliability of the satisfaction data questionable. Machine learning experiments were conducted on two datasets, users and annotators, which were then compared in order to assess the reliability of these datasets. Our results indicated that standard classifiers were significantly more successful in discriminating the abovementioned emotions and their intensities (reflected by user satisfaction ratings) from annotator data than from user data. These results corroborated two points. First, satisfaction data could be used directly as an alternative target variable to model affect, and they could be predicted exclusively by dialog features. Second, this held only when predicting the abovementioned emotions from annotator data, suggesting that user bias does exist in a laboratory-led evaluation.
Modeling Learner Emotions and Implicit Interventions for Intelligent Tutoring Systems
Modeling the user’s experience within Human-Computer Interaction is an important challenge for the design and development of intelligent adaptive systems. In this context, particular attention is given to the user’s emotional reactions, as they decisively influence cognitive abilities such as perception and decision-making. Emotion modeling is particularly relevant for Emotionally Intelligent Tutoring Systems (EITS). These systems seek to identify the learner’s emotions during tutoring sessions, and to optimize the learner’s interaction experience using a variety of intervention strategies.
This thesis aims to improve current methods of emotion modeling, as well as the emotional strategies that are presently used within EITS to influence the learner’s emotions. More precisely, our first objective was to propose a new method to recognize the learner’s emotional state, using different sources of information that allow emotions to be measured accurately, whilst taking account of individual characteristics that can have an impact on the manifestation of emotions. To that end, we have developed a multimodal approach combining several physiological measures (brain activity, galvanic responses and heart rate) with individual variables, to detect a specific emotion that is frequently observed within computer tutoring, namely uncertainty. First, we identified the key physiological indicators that are associated with this state, and the individual characteristics that contribute to its manifestation. Then, we developed predictive models to automatically detect this state from the analyzed variables, through the training of machine learning algorithms.
Our second objective was to propose a unified approach to simultaneously recognize a combination of several emotions, and to explicitly evaluate the impact of these emotions on the learner’s interaction experience. For this purpose, we have developed a hierarchical, probabilistic and dynamic framework, which allows one to track the learner’s emotional changes over time, and to automatically infer the trend that characterizes the interaction experience, namely flow, stuck or off-task. Flow is an optimal experience: a state in which the learner is completely focused on and involved in the learning activity. The stuck state is a non-optimal trend of the interaction in which the learner has difficulty maintaining focused attention. Finally, off-task behavior is an extremely unfavorable state in which the learner is no longer involved in the learning session. The proposed framework integrates three modalities of diagnostic variables that sense the learner’s experience (physiology, behavior and performance), in conjunction with predictive variables that represent the current context of the interaction and the learner’s personal characteristics. A human-subject study was conducted to validate our approach through an experimental protocol designed to deliberately elicit the three targeted trends during the learners’ interaction with different learning environments.
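To make the notion of probabilistically tracking an interaction trend over time more concrete, the sketch below runs a simple discrete Bayes filter over the three trends named above. It is a deliberately simplified stand-in, not the thesis's hierarchical framework: the single coarse "engagement" observation, the transition table and the likelihood table are all hypothetical values chosen for illustration.

```python
from typing import Dict, List

STATES = ["flow", "stuck", "off_task"]

# Hypothetical probabilities of moving from one trend to another between steps.
TRANSITION: Dict[str, Dict[str, float]] = {
    "flow": {"flow": 0.80, "stuck": 0.15, "off_task": 0.05},
    "stuck": {"flow": 0.20, "stuck": 0.60, "off_task": 0.20},
    "off_task": {"flow": 0.05, "stuck": 0.15, "off_task": 0.80},
}

# Hypothetical likelihood of a coarse engagement reading given each trend.
LIKELIHOOD: Dict[str, Dict[str, float]] = {
    "flow": {"high": 0.70, "medium": 0.25, "low": 0.05},
    "stuck": {"high": 0.20, "medium": 0.60, "low": 0.20},
    "off_task": {"high": 0.05, "medium": 0.25, "low": 0.70},
}


def update(belief: Dict[str, float], observation: str) -> Dict[str, float]:
    """One filter step: predict with TRANSITION, then weight by the observation."""
    predicted = {
        s: sum(belief[prev] * TRANSITION[prev][s] for prev in STATES) for s in STATES
    }
    weighted = {s: predicted[s] * LIKELIHOOD[s][observation] for s in STATES}
    total = sum(weighted.values())
    return {s: weighted[s] / total for s in STATES}


def track(observations: List[str]) -> Dict[str, float]:
    """Start from a uniform belief and fold in each observation in order."""
    belief = {s: 1.0 / len(STATES) for s in STATES}
    for observation in observations:
        belief = update(belief, observation)
    return belief


if __name__ == "__main__":
    final = track(["medium", "low", "low"])
    print(max(final, key=final.get), final)
```

The framework described in the abstract combines several diagnostic modalities with contextual and personal predictive variables; in a sketch like this one, each additional modality would contribute its own likelihood term to the update step.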
Finally, our third objective was to propose new strategies to positively influence the learner’s emotional state, without interrupting the dynamics of the learning session. To this end, we have introduced the concept of implicit emotional strategies: a novel approach to subtly influence the learner’s emotions, in order to improve the learning experience. These strategies use subliminal perception, and more precisely a technique known as affective priming. This technique aims to unconsciously solicit the learner’s emotions through the projection of primes charged with specific affective connotations. We have implemented an implicit emotional strategy using a particular form of affective priming, namely evaluative conditioning, which is designed to unconsciously enhance self-esteem. An experimental study was conducted in order to evaluate the impact of this strategy on the learners’ emotional reactions and performance.