381 research outputs found

    Predicting the confusion level of text excerpts with syntactic, lexical and n-gram features

    Get PDF
    Distance learning, offline presentations (presentations that are not being carried in a live fashion but were instead pre-recorded) and such activities whose main goal is to convey information are getting increasingly relevant with digital media such as Virtual Reality (VR) and Massive Online Open Courses (MOOCs). While MOOCs are a well-established reality in the learning environment, VR is also being used to promote learning in virtual rooms, be it in the academia or in the industry. Oftentimes these methods are based on written scripts that take the learner through the content, making them critical components to these tools. With such an important role, it is important to ensure the efficiency of these scripts. Confusion is a non-basic emotion associated with learning. This process often leads to a cognitive disequilibrium either caused by the content itself or due to the way it is conveyed when it comes to its syntactic and lexical features. We hereby propose a supervised model that can predict the likelihood of confusion an input text excerpt can cause on the learner. To achieve this, we performed syntactic and lexical analyses over 300 text excerpts and collected 5 confusion level classifications (0 – 6) per excerpt from 51 annotators to use their respective means as labels. These examples that compose the dataset were collected from random presentations transcripts across various fields of knowledge. The learning model was trained with this data with the results being included in the body of the paper. This model allows the design of clearer scripts of offline presentations and similar approaches and we expect that it improves the efficiency of these speeches. While this model is applied to this specific case, we hope to pave the way to generalize this approach to other contexts where clearness of text is critical, such as the scripts of MOOCs or academic abstracts.info:eu-repo/semantics/acceptedVersio

    Affective learning: improving engagement and enhancing learning with affect-aware feedback

    Get PDF
    This paper describes the design and ecologically valid evaluation of a learner model that lies at the heart of an intelligent learning environment called iTalk2Learn. A core objective of the learner model is to adapt formative feedback based on students’ affective states. Types of adaptation include what type of formative feedback should be provided and how it should be presented. Two Bayesian networks trained with data gathered in a series of Wizard-of-Oz studies are used for the adaptation process. This paper reports results from a quasi-experimental evaluation, in authentic classroom settings, which compared a version of iTalk2Learn that adapted feedback based on students’ affective states as they were talking aloud with the system (the affect condition) with one that provided feedback based only on the students’ performance (the non-affect condition). Our results suggest that affect-aware support contributes to reducing boredom and off-task behavior, and may have an effect on learning. We discuss the internal and ecological validity of the study, in light of pedagogical considerations that informed the design of the two conditions. Overall, the results of the study have implications both for the design of educational technology and for classroom approaches to teaching, because they highlight the important role that affect-aware modelling plays in the adaptive delivery of formative feedback to support learning

    Towards higher sense of presence: a 3D virtual environment adaptable to confusion and engagement

    Get PDF
    Virtual Reality scenarios where emitters convey information to receptors can be used as a tool for distance learning and to enable virtual visits to company physical headquarters. However, immersive Virtual Reality setups usually require visualization interfaces such as Head-mounted Displays, Powerwalls or CAVE systems, supported by interaction devices (Microsoft Kinect, Wii Motion, among others), that foster natural interaction but are often inaccessible to users. We propose a virtual presentation scenario, supported by a framework, that provides emotion-driven interaction through ubiquitous devices. An experiment with 3 conditions was designed involving: a control condition; a less confusing text script based on its lexical, syntactical, and bigram features; and a third condition where an adaptive lighting system dynamically acted based on the user’s engagement. Results show that users exposed to the less confusing script reported higher sense of presence, albeit without statistical significance. Users from the last condition reported lower sense of presence, which rejects our hypothesis without statistical significance. We theorize that, as the presentation was given orally and the adaptive lighting system impacts the visual channel, this conflict may have overloaded the users’ cognitive capacity and thus reduced available resources to address the presentation content.info:eu-repo/semantics/publishedVersio

    Virtual environments promoting interaction

    Get PDF
    Virtual reality (VR) has been widely researched in the academic environment and is now breaking into the industry. Regular companies do not have access to this technology as a collaboration tool because these solutions usually require specific devices that are not at hand of the common user in offices. There are other collaboration platforms based on video, speech and text, but VR allows users to share the same 3D space. In this 3D space there can be added functionalities or information that in a real-world environment would not be possible, something intrinsic to VR. This dissertation has produced a 3D framework that promotes nonverbal communication. It plays a fundamental role on human interaction and is mostly based on emotion. In the academia, confusion is known to influence learning gains if it is properly managed. We designed a study to evaluate how lexical, syntactic and n-gram features influence perceived confusion and found results (not statistically significant) that point that it is possible to build a machine learning model that can predict the level of confusion based on these features. This model was used to manipulate the script of a given presentation, and user feedback shows a trend that by manipulating these features and theoretically lowering the level of confusion on text not only drops the reported confusion, as it also increases reported sense of presence. Another contribution of this dissertation comes from the intrinsic features of a 3D environment where one can carry actions that in a real world are not possible. We designed an automatic adaption lighting system that reacts to the perceived user’s engagement. This hypothesis was partially refused as the results go against what we hypothesized but do not have statistical significance. Three lines of research may stem from this dissertation. First, there can be more complex features to train the machine learning model such as syntax trees. Also, on an Intelligent Tutoring System this could adjust the avatar’s speech in real-time if fed by a real-time confusion detector. When going for a social scenario, the set of basic emotions is well-adjusted and can enrich them. Facial emotion recognition can extend this effect to the avatar’s body to fuel this synchronization and increase the sense of presence. Finally, we based this dissertation on the premise of using ubiquitous devices, but with the rapid evolution of technology we should consider that new devices will be present on offices. This opens new possibilities for other modalities.A Realidade Virtual (RV) tem sido alvo de investigação extensa na academia e tem vindo a entrar na indústria. Empresas comuns não têm acesso a esta tecnologia como uma ferramenta de colaboração porque estas soluções necessitam de dispositivos específicos que não estão disponíveis para o utilizador comum em escritório. Existem outras plataformas de colaboração baseadas em vídeo, voz e texto, mas a RV permite partilhar o mesmo espaço 3D. Neste espaço podem existir funcionalidades ou informação adicionais que no mundo real não seria possível, algo intrínseco à RV. Esta dissertação produziu uma framework 3D que promove a comunicação não-verbal que tem um papel fundamental na interação humana e é principalmente baseada em emoção. Na academia é sabido que a confusão influencia os ganhos na aprendizagem quando gerida adequadamente. Desenhámos um estudo para avaliar como as características lexicais, sintáticas e n-gramas influenciam a confusão percecionada. Construímos e testámos um modelo de aprendizagem automática que prevê o nível de confusão baseado nestas características, produzindo resultados não estatisticamente significativos que suportam esta hipótese. Este modelo foi usado para manipular o texto de uma apresentação e o feedback dos utilizadores demonstra uma tendência na diminuição do nível de confusão reportada no texto e aumento da sensação de presença. Outra contribuição vem das características intrínsecas de um ambiente 3D onde se podem executar ações que no mundo real não seriam possíveis. Desenhámos um sistema automático de iluminação adaptativa que reage ao engagement percecionado do utilizador. Os resultados não suportam o que hipotetizámos mas não têm significância estatística, pelo que esta hipótese foi parcialmente rejeitada. Três linhas de investigação podem provir desta dissertação. Primeiro, criar características mais complexas para treinar o modelo de aprendizagem, tais como árvores de sintaxe. Além disso, num Intelligent Tutoring System este modelo poderá ajustar o discurso do avatar em tempo real, alimentado por um detetor de confusão. As emoções básicas ajustam-se a um cenário social e podem enriquecê-lo. A emoção expressada facialmente pode estender este efeito ao corpo do avatar para alimentar o sincronismo social e aumentar a sensação de presença. Finalmente, baseámo-nos em dispositivos ubíquos, mas com a rápida evolução da tecnologia, podemos considerar que novos dispositivos irão estar presentes em escritórios. Isto abre possibilidades para novas modalidades

    Recognising Complex Mental States from Naturalistic Human-Computer Interactions

    Get PDF
    New advances in computer vision techniques will revolutionize the way we interact with computers, as they, together with other improvements, will help us build machines that understand us better. The face is the main non-verbal channel for human-human communication and contains valuable information about emotion, mood, and mental state. Affective computing researchers have investigated widely how facial expressions can be used for automatically recognizing affect and mental states. Nowadays, physiological signals can be measured by video-based techniques, which can also be utilised for emotion detection. Physiological signals, are an important indicator of internal feelings, and are more robust against social masking. This thesis focuses on computer vision techniques to detect facial expression and physiological changes for recognizing non-basic and natural emotions during human-computer interaction. It covers all stages of the research process from data acquisition, integration and application. Most previous studies focused on acquiring data from prototypic basic emotions acted out under laboratory conditions. To evaluate the proposed method under more practical conditions, two different scenarios were used for data collection. In the first scenario, a set of controlled stimulus was used to trigger the user’s emotion. The second scenario aimed at capturing more naturalistic emotions that might occur during a writing activity. In the second scenario, the engagement level of the participants with other affective states was the target of the system. For the first time this thesis explores how video-based physiological measures can be used in affect detection. Video-based measuring of physiological signals is a new technique that needs more improvement to be used in practical applications. A machine learning approach is proposed and evaluated to improve the accuracy of heart rate (HR) measurement using an ordinary camera during a naturalistic interaction with computer

    Recognising Complex Mental States from Naturalistic Human-Computer Interactions

    Get PDF
    New advances in computer vision techniques will revolutionize the way we interact with computers, as they, together with other improvements, will help us build machines that understand us better. The face is the main non-verbal channel for human-human communication and contains valuable information about emotion, mood, and mental state. Affective computing researchers have investigated widely how facial expressions can be used for automatically recognizing affect and mental states. Nowadays, physiological signals can be measured by video-based techniques, which can also be utilised for emotion detection. Physiological signals, are an important indicator of internal feelings, and are more robust against social masking. This thesis focuses on computer vision techniques to detect facial expression and physiological changes for recognizing non-basic and natural emotions during human-computer interaction. It covers all stages of the research process from data acquisition, integration and application. Most previous studies focused on acquiring data from prototypic basic emotions acted out under laboratory conditions. To evaluate the proposed method under more practical conditions, two different scenarios were used for data collection. In the first scenario, a set of controlled stimulus was used to trigger the user’s emotion. The second scenario aimed at capturing more naturalistic emotions that might occur during a writing activity. In the second scenario, the engagement level of the participants with other affective states was the target of the system. For the first time this thesis explores how video-based physiological measures can be used in affect detection. Video-based measuring of physiological signals is a new technique that needs more improvement to be used in practical applications. A machine learning approach is proposed and evaluated to improve the accuracy of heart rate (HR) measurement using an ordinary camera during a naturalistic interaction with computer

    Supporting Children’s Metacognition with a Facial Emotion Recognition based Intelligent Tutor System

    Get PDF
    The present study aims to investigate the relationship between emotions experienced during learning and metacognition in typically developing (TD) children and those with autism spectrum disorder (ASD). This will assist us in using machine learning (ML) to develop a facial emotion recognition (FER) based intelligent tutor system (ITS) to support children’s metacognitive monitoring process in order to enhance their learning outcomes. In this paper, we first report the results of our preliminary research, which utilized an ML-based FER algorithm to detect four spontaneous epistemic emotions (i.e., neutral, confused, frustrated, and boredom) and six spontaneous basic emotions (i.e., anger, disgust, fear, happiness, sadness, and surprise). Subsequently, we adapted an application (‘BrainHood’) to create the ‘Meta-BrainHood’, that embedded our proposed ML-based FER algorithm to examine the relationship between facial emotion expressions and metacognitive monitoring performance in TD children and those with ASD. Finally, we outline the future steps in our research, which adopts the outcomes of the first two steps to construct an ITS to improve children’s metacognitive monitoring performance and learning outcomes.Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.Human Information Communication Desig

    Automatic Context-Driven Inference of Engagement in HMI: A Survey

    Full text link
    An integral part of seamless human-human communication is engagement, the process by which two or more participants establish, maintain, and end their perceived connection. Therefore, to develop successful human-centered human-machine interaction applications, automatic engagement inference is one of the tasks required to achieve engaging interactions between humans and machines, and to make machines attuned to their users, hence enhancing user satisfaction and technology acceptance. Several factors contribute to engagement state inference, which include the interaction context and interactants' behaviours and identity. Indeed, engagement is a multi-faceted and multi-modal construct that requires high accuracy in the analysis and interpretation of contextual, verbal and non-verbal cues. Thus, the development of an automated and intelligent system that accomplishes this task has been proven to be challenging so far. This paper presents a comprehensive survey on previous work in engagement inference for human-machine interaction, entailing interdisciplinary definition, engagement components and factors, publicly available datasets, ground truth assessment, and most commonly used features and methods, serving as a guide for the development of future human-machine interaction interfaces with reliable context-aware engagement inference capability. An in-depth review across embodied and disembodied interaction modes, and an emphasis on the interaction context of which engagement perception modules are integrated sets apart the presented survey from existing surveys
    • …
    corecore