
    A review on data fusion in multimodal learning analytics and educational data mining

    New educational models such as smart learning environments use digital and context-aware devices to facilitate the learning process. In this new educational scenario, a huge quantity of multimodal student data from a variety of different sources can be captured, fused, and analyzed. This offers researchers and educators a unique opportunity to discover new knowledge, better understand the learning process, and intervene if necessary. However, data fusion approaches and techniques must be applied correctly in order to combine the various sources of multimodal learning analytics (MLA). These sources or modalities in MLA include audio, video, electrodermal activity data, eye tracking, user logs, and click-stream data, but also learning artifacts and more natural human signals such as gestures, gaze, speech, or writing. This survey introduces data fusion in learning analytics (LA) and educational data mining (EDM) and how these data fusion techniques have been applied in smart learning. It shows the current state of the art by reviewing the main publications, the main types of fused educational data, and the data fusion approaches and techniques used in EDM/LA, as well as the main open problems, trends, and challenges in this specific research area.
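
    As a loose illustration of the feature-level (early) fusion the survey covers, the sketch below concatenates standardized per-modality feature matrices; the modality names and dimensions are made up for the example, not drawn from the review.

```python
# Minimal sketch of feature-level (early) fusion across modalities, assuming
# each modality has already been reduced to one feature vector per student
# or session. Modality names and sizes are illustrative only.
import numpy as np
from sklearn.preprocessing import StandardScaler

def early_fusion(modalities):
    """Concatenate standardized per-modality feature matrices (n_samples x d_i)."""
    scaled = [StandardScaler().fit_transform(m) for m in modalities]
    return np.hstack(scaled)

# Synthetic gaze, log, and audio features for 4 students.
rng = np.random.default_rng(0)
gaze = rng.normal(size=(4, 3))
logs = rng.normal(size=(4, 5))
audio = rng.normal(size=(4, 2))
fused = early_fusion([gaze, logs, audio])
print(fused.shape)  # (4, 10)
```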

    Measuring attention using Microsoft Kinect

    The transfer of knowledge between individuals is increasingly achieved with the aid of interfaces or computerized training applications. However, computer-based training currently lacks the ability to monitor human behavioral changes and respond to them accordingly. This study examines the ability to predict user attention using features of body posture and head pose. Predictive ability is assessed by analyzing the relationship between the measured posture features and common objective measures of attention, such as reaction time and reaction time variance. Subjects were asked to participate in a series of sustained attention tasks while aspects of body movement and positioning were recorded using a Microsoft Kinect. Results showed support for identifiable patterns of behavior associated with attention, while also suggesting a complex inter-relationship among the measured features and their susceptibility to environmental conditions.
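
    A hedged sketch of the kind of analysis described above, relating a hypothetical posture feature to reaction time variance; the feature name and the synthetic data are assumptions, not the study's measurements.

```python
# Illustrative check (not the study's code) of how a posture feature could be
# related to an objective attention measure such as reaction time variance.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
head_pitch_std = rng.normal(5.0, 1.5, size=30)  # hypothetical per-trial posture feature
reaction_time_var = 0.02 * head_pitch_std + rng.normal(0.0, 0.02, size=30)  # synthetic

r, p = pearsonr(head_pitch_std, reaction_time_var)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
```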

    Machine learning applied to student attentiveness detection: Using emotional and non-emotional measures

    Elbawab, M., & Henriques, R. (2023). Machine learning applied to student attentiveness detection: Using emotional and non-emotional measures. Education and Information Technologies, 1-21. https://doi.org/10.1007/s10639-023-11814-5 --- Open access funding provided by FCT|FCCN (b-on). This work was supported by national funds through FCT (Fundação para a Ciência e a Tecnologia) under project UIDB/04152/2020—Centro de Investigação em Gestão de Informação (MagIC)/NOVA IMS. Electronic learning (e-learning) is considered the new norm of learning. One of the significant drawbacks of e-learning in comparison to the traditional classroom is that teachers cannot monitor the students' attentiveness. Previous literature used physical facial features or emotional states to detect attentiveness. Other studies proposed combining physical and emotional facial features; however, a mixed model that only used a webcam was not tested. The study objective is to develop a machine learning (ML) model that automatically estimates students' attentiveness during e-learning classes using only a webcam. The model would help in evaluating teaching methods for e-learning. This study collected videos from seven students. The webcam of a personal computer is used to obtain a video, from which we build a feature set that characterizes a student's physical and emotional state based on their face. This characterization includes eye aspect ratio (EAR), yawn aspect ratio (YAR), head pose, and emotional states. A total of eleven variables are used in the training and validation of the model. ML algorithms are used to estimate individual students' attention levels. The ML models tested are decision trees, random forests, support vector machines (SVM), and extreme gradient boosting (XGBoost). Human observers' estimation of attention level is used as a reference. Our best attention classifier is XGBoost, which achieved an average accuracy of 80.52%, with an AUROC OVR of 92.12%. The results indicate that a combination of emotional and non-emotional measures can generate a classifier with an accuracy comparable to other attentiveness studies. The study would also help assess e-learning lectures through students' attentiveness, and hence will assist in developing e-learning lectures by generating an attentiveness report for the tested lecture.
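
    As an illustration of one of the non-emotional features named above, the sketch below computes the standard eye aspect ratio (EAR) from six eye landmarks; landmark detection itself (e.g. with dlib or MediaPipe) is assumed to happen elsewhere, and the sample points are synthetic.

```python
# Eye aspect ratio (EAR) over the standard six eye landmarks p1..p6:
# EAR = (||p2 - p6|| + ||p3 - p5||) / (2 * ||p1 - p4||).
import numpy as np

def eye_aspect_ratio(eye):
    """eye: array of shape (6, 2) with landmarks ordered p1..p6."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horizontal = np.linalg.norm(p1 - p4)
    return vertical / (2.0 * horizontal)

# Synthetic open-eye landmarks for illustration.
eye = np.array([[0, 0], [2, 2], [4, 2], [6, 0], [4, -2], [2, -2]], dtype=float)
print(round(eye_aspect_ratio(eye), 3))  # ~0.667; low values indicate a closed eye
```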

    Facial Expressions of Sentence Comprehension

    Understanding facial expressions allows access to one's intentional and affective states. Building on findings in psychology and neuroscience, in which physical behaviors of the face are linked to emotional states, this paper studies sentence comprehension as shown by facial expressions. In our experiments, participants took part in a roughly 30-minute computer-mediated task, in which they were asked to answer either "true" or "false" to knowledge-based questions and were then immediately given feedback of "correct" or "incorrect". Their faces, recorded during the task using the Kinect v2 device, are later used to identify the level of comprehension shown by their expressions. To achieve this, SVM and Random Forest classifiers are employed with facial appearance information extracted using a spatiotemporal local descriptor named LPQ-TOP. Results on online sentence comprehension show that facial dynamics are promising for understanding cognitive states of the mind.
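
    A minimal sketch of the classification step only, assuming LPQ-TOP descriptors have already been extracted; the descriptor dimensionality and labels below are synthetic placeholders, not the paper's data.

```python
# SVM over pre-computed spatiotemporal appearance descriptors; the LPQ-TOP
# extraction is assumed and replaced here by random vectors.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 768))    # one descriptor per recorded clip (synthetic)
y = rng.integers(0, 2, size=60)   # comprehension label per clip (synthetic)

clf = SVC(kernel="rbf", C=1.0)
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())
```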

    MATT: Multimodal Attention Level Estimation for e-learning Platforms

    This work presents a new multimodal system for remote attention level estimation based on multimodal face analysis. Our multimodal approach uses different parameters and signals obtained from behavior and physiological processes that have been related to modeling cognitive load, such as facial gestures (e.g., blink rate, facial action units) and user actions (e.g., head pose, distance to the camera). The multimodal system uses the following modules based on Convolutional Neural Networks (CNNs): eye blink detection, head pose estimation, facial landmark detection, and facial expression features. First, we individually evaluate the proposed modules on the task of estimating the student's attention level captured during online e-learning sessions. For that, we trained binary classifiers (high or low attention) based on Support Vector Machines (SVM) for each module. Second, we find out to what extent multimodal score-level fusion improves the attention level estimation. The experimental framework uses the mEBAL database, a public multimodal database for attention level estimation obtained in an e-learning environment, which contains data from 38 users while conducting several e-learning tasks of variable difficulty (creating changes in student cognitive loads). Comment: Preprint of the paper presented to the Workshop on Artificial Intelligence for Education (AI4EDU) of AAAI 202
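
    The score-level fusion mentioned above could look roughly like the sketch below, which averages per-module attention scores with optional weights; the module names, scores, and weights are illustrative assumptions, not values from the paper.

```python
# Weighted score-level fusion of per-module attention scores (all assumed values).
import numpy as np

def fuse_scores(module_scores, weights=None):
    """module_scores: dict of module name -> high-attention score in [0, 1]."""
    names = list(module_scores)
    s = np.array([module_scores[n] for n in names], dtype=float)
    w = np.ones_like(s) if weights is None else np.array([weights[n] for n in names], dtype=float)
    return float(np.dot(w, s) / w.sum())

scores = {"blink": 0.35, "head_pose": 0.62, "landmarks": 0.55, "expression": 0.48}
print(fuse_scores(scores))                                                   # unweighted mean
print(fuse_scores(scores, {"blink": 2, "head_pose": 1, "landmarks": 1, "expression": 1}))
```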

    Unobtrusive Assessment Of Student Engagement Levels In Online Classroom Environment Using Emotion Analysis

    Measuring student engagement has emerged as a significant factor in the process of learning and a good indicator of the knowledge retention capacity of the student. As synchronous online classes have become more prevalent in recent years, gauging a student's attention level is more critical in validating the progress of every student in an online classroom environment. This paper details a study on profiling student attentiveness across different gradients of engagement level using multiple machine learning models. Results from the high-accuracy model and the confidence scores obtained from the cloud-based computer vision platform Amazon Rekognition were then used to statistically validate any correlation between student attentiveness and emotions. This statistical analysis helps to identify the significant emotions that are essential in gauging various engagement levels. The study identified that emotions such as calm, happiness, surprise, and fear are critical in gauging a student's attention level. These findings help in the earlier detection of students with lower attention levels, consequently helping instructors focus their support and guidance on the students in need, leading to a better online learning environment.
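
    A small sketch of how per-emotion confidence scores could be pulled from Amazon Rekognition for a single frame, as a starting point for the kind of emotion-based engagement analysis described above; it assumes valid AWS credentials, and the image path is a placeholder.

```python
# Fetch per-emotion confidence for one webcam frame with Amazon Rekognition.
import boto3

def frame_emotions(image_path, region="us-east-1"):
    client = boto3.client("rekognition", region_name=region)
    with open(image_path, "rb") as f:
        response = client.detect_faces(Image={"Bytes": f.read()}, Attributes=["ALL"])
    if not response["FaceDetails"]:
        return {}
    # Map emotion type (e.g. CALM, HAPPY, SURPRISED, FEAR) to its confidence score.
    return {e["Type"]: e["Confidence"] for e in response["FaceDetails"][0]["Emotions"]}

if __name__ == "__main__":
    print(frame_emotions("student_frame.jpg"))  # placeholder frame path
```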

    The Multimodal Tutor: Adaptive Feedback from Multimodal Experiences

    This doctoral thesis describes the journey of ideation, prototyping, and empirical testing of the Multimodal Tutor, a system designed to provide digital feedback that supports psychomotor skill acquisition using learning and multimodal data capturing. The feedback is given in real time with machine-driven assessment of the learner's task execution. The predictions are tailored by supervised machine learning models trained with human-annotated samples. The main contributions of this thesis are: a literature survey on multimodal data for learning, a conceptual model (the Multimodal Learning Analytics Model), a technological framework (the Multimodal Pipeline), a data annotation tool (the Visual Inspection Tool), and a case study in Cardiopulmonary Resuscitation training (the CPR Tutor). The CPR Tutor generates real-time, adaptive feedback using kinematic and myographic data and neural networks.
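
    As a loose, non-authoritative illustration of the supervised-learning idea behind the CPR Tutor (not its actual pipeline), the sketch below trains a small neural network on fused kinematic and myographic features against human annotations; all data, dimensions, and labels are synthetic.

```python
# Toy supervised model over fused kinematic + myographic features.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(3)
kinematic = rng.normal(size=(200, 12))   # e.g. joint-position statistics per window (synthetic)
myographic = rng.normal(size=(200, 8))   # e.g. EMG channel statistics per window (synthetic)
X = np.hstack([kinematic, myographic])
y = rng.integers(0, 2, size=200)         # human-annotated correct / incorrect execution (synthetic)

model = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
model.fit(X, y)
print(model.score(X, y))
```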

    Recognizing Multidimensional Engagement of E-learners Based on Multi-channel Data in E-learning Environment

    With recent advances in MOOCs, current e-learning systems have the advantage of alleviating barriers caused by time differences and the geographical separation between teachers and students. However, there is a 'lack of supervision' problem: an e-learner's learning unit state (LUS) cannot be supervised automatically. In this paper, we present a fusion framework considering three channels of data sources: 1) videos/images from a camera, 2) eye movement information tracked by a low-resolution eye tracker, and 3) mouse movement. Based on these data modalities, we propose a novel approach of multi-channel data fusion to explore learning unit state recognition. We also propose a method to build a learning state recognition model that avoids manually labeling image data. The experiments were carried out on our designed online learning prototype system, and we chose CART, Random Forest, and GBDT regression models to predict the e-learner's learning state. The results show that the multi-channel data fusion model has better recognition performance than the single-channel models. In addition, the best recognition performance is reached when image, eye movement, and mouse movement features are fused. Comment: 4 pages, 4 figures, 2 tables
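
    The single-channel versus fused-channel comparison described above might be prototyped roughly as below, using a scikit-learn GBDT regressor as a stand-in (CART and Random Forest would slot in the same way); all features and learning-state scores are synthetic assumptions.

```python
# Compare per-channel models against a fused-feature model for state regression.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
image = rng.normal(size=(120, 10))   # image/video channel features (synthetic)
eye = rng.normal(size=(120, 6))      # eye-movement channel features (synthetic)
mouse = rng.normal(size=(120, 4))    # mouse-movement channel features (synthetic)
state = rng.random(120)              # learning unit state score (synthetic)

channels = {"image": image, "eye": eye, "mouse": mouse,
            "fused": np.hstack([image, eye, mouse])}
for name, X in channels.items():
    r2 = cross_val_score(GradientBoostingRegressor(random_state=0), X, state, cv=5).mean()
    print(f"{name:>6}: mean R^2 = {r2:.3f}")
```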

    Modelling collaborative problem-solving competence with transparent learning analytics: is video data enough?

    In this study, we describe the results of our research to model collaborative problem-solving (CPS) competence based on analytics generated from video data. We collected ~500 minutes of video data from 15 groups of 3 students working to solve design problems collaboratively. Initially, with the help of OpenPose, we automatically generated frequency metrics, such as the number of faces in the screen, and distance metrics, such as the distance between bodies. Based on these metrics, we built decision trees to predict students' listening, watching, making, and speaking behaviours, as well as the students' CPS competence. Our results provide useful decision rules mined from analytics of video data which can be used to inform teacher dashboards. Although the accuracy and recall values of the models are inferior to previous machine learning work that utilizes multimodal data, the transparent nature of the decision trees provides opportunities for explainable analytics for teachers and learners. This can give teachers and learners more agency and can therefore lead to easier adoption. We conclude the paper with a discussion of the value and limitations of our approach.
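
    A brief sketch of the transparent-model idea: a shallow decision tree over OpenPose-style frequency and distance metrics whose rules can be printed for a teacher dashboard; the feature names and data are illustrative, not the study's.

```python
# Shallow, inspectable decision tree over synthetic video-derived metrics.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(5)
features = ["faces_in_screen", "hands_on_desk", "mean_body_distance"]  # assumed names
X = rng.normal(size=(90, len(features)))
y = rng.integers(0, 2, size=90)   # e.g. speaking vs. not speaking (synthetic labels)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=features))  # human-readable decision rules
```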