
    Application of Computer Vision and Mobile Systems in Education: A Systematic Review

    The computer vision industry has experienced a significant surge in growth, resulting in numerous promising breakthroughs in computer intelligence. The present review paper outlines the advantages and potential future implications of utilizing this technology in education. A total of 84 research publications have been thoroughly scrutinized and analyzed. The study revealed that computer vision technology integrated with a mobile application is exceptionally useful for monitoring students' perceptions and mitigating academic dishonesty. Additionally, it facilitates the digitization of handwritten scripts for plagiarism detection and automates attendance tracking to optimize valuable classroom time. Furthermore, several potential applications of computer vision technology have been proposed for educational institutions to enhance students' learning processes in various faculties, such as engineering and medical science. Moreover, the technology can aid in creating a safer campus environment by automatically detecting abnormal activities such as ragging, bullying, and harassment.

    Tutor In-sight: Guiding and Visualizing Students' Attention with Mixed Reality Avatar Presentation Tools

    Remote conferencing systems are increasingly used to supplement or even replace in-person teaching. However, prevailing conferencing systems restrict the teacher's representation to a webcam live-stream, hamper the teacher's use of body language, and reduce students' sense of co-presence and participation. While Virtual Reality (VR) systems may increase student engagement, the teacher may not have the time or expertise to conduct the lecture in VR. To address this issue and bridge the requirements of students and teachers, we have developed Tutor In-sight, a Mixed Reality (MR) avatar augmented into the student's workspace based on four design requirements derived from the existing literature, namely: integration of virtual and physical space, improved teacher co-presence through an avatar, attention direction with auto-generated body language, and a usable workflow for teachers. Two user studies were conducted from the perspectives of students and teachers to determine the advantages of Tutor In-sight in comparison to two existing conferencing systems, Zoom (video-based) and Mozilla Hubs (VR-based). The participants of both studies favoured Tutor In-sight; among other results, this main finding indicates that Tutor In-sight satisfied the needs of both teachers and students. In addition, the participants' feedback was used to empirically determine the four main teacher requirements and the four main student requirements in order to improve the future design of MR educational tools.
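    The abstract does not describe how the auto-generated body language is computed. As a rough, hypothetical illustration of directing student attention, the following sketch computes the yaw and pitch needed to orient an avatar joint toward a target in the student's workspace; the function name, coordinate convention, and example positions are assumptions, not details from Tutor In-sight.

```python
import numpy as np

def look_at_angles(avatar_pos, target_pos):
    """Return (yaw, pitch) in degrees orienting a joint at avatar_pos toward target_pos.
    Assumes a right-handed frame with x = right, y = up, z = forward (a convention
    chosen for this sketch, not taken from the paper)."""
    dx, dy, dz = np.asarray(target_pos, dtype=float) - np.asarray(avatar_pos, dtype=float)
    yaw = np.degrees(np.arctan2(dx, dz))                  # rotation about the up axis
    pitch = np.degrees(np.arctan2(dy, np.hypot(dx, dz)))  # elevation above the horizontal plane
    return yaw, pitch

# Example: avatar head at the origin, target object 0.5 m right, 0.2 m below, 1 m ahead.
yaw, pitch = look_at_angles([0.0, 0.0, 0.0], [0.5, -0.2, 1.0])
print(f"yaw {yaw:.1f} deg, pitch {pitch:.1f} deg")
```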

    A framework for realistic 3D tele-immersion

    Meeting, socializing and conversing online with a group of people using teleconferencing systems is still quite different from the experience of meeting face to face. We are abruptly aware that we are online and that the people we are engaging with are not in close proximity, analogous to how talking on the telephone does not replicate the experience of talking in person. Several causes for these differences have been identified, and we propose inspiring and innovative solutions to these hurdles in an attempt to provide a more realistic, believable and engaging online conversational experience. We present the distributed and scalable framework REVERIE, which provides a balanced mix of these solutions. Applications built on top of the REVERIE framework will be able to provide interactive, immersive, photo-realistic experiences to a multitude of users, experiences that will feel much closer to a face-to-face meeting than what is offered by conventional teleconferencing systems.

    An interval type-2 fuzzy logic based system for improved instruction within intelligent e-learning platforms

    E-learning is becoming increasingly popular. However, for such platforms (where the students and tutors are geographically separated), it is necessary to estimate the degree of students' engagement with the course contents. Such feedback is highly important and useful for assessing the teaching quality and adjusting the teaching delivery in large-scale online learning platforms. When the number of attendees is large, it is essential to obtain overall engagement feedback, but it is also challenging to do so because of the high levels of uncertainty associated with the environments and students. To handle such uncertainties, we present a type-2 fuzzy logic based system using visual RGB-D features, including head pose direction and facial expressions captured from a low-cost but robust 3D camera (Kinect v2), to estimate the engagement degree of the students for both remote and on-site education. This system enriches another self-learning type-2 fuzzy logic system which provides the instructors with suggestions to vary their teaching methods to suit the level of the course students and improve the course instruction and delivery. The proposed dynamic e-learning environment involves on-site students, distance students, and a teacher who delivers the lecture to all attending on-site and remote students. The rules are learned from the students' behavior, and the system is continuously updated to give the teacher the ability to adapt the instructional approach to varied levels of learner engagement. The efficiency of the proposed system has been evaluated through various real-world experiments in the University of Essex iClassroom on a sample of thirty students and six teachers. These experiments demonstrate the efficiency of the proposed interval type-2 fuzzy logic based system in handling the faced uncertainties and producing improved average learner engagement when compared to type-1 fuzzy systems and non-adaptive systems.
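    The abstract does not give the rule base or membership functions used in the iClassroom experiments. The sketch below only illustrates the general mechanics of interval type-2 fuzzy inference (a footprint of uncertainty, interval firing strengths, and a Nie-Tan style type reduction) on an invented, normalized attention score; all sets, rules, and consequent centroids are assumptions for illustration.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def it2(x, lower_params, upper_params):
    """Interval type-2 membership grade: [lower, upper] bounds of the footprint of uncertainty."""
    return tri(x, *lower_params), tri(x, *upper_params)

# Invented sets over a normalized "attention toward content" score in [0, 1].
TOWARD = ((0.3, 1.0, 1.7), (0.1, 1.0, 1.9))    # (lower MF, upper MF) for "toward the content"
AWAY   = ((-0.7, 0.0, 0.7), (-0.9, 0.0, 0.9))  # (lower MF, upper MF) for "away from the content"

RULES = [
    (TOWARD, 0.9),  # IF attention toward content THEN engagement high (consequent centroid 0.9)
    (AWAY,   0.2),  # IF attention away from content THEN engagement low (consequent centroid 0.2)
]

def engagement(score):
    """Collapse each rule's firing interval to its midpoint (Nie-Tan style type reduction),
    then take the weighted average of the rule consequents."""
    firings, consequents = [], []
    for (lower_params, upper_params), centroid in RULES:
        lo, up = it2(score, lower_params, upper_params)
        firings.append(0.5 * (lo + up))
        consequents.append(centroid)
    firings = np.array(firings)
    return float(np.dot(firings, consequents) / firings.sum()) if firings.sum() > 0 else 0.0

print(engagement(0.8))   # mostly "toward content" -> engagement close to 0.9
print(engagement(0.15))  # mostly "away"           -> engagement close to 0.2
```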

    MATT: Multimodal Attention Level Estimation for e-learning Platforms

    This work presents a new multimodal system for remote attention level estimation based on multimodal face analysis. Our multimodal approach uses different parameters and signals obtained from behavior and physiological processes that have been related to modeling cognitive load, such as facial gestures (e.g., blink rate, facial action units) and user actions (e.g., head pose, distance to the camera). The multimodal system uses the following modules based on Convolutional Neural Networks (CNNs): eye blink detection, head pose estimation, facial landmark detection, and facial expression features. First, we individually evaluate the proposed modules on the task of estimating the student's attention level captured during online e-learning sessions. For that, we train binary classifiers (high or low attention) based on Support Vector Machines (SVMs) for each module. Second, we find out to what extent multimodal score-level fusion improves the attention level estimation. The experimental framework uses the mEBAL database, a public multimodal database for attention level estimation obtained in an e-learning environment, which contains data from 38 users conducting several e-learning tasks of variable difficulty (creating changes in student cognitive load). Comment: Preprint of the paper presented at the Workshop on Artificial Intelligence for Education (AI4EDU) of AAAI 202
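    As a minimal, hypothetical sketch of the score-level fusion step described above, the following code trains one SVM per modality with probability outputs and averages the per-modality P(high attention) scores. The feature dimensions, the synthetic data, and the unweighted average are assumptions; the real system extracts its features with the CNN-based modules listed in the abstract.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for per-module features (assumed dimensionalities).
n = 400
modalities = {
    "blink":      rng.normal(size=(n, 3)),   # e.g. blink-rate statistics
    "head_pose":  rng.normal(size=(n, 6)),   # e.g. yaw/pitch/roll statistics
    "expression": rng.normal(size=(n, 17)),  # e.g. action-unit intensities
}
y = rng.integers(0, 2, size=n)               # 1 = high attention, 0 = low attention

# One binary SVM per modality, with probability outputs for score-level fusion.
classifiers = {
    name: make_pipeline(StandardScaler(), SVC(probability=True)).fit(X, y)
    for name, X in modalities.items()
}

def fused_attention_score(samples):
    """Average the per-modality P(high attention) scores (unweighted score-level fusion)."""
    scores = [clf.predict_proba(samples[name])[:, 1] for name, clf in classifiers.items()]
    return np.mean(scores, axis=0)

print(fused_attention_score({name: X[:5] for name, X in modalities.items()}))
```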

    Hmm, You Seem Confused! Tracking Interlocutor Confusion for Situated Task-Oriented HRI

    Our research seeks to develop a long-lasting and high-quality engagement between the user and the social robot, which in turn requires a more sophisticated alignment of the user and the system than is currently commonly available. Close monitoring of interlocutors' states, and we argue their confusion state in particular, together with adjusting dialogue policies based on this state, is needed for successful joint activity. In this paper, we present an initial study of human-robot conversation scenarios using a Pepper robot to investigate the confusion states of users. A Wizard-of-Oz (WoZ) HRI experiment is illustrated in detail, with stimulus strategies designed to trigger confused states in interlocutors. For the collected data, we estimated emotions, head pose, and eye gaze, and these features were analysed against the silence duration of the speech data and the participants' post-study self-reported confusion states. Our analysis found a significant relationship between confusion states and most of these features. We see these results as being particularly significant for multimodal situated dialogues for human-robot interaction and beyond.
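    The abstract does not state which statistical tests were used. As one common way to probe such a relationship, the sketch below applies a Mann-Whitney U test and a point-biserial correlation to synthetic silence-duration data grouped by self-reported confusion; the data and distributions are illustrative only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic stand-ins: per-trial silence duration (seconds) grouped by the
# participant's post-study self-report (confused vs. not confused).
silence_confused     = rng.gamma(shape=4.0, scale=0.6, size=40)  # assumed to tend longer
silence_not_confused = rng.gamma(shape=2.0, scale=0.5, size=40)

# Non-parametric test of whether silence duration differs between the two states.
u_stat, p_value = stats.mannwhitneyu(silence_confused, silence_not_confused,
                                     alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_value:.4f}")

# Effect direction as a point-biserial correlation between the binary label and duration.
labels = np.r_[np.ones_like(silence_confused), np.zeros_like(silence_not_confused)]
durations = np.r_[silence_confused, silence_not_confused]
r, p = stats.pointbiserialr(labels, durations)
print(f"point-biserial r = {r:.2f}, p = {p:.4f}")
```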

    Detecting Interlocutor Confusion in Situated Human-Avatar Dialogue: A Pilot Study

    In order to enhance levels of engagement with conversational systems, our long-term research goal seeks to monitor the confusion state of a user and adapt dialogue policies in response to such user confusion states. To this end, in this paper we present our initial research centred on a user-avatar dialogue scenario that we have developed to study the manifestation of confusion and, in the long term, its mitigation. We present a new definition of confusion that is particularly tailored to the requirements of intelligent conversational system development for task-oriented dialogue. We also present the details of our Wizard-of-Oz based data collection scenario, wherein users interacted with a conversational avatar and were presented with stimuli that were in some cases designed to invoke a confused state in the user. Post-study analysis of this data is also presented. Here, three pre-trained deep learning models were deployed to estimate base emotion, head pose and eye gaze. Despite a small pilot study group, our analysis demonstrates a significant relationship between these indicators and confusion states. We see this as a useful step forward in the automated analysis of the pragmatics of dialogue.