Subjective Annotations for Vision-Based Attention Level Estimation
Attention level estimation systems have high potential in many use cases, such as human-robot interaction, driver modeling, and smart home systems, since being able to measure a person's attention level opens the possibility of natural interaction between humans and computers. The topic of estimating a human's visual focus of attention has recently been addressed actively in the field of HCI. However, most of these previous works do not consider attention as a subjective, cognitive attentive state. New research in the field also faces the lack of datasets annotated with attention levels in a given context. The novelty of our work is two-fold: first, we introduce a new annotation framework that tackles the subjective nature of attention level and use it to annotate more than 100,000 images with three attention levels; second, we introduce a novel method to estimate attention levels, relying purely on geometric features extracted from RGB and depth images, and evaluate it with a deep learning fusion framework. The system achieves an overall accuracy of 80.02%. Our framework and attention level annotations are made publicly available.
Comment: 14th International Conference on Computer Vision Theory and Applications
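As a hedged illustration of the pipeline this abstract describes (geometric features extracted separately from RGB and depth images, combined in a deep learning fusion framework for three attention levels), the following PyTorch sketch shows a minimal two-branch late-fusion classifier. The feature dimensions, layer sizes, and tensor names are assumptions for illustration, not the authors' architecture.

    import torch
    import torch.nn as nn

    class GeometricFusionNet(nn.Module):
        # Two small branches (RGB-derived and depth-derived geometric
        # features) fused by concatenation before a 3-way classifier.
        def __init__(self, rgb_dim=32, depth_dim=32, hidden=64, n_levels=3):
            super().__init__()
            self.rgb_branch = nn.Sequential(nn.Linear(rgb_dim, hidden), nn.ReLU())
            self.depth_branch = nn.Sequential(nn.Linear(depth_dim, hidden), nn.ReLU())
            self.classifier = nn.Linear(2 * hidden, n_levels)

        def forward(self, rgb_feats, depth_feats):
            fused = torch.cat([self.rgb_branch(rgb_feats),
                               self.depth_branch(depth_feats)], dim=1)
            return self.classifier(fused)  # logits over {low, medium, high}

    model = GeometricFusionNet()
    logits = model(torch.randn(8, 32), torch.randn(8, 32))  # batch of 8 samples
    predicted_level = logits.argmax(dim=1)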
MATT: Multimodal Attention Level Estimation for e-learning Platforms
This work presents a new multimodal system for remote attention level estimation based on multimodal face analysis. Our approach uses different parameters and signals obtained from behavioral and physiological processes that have been related to the modeling of cognitive load, such as face gestures (e.g., blink rate, facial action units) and user actions (e.g., head pose, distance to the camera). The multimodal system uses the following modules based on Convolutional Neural Networks (CNNs): eye blink detection, head pose estimation, facial landmark detection, and facial expression features. First, we individually evaluate the proposed modules on the task of estimating the student's attention level captured during online e-learning sessions. For that, we train binary classifiers (high or low attention) based on Support Vector Machines (SVMs) for each module. Second, we analyze to what extent multimodal score-level fusion improves attention level estimation. The experimental framework uses the mEBAL database, a public multimodal database for attention level estimation acquired in an e-learning environment, which contains data from 38 users conducting several e-learning tasks of variable difficulty (inducing changes in the students' cognitive load).
Comment: Preprint of the paper presented at the Workshop on Artificial Intelligence for Education (AI4EDU) of AAAI 2023
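A minimal sketch of the score-level fusion step described above: one binary SVM per module, with the per-module decision scores averaged into a single attention score. The module names, feature dimensions, and the use of scikit-learn's SVC decision_function are illustrative assumptions, not the paper's exact setup.

    import numpy as np
    from sklearn.svm import SVC

    # Stand-in per-module feature matrices for 200 labeled windows
    # (0 = low attention, 1 = high attention).
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, 200)
    modules = {name: rng.normal(size=(200, 16))
               for name in ["blink", "head_pose", "landmarks", "expression"]}

    # One binary SVM (high vs. low attention) per module.
    svms = {name: SVC().fit(X, y) for name, X in modules.items()}

    def fused_score(per_module_features):
        # Score-level fusion: average the signed SVM decision scores.
        scores = [svms[name].decision_function(feats.reshape(1, -1))[0]
                  for name, feats in per_module_features.items()]
        return float(np.mean(scores))  # > 0 suggests high attention

    sample = {name: X[0] for name, X in modules.items()}
    print("fused attention score:", fused_score(sample))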
mEBAL: A Multimodal Database for Eye Blink Detection and Attention Level Estimation
This work presents mEBAL, a multimodal database for eye blink detection and attention level estimation. Eye blink frequency is related to cognitive activity, and automatic eye blink detectors have been proposed for many tasks, including attention level estimation, analysis of neurodegenerative diseases, deception recognition, driver fatigue detection, and face anti-spoofing. However, most existing databases and algorithms in this area are limited to experiments involving only a few hundred samples and individual sensors such as face cameras. The proposed mEBAL improves on previous databases in terms of acquisition sensors and number of samples. In particular, three different sensors are considered simultaneously: Near-Infrared (NIR) and RGB cameras to capture face gestures, and an Electroencephalography (EEG) band to capture the cognitive activity of the user and blinking events. Regarding its size, mEBAL comprises 6,000 samples and the corresponding attention levels from 38 different students conducting a number of e-learning tasks of varying difficulty. In addition to presenting mEBAL, we also include preliminary experiments on: i) eye blink detection using Convolutional Neural Networks (CNNs) with the facial images, and ii) attention level estimation of the students based on their eye blink frequency.
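A hedged sketch of preliminary experiment ii) above: given detected blink timestamps, the blink rate over a time window is thresholded into an attention level. The window length, the threshold, and the assumed inverse relation between blink rate and attention are illustrative choices, not the paper's calibrated model.

    import numpy as np

    def blink_rate(blink_times, window_s=60.0):
        # blink_times: timestamps (in seconds) of blinks detected by the
        # CNN within one analysis window.
        return len(blink_times) * 60.0 / window_s  # blinks per minute

    def attention_from_blinks(blink_times, low_rate=10.0):
        # Assumed heuristic: sparse blinking during a visual task is read
        # as high attention, frequent blinking as low attention.
        return "high" if blink_rate(blink_times) <= low_rate else "low"

    blinks = np.array([2.1, 9.8, 31.4, 55.0])  # 4 blinks in a 60 s window
    print(attention_from_blinks(blinks))  # -> "high" (4 blinks/min <= 10)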
mEBAL2 Database and Benchmark: Image-based Multispectral Eyeblink Detection
This work introduces a new multispectral database and novel approaches for eyeblink detection in RGB and Near-Infrared (NIR) individual images. Our contributed dataset (mEBAL2, multimodal Eye Blink and Attention Level estimation, Version 2) is the largest existing eyeblink database, representing a great opportunity to improve data-driven multispectral approaches for blink detection and related applications (e.g., attention level estimation and presentation attack detection in face biometrics). mEBAL2 includes 21,100 image sequences from 180 different students (more than 2 million labeled images in total), captured while they conducted a number of e-learning tasks of varying difficulty or took a real course on HTML initiation through the edX MOOC platform. mEBAL2 uses multiple sensors, including two Near-Infrared (NIR) cameras and one RGB camera to capture facial gestures during the execution of the tasks, as well as an Electroencephalogram (EEG) band to record the cognitive activity of the user and blinking events. Furthermore, this work proposes a Convolutional Neural Network architecture as a benchmark for blink detection on mEBAL2, with performance up to 97%. Different training methodologies are implemented using the RGB spectrum, the NIR spectrum, and the combination of both to enhance the performance of existing eyeblink detectors. We demonstrate that combining NIR and RGB images during training improves the performance of RGB eyeblink detectors (i.e., detection based only on an RGB image). Finally, the generalization capacity of the proposed eyeblink detectors is validated in wilder and more challenging environments such as the HUST-LEBW dataset, showing the usefulness of mEBAL2 for training a new generation of data-driven approaches to eyeblink detection.
Comment: This paper is under consideration at Pattern Recognition Letters
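The cross-spectral finding above (training on NIR plus RGB improves an RGB-only detector) can be illustrated with a minimal sketch: one blink classifier trained on a pooled dataset drawn from both spectra, then deployed on RGB alone. Rendering both spectra as single-channel eye crops so they share one input layer, along with all shapes and hyperparameters, is an assumption for illustration, not the paper's benchmark.

    import torch
    from torch import nn
    from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

    # Stand-in tensors: single-channel 64x64 eye crops, label 1 = blink.
    rgb_ds = TensorDataset(torch.randn(256, 1, 64, 64), torch.randint(0, 2, (256,)))
    nir_ds = TensorDataset(torch.randn(256, 1, 64, 64), torch.randint(0, 2, (256,)))

    # Mixed-spectrum training set: RGB and NIR samples pooled together.
    loader = DataLoader(ConcatDataset([rgb_ds, nir_ds]), batch_size=32, shuffle=True)

    model = nn.Sequential(
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
        nn.Flatten(), nn.Linear(8 * 16 * 16, 2))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for x, y in loader:  # one epoch over the pooled spectra
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    # At test time the detector sees RGB crops only.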
M2LADS: A System for Generating MultiModal Learning Analytics Dashboards
In this article, we present a Web-based system called M2LADS, which supports the integration and visualization of multimodal data recorded during learning sessions in a MOOC in the form of Web-based dashboards. Based on the edBB platform, the multimodal data gathered contain biometric and behavioral signals, including electroencephalogram data to measure learners' cognitive attention, heart rate for affective measures, and visual attention from the video recordings. Additionally, learners' static background data and their learning performance measures are tracked using LOGCE and MOOC tracking logs, respectively, and both are included in the Web-based system. M2LADS provides opportunities to capture learners' holistic experience during their interactions with the MOOC, which can in turn be used to improve their learning outcomes through feedback visualizations and interventions, as well as to enhance learning analytics models and improve the open content of the MOOC.
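A minimal sketch of the kind of multimodal integration a dashboard like this requires: aligning independently sampled signals (e.g., EEG attention and heart rate) onto one timeline before visualization. The use of pandas merge_asof and the column names are illustrative assumptions, not M2LADS internals.

    import pandas as pd

    # Stand-in streams with different sampling instants (seconds).
    eeg = pd.DataFrame({"t": [0.0, 1.0, 2.0, 3.0], "attention": [55, 60, 58, 70]})
    hr = pd.DataFrame({"t": [0.5, 2.5], "heart_rate": [72, 75]})

    # Align heart rate to the EEG timeline via the most recent sample.
    timeline = pd.merge_asof(eeg, hr, on="t")
    print(timeline)
    #      t  attention  heart_rate
    # 0  0.0         55         NaN
    # 1  1.0         60        72.0
    # 2  2.0         58        72.0
    # 3  3.0         70        75.0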
Real-time Multi-person Eyeblink Detection in the Wild for Untrimmed Video
Real-time eyeblink detection in the wild can serve a wide range of applications, such as fatigue detection, face anti-spoofing, and emotion analysis. Existing research efforts generally focus on single-person cases in trimmed videos. However, the multi-person scenario in untrimmed videos is also important for practical applications, and it has not yet been well addressed. To address this, we shed light on this research field for the first time with essential contributions to dataset, theory, and practice. In particular, a large-scale dataset termed MPEblink, comprising 686 untrimmed videos with 8,748 eyeblink events, is proposed under multi-person conditions. The samples are captured from unconstrained films to reflect "in the wild" characteristics. Meanwhile, a real-time multi-person eyeblink detection method is also proposed. Unlike existing counterparts, our method runs in a one-stage spatio-temporal manner with end-to-end learning capacity. Specifically, it simultaneously addresses the sub-tasks of face detection, face tracking, and human instance-level eyeblink detection. This paradigm holds two main advantages: (1) eyeblink features can be enriched via the face's global context (e.g., head pose and illumination conditions) with joint optimization and interaction, and (2) addressing these sub-tasks in parallel rather than sequentially saves considerable time, meeting the real-time running requirement. Experiments on MPEblink verify the essential challenges of real-time multi-person eyeblink detection in the wild for untrimmed video. Our method also outperforms existing approaches by large margins while maintaining a high inference speed.
Comment: Accepted by CVPR 2023
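As a hedged sketch of what instance-level eyeblink detection involves (distinct from the paper's one-stage, end-to-end architecture), the following associates per-frame face detections to identities by IoU matching and accumulates per-person blink scores. The IoU threshold and the score bookkeeping are illustrative assumptions.

    def iou(a, b):
        # Boxes as (x1, y1, x2, y2).
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        return inter / (area(a) + area(b) - inter + 1e-9)

    tracks = {}   # identity -> {"box": latest box, "blink_scores": [...]}
    next_id = 0

    def update(frame_detections, iou_thr=0.5):
        # frame_detections: list of (face_box, blink_score) for one frame.
        global next_id
        for box, score in frame_detections:
            best, best_iou = None, iou_thr
            for tid, tr in tracks.items():
                o = iou(box, tr["box"])
                if o > best_iou:
                    best, best_iou = tid, o
            if best is None:  # unmatched detection starts a new identity
                best, next_id = next_id, next_id + 1
                tracks[best] = {"box": box, "blink_scores": []}
            tracks[best]["box"] = box
            tracks[best]["blink_scores"].append(score)

    update([((10, 10, 50, 50), 0.9), ((200, 30, 240, 70), 0.1)])
    update([((12, 11, 52, 51), 0.8)])  # matched back to identity 0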
Can ADAS Distract Driver's Attention? An RGB-D Camera and Deep Learning-Based Analysis
Driver inattention is the primary cause of vehicle accidents; hence, manufacturers have introduced systems to support the driver and improve safety. Nonetheless, advanced driver assistance systems (ADAS) must be properly designed so that the feedback they provide does not become a potential source of distraction for the driver. In the present study, an experiment involving auditory and haptic ADAS has been conducted with 11 participants, whose attention has been monitored during their driving experience. An RGB-D camera has been used to acquire the drivers' face data. Subsequently, these images have been analyzed using a deep learning-based approach, i.e., a convolutional neural network (CNN) specifically trained to perform facial expression recognition (FER). Analyses have been carried out to assess possible relationships between these results and both ADAS activations and event occurrences, i.e., accidents. A correlation between attention and accidents emerged, whereas facial expressions and ADAS activations turned out to be uncorrelated; thus, no evidence has been found that the designed ADAS are a potential source of distraction. In addition to the experimental results, the proposed approach has proved to be an effective tool for monitoring the driver through non-invasive techniques.
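A minimal sketch of the kind of relationship analysis mentioned above: a windowed attention signal correlated against accident occurrences. The windowing, the use of scipy's pearsonr, and the stand-in data are assumptions about the analysis, not the authors' exact procedure.

    import numpy as np
    from scipy.stats import pearsonr

    # Per-window signals over a driving session (stand-in data):
    # attention[i] is the estimated attention in window i (0..1),
    # accidents[i] the number of accident events in that window.
    attention = np.array([0.9, 0.8, 0.4, 0.3, 0.7, 0.2])
    accidents = np.array([0, 0, 1, 2, 0, 1])

    r, p = pearsonr(attention, accidents)
    print(f"Pearson r = {r:.2f}, p = {p:.3f}")
    # A negative r indicates more accidents in low-attention windows.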
Face Image Quality Assessment: A Literature Survey
The performance of face analysis and recognition systems depends on the
quality of the acquired face data, which is influenced by numerous factors.
Automatically assessing the quality of face data in terms of biometric utility
can thus be useful to detect low-quality data and make decisions accordingly.
This survey provides an overview of the face image quality assessment literature, which predominantly focuses on visible-wavelength face image input. A trend towards deep learning-based methods is observed, including notable conceptual differences among the recent approaches, such as the integration of quality assessment into face recognition models. Besides image selection, face image quality assessment can also be used in a variety of other application scenarios, which are discussed herein. Open issues and challenges are pointed out, among others highlighting the importance of comparability for algorithm evaluations, and the challenge for future work to create deep learning approaches that are interpretable in addition to providing accurate utility predictions.
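A hedged sketch of the image-selection use case discussed above: gate face images on a scalar utility score before recognition. The sharpness-based quality_score below is a crude stand-in for a learned FIQA model from the survey, and the threshold is an assumption.

    import numpy as np

    def quality_score(face_image: np.ndarray) -> float:
        # Crude utility proxy: variance of a discrete Laplacian (sharpness),
        # squashed to (0, 1). Surveyed methods predict biometric utility
        # directly instead.
        lap = (np.roll(face_image, 1, 0) + np.roll(face_image, -1, 0)
               + np.roll(face_image, 1, 1) + np.roll(face_image, -1, 1)
               - 4 * face_image)
        return float(1.0 - np.exp(-lap.var()))

    def select_for_recognition(images, threshold=0.6):
        # Reject low-quality captures before enrollment or comparison.
        return [img for img in images if quality_score(img) >= threshold]

    sharp = np.random.default_rng(0).normal(size=(64, 64))
    flat = np.ones((64, 64))  # featureless image, near-zero sharpness
    print(len(select_for_recognition([sharp, flat])))  # -> 1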
Measuring Brain Activation Patterns from Raw Single-Channel EEG during Exergaming: A Pilot Study
Physical and cognitive rehabilitation is deemed crucial to attenuate symptoms and improve the quality of life of people with neurodegenerative disorders, such as Parkinson's Disease. Among rehabilitation strategies, a novel and popular approach relies on exergaming: the patient performs a motor or cognitive task within an interactive videogame in a virtual environment. These strategies may widely benefit from being tailored to the patient's needs and engagement patterns. In this pilot study, we investigated the ability of a low-cost BCI based on single-channel EEG to measure the user's engagement during an exergame. As a first step, healthy subjects were recruited to assess the system's capability to distinguish between (1) rest and gaming conditions and (2) gaming at different complexity levels, through supervised Machine Learning models. Both EEG and eye-blink features were employed. The results indicate the ability of the exergame to stimulate engagement and the capability of the supervised classification models to distinguish the resting stage from game-play (accuracy > 95%). Finally, different clusters of subject responses throughout the game were identified, which could help define models of engagement trends. This result is a starting point in developing an effectively subject-tailored exergaming system.
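A minimal sketch of the rest-versus-gaming classification step described above, assuming band-power features from single-channel EEG; the sampling rate, frequency bands, window length, and choice of SVM are illustrative assumptions, not the study's exact pipeline.

    import numpy as np
    from scipy.signal import welch
    from sklearn.svm import SVC

    FS = 256  # assumed EEG sampling rate (Hz)

    def band_power(x, lo, hi):
        # Mean spectral power of one EEG window in [lo, hi) Hz.
        f, pxx = welch(x, fs=FS, nperseg=FS)
        return pxx[(f >= lo) & (f < hi)].mean()

    def features(window):
        # Engagement-related bands: theta, alpha, beta.
        return [band_power(window, 4, 8),
                band_power(window, 8, 13),
                band_power(window, 13, 30)]

    # Stand-in data: 2-second windows labeled 0 = rest, 1 = gaming.
    rng = np.random.default_rng(0)
    windows = rng.normal(size=(100, 2 * FS))
    labels = rng.integers(0, 2, 100)

    X = np.array([features(w) for w in windows])
    clf = SVC().fit(X, labels)
    print("rest/gaming prediction:", clf.predict(X[:1]))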