10 research outputs found
DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis
Vision is the richest and most cost-effective technology for Driver
Monitoring Systems (DMS), especially after the recent success of Deep Learning
(DL) methods. The lack of sufficiently large and comprehensive datasets is
currently a bottleneck for the progress of DMS development, crucial for the
transition of automated driving from SAE Level-2 to SAE Level-3. In this paper,
we introduce the Driver Monitoring Dataset (DMD), an extensive dataset which
includes real and simulated driving scenarios: distraction, gaze allocation,
drowsiness, hands-wheel interaction and context data, in 41 hours of RGB, depth
and IR videos from 3 cameras capturing face, body and hands of 37 drivers. A
comparison with existing similar datasets is included, which shows the DMD is
more extensive, diverse, and multi-purpose. The usage of the DMD is illustrated
by extracting a subset of it, the dBehaviourMD dataset, containing 13
distraction activities, prepared to be used in DL training processes.
Furthermore, we propose a robust and real-time driver behaviour recognition
system targeting a real-world application that can run on cost-efficient
CPU-only platforms, based on the dBehaviourMD. Its performance is evaluated
with different types of fusion strategies, all of which achieve improved
accuracy while still providing real-time response.
Comment: Accepted to the ECCV 2020 workshop on Assistive Computer Vision and Robotics
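The abstract above mentions evaluating "different types of fusion strategies" across the face, body, and hand camera streams without specifying them. As a minimal sketch of one common, CPU-friendly option, score-level (late) fusion averages per-stream class probabilities; the function name and three-stream setup here are illustrative assumptions, not the paper's actual method.

```python
# Sketch only: score-level (late) fusion of per-stream class probabilities.
# The stream names (face, body, hands) follow the DMD camera setup; the
# fusion rule itself is an assumed, generic choice.

def late_fusion(stream_probs):
    """Average class-probability vectors from several streams and return
    (index of most likely activity, fused probability vector)."""
    n_classes = len(stream_probs[0])
    fused = [sum(p[i] for p in stream_probs) / len(stream_probs)
             for i in range(n_classes)]
    best = max(range(n_classes), key=fused.__getitem__)
    return best, fused

# Example: three streams voting over three distraction classes.
face  = [0.7, 0.2, 0.1]
body  = [0.2, 0.6, 0.2]
hands = [0.3, 0.5, 0.2]
idx, fused = late_fusion([face, body, hands])
```

Weighted averaging or a small learned classifier over concatenated scores are other common variants; the trade-off is accuracy versus the extra compute budget available on a CPU-only platform.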
Domain Adaptation for Head Pose Estimation Using Relative Pose Consistency
Head pose estimation plays a vital role in biometric systems related to facial and human behavior analysis. Typically, neural networks are trained on head pose datasets. Unfortunately, manual or sensor-based annotation of head pose is impractical. A solution is synthetic training data generated from 3D face models, which can provide an infinite number of perfect labels. However, computer-generated images only provide an approximation of real-world images, leading to a performance gap between the training and application domains. Therefore, there is a need for strategies that allow simultaneous learning on labeled synthetic data and unlabeled real-world data to overcome the domain gap. In this work, we propose relative pose consistency, a semi-supervised learning strategy for head pose estimation based on consistency regularization. Consistency regularization enforces consistent network predictions under random image augmentations, including pose-preserving and pose-altering augmentations. We propose a strategy to exploit the relative pose introduced by pose-altering augmentations between augmented image pairs, allowing the network to benefit from relative pose labels during training on unlabeled data. We evaluate our approach in a domain-adaptation scenario and in a commonly used cross-dataset scenario. Furthermore, we reproduce related works to enforce consistent evaluation protocols and show that for both scenarios we outperform the state of the art (SOTA).
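The two loss terms described above can be sketched in a few lines. This is an illustrative assumption of how the losses might look, not the paper's actual implementation: predictions are (yaw, pitch, roll) triples, a pose-preserving augmentation (e.g., color jitter) should leave all three angles unchanged, and an in-plane rotation by a known angle is taken as the pose-altering augmentation, shifting only the roll.

```python
# Sketch only: consistency-regularization losses for unlabeled images,
# under the assumption that predictions are (yaw, pitch, roll) in degrees
# and that the pose-altering augmentation is an in-plane rotation whose
# angle (and hence the relative roll) is known.

def consistency_loss(pred_a, pred_b):
    """Pose-preserving augmentation: predictions for two augmented views
    of the same image should agree (mean squared difference)."""
    return sum((a - b) ** 2 for a, b in zip(pred_a, pred_b)) / len(pred_a)

def relative_pose_loss(pred, pred_rotated, known_roll_delta):
    """Pose-altering augmentation: the predicted roll difference between
    the original and the rotated view should match the known rotation
    angle, giving a relative label without any ground-truth pose."""
    roll, roll_rot = pred[2], pred_rotated[2]
    return ((roll_rot - roll) - known_roll_delta) ** 2

# Example: a 15-degree in-plane rotation that the network tracks exactly
# yields zero relative-pose loss.
pred         = (10.0, -5.0, 2.0)
pred_rotated = (10.0, -5.0, 17.0)
loss = relative_pose_loss(pred, pred_rotated, 15.0)
```

In training, both terms would be computed on unlabeled real-world batches and added to a supervised loss on the synthetic data; the relative term is what lets pose-altering augmentations contribute a label signal.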
Deep Learning for Head Pose Estimation: A Survey
Head pose estimation (HPE) is an active and popular area of research. Over the years, many approaches have constantly been developed, leading to a progressive improvement in accuracy; nevertheless, head pose estimation remains an open research topic, especially in unconstrained environments. In this paper, we will review the increasing amount of available datasets and the modern methodologies used to estimate orientation, with special attention to deep learning techniques. We will discuss the evolution of the field by proposing a classification of head pose estimation methods, explaining their advantages and disadvantages, and highlighting the different ways deep learning techniques have been used in the context of HPE. An in-depth performance comparison and discussion is presented at the end of the work. We also highlight the most promising research directions for future investigations on the topic.