3 research outputs found
Dynamics of Driver's Gaze: Explorations in Behavior Modeling & Maneuver Prediction
The study and modeling of driver gaze dynamics are important because whether and
how the driver is monitoring the driving environment is vital for driver
assistance in manual mode, for take-over requests in highly automated mode, and
for semantic perception of the surroundings in fully autonomous mode. We developed
a machine-vision-based framework that classifies the driver's gaze into context-rich
zones of interest and models gaze behavior by representing gaze
dynamics over a time period using gaze accumulation, glance durations, and glance
frequencies. As a use case, we explore the driver's gaze dynamics
during maneuvers executed in freeway driving, namely left lane changes,
right lane changes, and lane keeping. We show that
condensing gaze dynamics into durations and frequencies yields recurring
patterns tied to driver activities. Furthermore, modeling these patterns provides
predictive power for maneuver detection up to a few hundred milliseconds a
priori.
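The representation above (gaze accumulation, glance durations, glance frequencies over a time window) could be sketched roughly as follows. This is a minimal illustration, not the paper's implementation; the zone labels, frame rate, and function name are assumptions.

```python
from itertools import groupby
from collections import defaultdict

def glance_statistics(zone_sequence, frame_rate=30.0):
    """Condense a per-frame gaze-zone sequence into per-zone glance
    statistics: mean glance duration, glance frequency, and gaze
    accumulation (total time spent in the zone) over the window."""
    # A "glance" is a maximal run of consecutive frames in one zone.
    run_durations = defaultdict(list)
    for zone, run in groupby(zone_sequence):
        run_durations[zone].append(len(list(run)) / frame_rate)

    window_seconds = len(zone_sequence) / frame_rate
    stats = {}
    for zone, runs in run_durations.items():
        stats[zone] = {
            "mean_glance_duration_s": sum(runs) / len(runs),
            "glance_frequency_hz": len(runs) / window_seconds,
            "gaze_accumulation_s": sum(runs),
        }
    return stats

# Example: 1 s on the road, a 0.5 s mirror glance, 1.5 s back on the road.
demo = glance_statistics(["road"] * 30 + ["left_mirror"] * 15 + ["road"] * 45)
```

Feature vectors built from such per-zone statistics are the kind of condensed input a maneuver classifier could consume.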
Driver Gaze Zone Estimation using Convolutional Neural Networks: A General Framework and Ablative Analysis
Driver gaze has been shown to be an excellent surrogate for driver attention
in intelligent vehicles. With the recent surge of highly autonomous vehicles,
driver gaze can be useful for determining the handoff time to a human driver.
While there has been significant improvement in personalized driver gaze zone
estimation systems, a generalized system which is invariant to different
subjects, perspectives and scales is still lacking. We take a step towards this
generalized system using Convolutional Neural Networks (CNNs). We finetune 4
popular CNN architectures for this task, and provide extensive comparisons of
their outputs. We additionally experiment with different input image patches,
and also examine how image size affects performance. For training and testing
the networks, we collect a large naturalistic driving dataset comprising 11
long drives, driven by 10 subjects in two different cars. Our best-performing
model achieves an accuracy of 95.18% during cross-subject testing,
outperforming current state-of-the-art techniques for this task. Finally, we
evaluate our best-performing model on the publicly available Columbia Gaze
Dataset, comprising images from 56 subjects with varying head poses and gaze
directions. Without any training, our model successfully encodes the different
gaze directions in this diverse dataset, demonstrating good generalization
capabilities.
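The cross-subject testing protocol mentioned above (training on some drivers, testing on a held-out driver) can be sketched as a leave-one-subject-out split. This is an illustrative sketch, not the paper's code; the sample dictionaries and function name are assumptions.

```python
def cross_subject_splits(samples):
    """Yield leave-one-subject-out splits: for each subject, the test set
    holds all of that subject's samples and the training set holds the rest.

    `samples` is a list of dicts, each with at least a "subject" key.
    Yields (held_out_subject, train_indices, test_indices) tuples.
    """
    subjects = sorted({s["subject"] for s in samples})
    for held_out in subjects:
        train = [i for i, s in enumerate(samples) if s["subject"] != held_out]
        test = [i for i, s in enumerate(samples) if s["subject"] == held_out]
        yield held_out, train, test
```

Evaluating a gaze-zone classifier this way measures generalization to unseen drivers rather than memorization of per-driver appearance.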
An Occluded Stacked Hourglass Approach to Facial Landmark Localization and Occlusion Estimation
A key step toward driver safety is observing the driver's activities, with the
face being central to extracting information such as head pose, blink rate,
yawning, and talking to a passenger, which in turn helps derive higher-level
information such as distraction, drowsiness, intent, and where the driver is
looking. In the context of driving safety, the system must not only perform
robust estimation under harsh lighting and occlusion but also detect when
occlusion occurs, so that information predicted from occluded parts of the face
can be weighted properly. This paper introduces the
Occluded Stacked Hourglass, based on the original Stacked Hourglass network
for body-pose joint estimation, which is retrained to process a
detected face window and output 68 occlusion heat maps, each corresponding to a
facial landmark. Landmark locations, occlusion levels, and a refined face-detection
score, used to reject false positives, are extracted from these heat maps.
From the facial landmark locations, features such as head pose and eye/mouth
openness can be extracted to derive driver attention and activity. The system
is evaluated for face detection, head pose, and occlusion estimation on various
in-the-wild datasets, both quantitatively and qualitatively, and shows
state-of-the-art results.
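Extracting a landmark location and an occlusion estimate from one heat map could look roughly like this: the peak position gives the landmark coordinate, and the peak value serves as a confidence score. This is a minimal sketch under assumed conventions (nested-list heat maps, an arbitrary occlusion threshold), not the paper's decoding procedure.

```python
def decode_heatmap(heatmap, occlusion_threshold=0.5):
    """Decode one landmark heat map (a 2-D grid of scores).

    Returns ((x, y), peak_value, occluded): the peak position as the
    landmark location, the peak value as a confidence score, and an
    occlusion flag raised when that confidence falls below a threshold.
    """
    best_val, best_xy = float("-inf"), (0, 0)
    for y, row in enumerate(heatmap):
        for x, v in enumerate(row):
            if v > best_val:
                best_val, best_xy = v, (x, y)
    occluded = best_val < occlusion_threshold
    return best_xy, best_val, occluded
```

Applied across all 68 landmark maps, the per-landmark peak values could also be pooled into a refined face-detection score of the kind the abstract describes.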