We present a discriminative approach to frame-by-frame head pose tracking that is robust to a wide range of illuminations and facial appearances and that is inherently immune to accuracy drift. Most previous research on head pose tracking has been validated on test datasets spanning only a small number (< 20) of subjects, recorded under controlled illumination conditions in continuous video sequences. In contrast, the system presented in this paper was both trained and tested on a much larger database, GENKI, which is drawn from images on the Web and spans tens of thousands of different subjects, illuminations, and geographical locations. Our pose estimator achieves root-mean-square (RMS) errors of 5.82°, 5.65°, and 2.96° for yaw, pitch, and roll, respectively. A set of 4000 images from this dataset, labeled for pose, was collected and released for use by the research community.
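As an illustration of the reported metric, a per-axis RMS error over a set of pose labels could be computed as in the minimal sketch below. The function name `rms_error_deg`, the (yaw, pitch, roll) tuple layout, and the wrapping of angle differences are assumptions made for this example, not the evaluation code used for the results above; the sample values are toy numbers, not GENKI data.

```python
import math

def rms_error_deg(predicted, labeled):
    """Per-axis RMS error between two equal-length lists of
    (yaw, pitch, roll) tuples, all given in degrees."""
    n = len(predicted)
    sums = [0.0, 0.0, 0.0]
    for p, t in zip(predicted, labeled):
        for axis in range(3):
            # Wrap the difference into [-180, 180) so that, e.g.,
            # 179 vs. -179 degrees counts as a 2 degree error.
            diff = (p[axis] - t[axis] + 180.0) % 360.0 - 180.0
            sums[axis] += diff * diff
    return tuple(math.sqrt(s / n) for s in sums)

# Toy values for illustration only.
predicted = [(10.0, -5.0, 2.0), (-20.0, 8.0, -1.0)]
labeled = [(13.0, -9.0, 2.0), (-24.0, 5.0, -1.0)]
yaw_rms, pitch_rms, roll_rms = rms_error_deg(predicted, labeled)
```

Reporting one value per axis, rather than a single pooled figure, matches how the yaw, pitch, and roll errors are stated separately.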