Our recent work suggests that, thanks to today's powerful CNNs, image-based
2D pose estimation is a promising cue for determining pedestrian intentions
such as crossing the road in the path of the ego-vehicle, stopping before
entering the road, and starting to walk or bending towards the road. This
claim rests on results obtained on non-naturalistic sequences (the Daimler
dataset), i.e., sequences choreographed specifically for the study.
Fortunately, a new publicly available dataset (JAAD) has appeared
recently, enabling the development of methods for detecting pedestrian
intentions in naturalistic driving conditions and, more specifically, for
addressing the relevant question: is the pedestrian going to cross?
Accordingly, in this paper we use
JAAD to assess the usefulness of 2D pose estimation for answering this
question. We combine CNN-based pedestrian detection, tracking and pose
estimation to predict the crossing action from monocular images. Overall, the
proposed pipeline provides new state-of-the-art results.

Comment: This paper was presented at the IEEE Intelligent Vehicles Symposium
(IEEE IV 2018).
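As a rough illustration of how per-frame 2D poses from a tracked pedestrian could feed a crossing classifier, the sketch below is a minimal example under stated assumptions, not the paper's implementation. It assumes detection, tracking and pose estimation have already run upstream and produced, for one pedestrian, a list of 2D keypoints per frame. The names `normalize_pose`, `window_features` and `predict_crossing`, the normalization scheme, and the toy classifier are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# One pose = list of 2D joint positions (x, y) in image pixels.
# Assumption: an upstream detector + tracker + pose estimator produced these.
Keypoints = List[Tuple[float, float]]


def normalize_pose(kps: Keypoints) -> Keypoints:
    """Center the pose on its centroid and scale by its vertical extent,
    so features do not depend on where or how large the pedestrian appears."""
    xs = [x for x, _ in kps]
    ys = [y for _, y in kps]
    cx = sum(xs) / len(xs)
    cy = sum(ys) / len(ys)
    height = (max(ys) - min(ys)) or 1.0  # avoid division by zero
    return [((x - cx) / height, (y - cy) / height) for x, y in kps]


def window_features(track: Sequence[Keypoints], window: int) -> List[float]:
    """Flatten the normalized poses of the last `window` frames of one track
    into a single feature vector."""
    if len(track) < window:
        raise ValueError("track is shorter than the requested window")
    features: List[float] = []
    for kps in track[-window:]:
        for x, y in normalize_pose(kps):
            features.extend((x, y))
    return features


def predict_crossing(
    track: Sequence[Keypoints],
    window: int,
    classifier: Callable[[List[float]], bool],
) -> bool:
    """Apply any binary classifier (hypothetical here) to the window features."""
    return classifier(window_features(track, window))


# Toy usage: a 3-frame track with 2 joints per pose and a placeholder classifier.
toy_track = [
    [(0.0, 0.0), (2.0, 4.0)],
    [(1.0, 1.0), (3.0, 5.0)],
    [(2.0, 2.0), (4.0, 6.0)],
]
toy_features = window_features(toy_track, window=2)
toy_decision = predict_crossing(toy_track, 2, lambda f: len(f) == 8)
```

In practice the placeholder classifier would be replaced by a model trained on labeled crossing / not-crossing tracks, and the window length would be tuned on the dataset.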