Top-down human pose estimation with depth images and domain adaptation

Abstract

This paper proposes a method for human pose estimation from Time-of-Flight (ToF) camera images. The method is top-down and builds on a YOLO-based object detector. In the first stage, a network detects the people in the image. In the second stage, a network estimates the joints of each person, using the image produced by the first stage. We show that a deep neural network trained from scratch on ToF images yields better results than a network pretrained on RGB data and retrained on ToF data. We also show that a top-down approach, which combines a person detector with a joint detector, works better than detecting body joints over the entire image.

This work is supported by: European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project no 002797; Funding Reference: POCI-01-0247-FEDER-002797].
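The two-stage, top-down flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `top_down_pose` and the two callables `detect_people` and `estimate_joints` are hypothetical placeholders for the person-detection and joint-estimation networks. The sketch also assumes that the second stage receives a crop of each detected bounding box and returns joints in crop coordinates, and it uses nested lists in place of a real depth image array.

```python
def top_down_pose(depth, detect_people, estimate_joints):
    """Sketch of a top-down pose pipeline on a single depth image.

    depth           -- depth image as a list of rows (hypothetical format)
    detect_people   -- stage 1 placeholder: image -> list of (x0, y0, x1, y1)
    estimate_joints -- stage 2 placeholder: crop -> list of (x, y) joints,
                       expressed in crop coordinates
    Returns one list of (x, y) joints per person, in full-image coordinates.
    """
    height = len(depth)
    width = len(depth[0]) if height else 0
    poses = []
    for x0, y0, x1, y1 in detect_people(depth):
        # Clamp the box to the image so the crop is always valid.
        x0, y0 = max(0, int(x0)), max(0, int(y0))
        x1, y1 = min(width, int(x1)), min(height, int(y1))
        if x1 <= x0 or y1 <= y0:
            continue  # skip empty or fully out-of-image boxes
        # Stage 2 only sees the region around one person.
        crop = [row[x0:x1] for row in depth[y0:y1]]
        joints = estimate_joints(crop)
        # Shift joints from crop coordinates back to image coordinates.
        poses.append([(jx + x0, jy + y0) for jx, jy in joints])
    return poses


# Toy stand-ins for the two networks (assumptions, for illustration only).
def fake_detector(image):
    return [(5, 2, 15, 8), (18, 0, 30, 4)]


def fake_joint_net(crop):
    # Report the top-left and bottom-right pixels of the crop as "joints".
    return [(0, 0), (len(crop[0]) - 1, len(crop) - 1)]


image = [[0.0] * 20 for _ in range(10)]
result = top_down_pose(image, fake_detector, fake_joint_net)
```

The alternative that the abstract compares against, detecting joints over the entire image, would skip the cropping loop and run a single joint detector on `depth` directly.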
