Benchmarking and Error Diagnosis in Multi-Instance Pose Estimation
We propose a new method to analyze the impact of errors in algorithms for
multi-instance pose estimation and a principled benchmark that can be used to
compare them. We define and characterize three classes of errors
(localization, scoring, and background) and study how they are influenced by
instance attributes and how they affect an algorithm's performance. Our
technique is applied to compare the two leading methods for human pose
estimation on the COCO Dataset and to measure the sensitivity of pose estimation with
respect to instance size, type and number of visible keypoints, clutter due to
multiple instances, and the relative score of instances. The performance of
algorithms, and the types of error they make, are highly dependent on all these
variables, but mostly on the number of keypoints and the clutter. The analysis
and software tools we propose offer a novel and insightful approach for
understanding the behavior of pose estimation algorithms and an effective
method for measuring their strengths and weaknesses.
Comment: Project page available at
http://www.vision.caltech.edu/~mronchi/projects/PoseErrorDiagnosis/; Code
available at https://github.com/matteorr/coco-analyze; published at ICCV 1
Anchor Loss: Modulating Loss Scale Based on Prediction Difficulty
We propose a novel loss function that dynamically re-scales the cross entropy based on how difficult a sample is to predict. Deep neural network architectures in image classification tasks struggle to disambiguate visually similar objects. Likewise, in human pose estimation, symmetric body parts often confuse the network, which then assigns indiscriminative scores to them. This is due to the output prediction, in which only the highest-confidence label is selected without taking a measure of uncertainty into consideration. In this work, we define the prediction difficulty as a relative property derived from the confidence score gap between positive and negative labels. More precisely, the proposed loss function penalizes the network so that the score of a false prediction does not become significant. To demonstrate the efficacy of our loss function, we evaluate it on two different domains: image classification and human pose estimation. We find improvements in both applications, achieving higher accuracy than the baseline methods.
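The abstract above does not state the loss formula, so the following is only a minimal sketch of the idea it describes: the true label's confidence acts as an anchor, and the penalty on each wrong label grows when its score approaches or exceeds that anchor. The function name `anchor_loss`, the modulating factor `(1 + q - anchor) ** gamma`, and the default `gamma` are assumptions made for illustration, not the authors' published definition.

```python
import math


def anchor_loss(probs, target, gamma=0.5, eps=1e-7):
    """Illustrative difficulty-modulated cross entropy (hypothetical form).

    probs  : list of independent per-class probabilities in [0, 1]
    target : index of the true class
    gamma  : strength of the modulation; gamma = 0 gives plain binary CE
    """
    anchor = probs[target]  # confidence the model gives to the true label
    loss = 0.0
    for k, q in enumerate(probs):
        q = min(max(q, eps), 1.0 - eps)  # clamp to keep log() finite
        if k == target:
            loss -= math.log(q)
        else:
            # Weight is > 1 when a wrong label outscores the true label,
            # and < 1 when the true label is comfortably ahead.
            weight = (1.0 + q - anchor) ** gamma
            loss -= weight * math.log(1.0 - q)
    return loss


easy = anchor_loss([0.9, 0.1], target=0)  # true label clearly ahead
hard = anchor_loss([0.4, 0.6], target=0)  # wrong label outscores true label
```

Under these assumptions, the hard sample is penalized more than plain cross entropy would penalize it, while the easy sample is penalized less.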
It's all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data
We address the problem of 3D human pose estimation from 2D input images using
only weakly supervised training data. Despite showing considerable success for
2D pose estimation, the application of supervised machine learning to 3D pose
estimation in real world images is currently hampered by the lack of varied
training images with corresponding 3D poses. Most existing 3D pose estimation
algorithms train on data that has either been collected in carefully controlled
studio settings or has been generated synthetically. Instead, we take a
different approach, and propose a 3D human pose estimation algorithm that only
requires relative estimates of depth at training time. Such a training signal,
although noisy, can be easily collected from crowd annotators, and is of
sufficient quality for enabling successful training and evaluation of 3D pose
algorithms. Our results are competitive with fully supervised regression based
approaches on the Human3.6M dataset, despite using significantly weaker
training data. Our proposed algorithm opens the door to using existing
widespread 2D datasets for 3D pose estimation by allowing fine-tuning with
noisy relative constraints, resulting in more accurate 3D poses.
Comment: BMVC 2018. Project page available at
http://www.vision.caltech.edu/~mronchi/projects/RelativePos
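Training from relative depth estimates, as described in the abstract above, can be illustrated with a pairwise ranking loss on predicted keypoint depths. The abstract does not give the loss, so this is only a sketch under stated assumptions: the function name `relative_depth_loss`, the label convention (`r = +1` if keypoint i is annotated as farther from the camera than keypoint j, `-1` if closer, `0` if roughly equal), and the softplus and squared-difference terms are all hypothetical choices.

```python
import math


def relative_depth_loss(z_i, z_j, r):
    """Illustrative ranking loss for one annotated keypoint pair.

    z_i, z_j : predicted depths of the two keypoints
    r        : +1 if i is annotated as farther than j, -1 if closer,
               0 if the annotator judged them to be at similar depth
    """
    diff = z_i - z_j
    if r == 0:
        # Similar depth: pull the two predictions together.
        return diff * diff
    # Ordered pair: log(1 + exp(-r * diff)), computed in a stable way.
    x = -r * diff
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


agree = relative_depth_loss(2.0, 1.0, r=+1)     # prediction matches label
disagree = relative_depth_loss(2.0, 1.0, r=-1)  # prediction contradicts label
```

A loss of this kind needs only the ordering of two keypoints, not their metric depth, which is the sort of label a crowd annotator can provide.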
A comparative analysis of pose estimation models as enablers for a smart-mirror physical rehabilitation system
Smart mirrors are gaining attention as smart devices that could integrate a set of functionalities intended to assist older adults in their day-to-day life. These devices are seamlessly integrated into the environment, providing a user-friendly interface and fitting naturally into daily-care routines. People face a mirror several times a day, which ensures that any application running on a smart mirror will have several guaranteed interactions per day. It is therefore essential to detect when the user is in front of the mirror and also to interpret what he or she is doing. Very powerful and accurate libraries are currently available, but the limited computational resources and the need to work in real time restrict the valid options for smart mirror devices. This paper therefore analyses and evaluates several body pose estimation models in order to determine which one can be deployed in a smart mirror-like device dedicated to supporting older adults in their physical rehabilitation routines.