28 research outputs found

    Benchmarking and Error Diagnosis in Multi-Instance Pose Estimation

    We propose a new method to analyze the impact of errors in algorithms for multi-instance pose estimation and a principled benchmark that can be used to compare them. We define and characterize three classes of errors (localization, scoring, and background) and study how they are influenced by instance attributes and how they affect an algorithm's performance. We apply our technique to compare the two leading methods for human pose estimation on the COCO Dataset and to measure the sensitivity of pose estimation to instance size, type and number of visible keypoints, clutter due to multiple instances, and the relative score of instances. The performance of algorithms, and the types of error they make, are highly dependent on all these variables, but mostly on the number of keypoints and the clutter. The analysis and software tools we propose offer a novel and insightful approach for understanding the behavior of pose estimation algorithms and an effective method for measuring their strengths and weaknesses.
    Comment: Project page available at http://www.vision.caltech.edu/~mronchi/projects/PoseErrorDiagnosis/; Code available at https://github.com/matteorr/coco-analyze; published at ICCV 1

    Anchor Loss: Modulating Loss Scale Based on Prediction Difficulty

    We propose a novel loss function that dynamically re-scales the cross entropy based on the prediction difficulty of a sample. Deep neural network architectures in image classification tasks struggle to disambiguate visually similar objects. Likewise, in human pose estimation, symmetric body parts often confuse the network, which assigns indiscriminative scores to them. This is due to the output prediction, in which only the highest confidence label is selected without taking into consideration a measure of uncertainty. In this work, we define the prediction difficulty as a relative property derived from the confidence score gap between positive and negative labels. More precisely, the proposed loss function penalizes the network so that the score of a false prediction does not become significant. To demonstrate the efficacy of our loss function, we evaluate it on two different domains: image classification and human pose estimation. We find improvements in both applications, achieving higher accuracy than the baseline methods.

    It's all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data

    We address the problem of 3D human pose estimation from 2D input images using only weakly supervised training data. Despite showing considerable success for 2D pose estimation, the application of supervised machine learning to 3D pose estimation in real world images is currently hampered by the lack of varied training images with corresponding 3D poses. Most existing 3D pose estimation algorithms train on data that has either been collected in carefully controlled studio settings or has been generated synthetically. Instead, we take a different approach and propose a 3D human pose estimation algorithm that only requires relative estimates of depth at training time. Such a training signal, although noisy, can be easily collected from crowd annotators, and is of sufficient quality to enable successful training and evaluation of 3D pose algorithms. Our results are competitive with fully supervised regression based approaches on the Human3.6M dataset, despite using significantly weaker training data. Our proposed algorithm opens the door to using existing widespread 2D datasets for 3D pose estimation by allowing fine-tuning with noisy relative constraints, resulting in more accurate 3D poses.
    Comment: BMVC 2018. Project page available at http://www.vision.caltech.edu/~mronchi/projects/RelativePos

    A comparative analysis of pose estimation models as enablers for a smart-mirror physical rehabilitation system

    Smart mirrors are gaining attention as smart devices that could integrate a set of functionalities intended to assist older adults in their day-to-day life. These devices are seamlessly integrated in the environment, providing a user-friendly interface and naturally fitting into daily-care routines. People face a mirror several times a day, which ensures that any application running on a smart mirror will have several guaranteed interactions per day. It is therefore essential to detect when the user is in front of the mirror and also to interpret what he or she is doing. Very powerful and accurate libraries are currently available, but the limited computational resources and the need to work in real time limit the valid options for smart mirror devices. This paper therefore analyses and evaluates several body pose estimation models in order to determine which one can be deployed in a smart mirror-like device dedicated to supporting older adults in their physical rehabilitation routines.