Multi-Person Pose Estimation with Local Joint-to-Person Associations
Despite the recent success of neural networks for human pose estimation,
current approaches are limited to pose estimation of a single person and cannot
handle humans in groups or crowds. In this work, we propose a method that
estimates the poses of multiple persons in an image in which a person can be
occluded by another person or might be truncated. To this end, we consider
multi-person pose estimation as a joint-to-person association problem. We
construct a fully connected graph from a set of detected joint candidates in an
image and resolve the joint-to-person association and outlier detection using
integer linear programming. Since solving joint-to-person association jointly
for all persons in an image is an NP-hard problem and even approximations are
expensive, we solve the problem locally for each person. On the challenging
MPII Human Pose Dataset for multiple persons, our approach achieves the
accuracy of a state-of-the-art method, but it is 6,000 to 19,000 times faster.
Comment: Accepted to European Conference on Computer Vision (ECCV) Workshops,
Crowd Understanding, 201
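To illustrate the idea of resolving joint-to-person association locally, the following is a minimal sketch for a single person. It is not the integer linear program described in the abstract: it swaps in a simple greedy rule, and the function name `associate_joints`, the `person_center` input, the `max_dist` threshold, and the scoring formula are all hypothetical choices made for illustration.

```python
from math import hypot


def associate_joints(person_center, candidates, max_dist):
    """Greedily pick at most one joint candidate per joint type for one person.

    person_center: (x, y) of the person, assumed to come from a detector.
    candidates: list of (joint_type, x, y, confidence) tuples.
    max_dist: candidates farther than this from the center are outliers.
    """
    cx, cy = person_center
    best = {}
    for joint_type, x, y, conf in candidates:
        dist = hypot(x - cx, y - cy)
        if dist > max_dist:
            continue  # treated as an outlier for this person
        # Hypothetical score: confidence discounted by distance to the center.
        score = conf * (1.0 - dist / max_dist)
        if joint_type not in best or score > best[joint_type][0]:
            best[joint_type] = (score, (x, y))
    # Drop the scores and return only the chosen locations.
    return {jt: xy for jt, (_, xy) in best.items()}
```

Because each person is handled independently, the cost grows with the number of candidates near that person rather than with all candidates in the image, which is the intuition behind solving the problem locally.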
DeeperCut: A Deeper, Stronger, and Faster Multi-Person Pose Estimation Model
The goal of this paper is to advance the state-of-the-art of articulated pose
estimation in scenes with multiple people. To that end we contribute on three
fronts. We propose (1) improved body part detectors that generate effective
bottom-up proposals for body parts; (2) novel image-conditioned pairwise terms
that allow the proposals to be assembled into a variable number of consistent body
part configurations; and (3) an incremental optimization strategy that explores
the search space more efficiently, leading both to better performance and
significant speed-up factors. Evaluation is done on two single-person and two
multi-person pose estimation benchmarks. The proposed approach significantly
outperforms the best known multi-person pose estimation results while demonstrating
competitive performance on the task of single-person pose estimation. Models
and code available at http://pose.mpi-inf.mpg.de
Comment: ECCV'16. High-res version at
https://www.d2.mpi-inf.mpg.de/sites/default/files/insafutdinov16arxiv.pd
Towards Accurate Multi-person Pose Estimation in the Wild
We propose a method for multi-person detection and 2-D pose estimation that
achieves state-of-the-art results on the challenging COCO keypoints task. It is a
simple, yet powerful, top-down approach consisting of two stages.
In the first stage, we predict the location and scale of boxes which are
likely to contain people; for this we use the Faster RCNN detector. In the
second stage, we estimate the keypoints of the person potentially contained in
each proposed bounding box. For each keypoint type we predict dense heatmaps
and offsets using a fully convolutional ResNet. To combine these outputs we
introduce a novel aggregation procedure to obtain highly localized keypoint
predictions. We also use a novel form of keypoint-based Non-Maximum-Suppression
(NMS), instead of the cruder box-level NMS, and a novel form of keypoint-based
confidence score estimation, instead of box-level scoring.
Trained on COCO data alone, our final system achieves average precision of
0.649 on the COCO test-dev set and 0.643 on the test-standard set, outperforming
the winner of the 2016 COCO keypoints challenge and other recent state-of-the-art methods.
Further, by using additional in-house labeled data we obtain an even higher
average precision of 0.685 on the test-dev set and 0.673 on the test-standard
set, more than 5% absolute improvement compared to the previous best performing
method on the same dataset.
Comment: Paper describing an improved version of the G-RMI entry to the 2016
COCO keypoints challenge (http://image-net.org/challenges/ilsvrc+coco2016).
Camera ready version to appear in the Proceedings of CVPR 201
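The second stage combines a dense heatmap with per-pixel offsets to localize each keypoint. The sketch below shows one plausible way to fuse the two, using a simple voting scheme; it is an assumption for illustration, not the aggregation procedure from the paper. The function name `aggregate_keypoint` and the input layout (nested lists, offsets stored as `(dx, dy)`) are hypothetical.

```python
def aggregate_keypoint(heatmap, offsets):
    """Fuse a heatmap and offset field for one keypoint type by voting.

    heatmap[y][x]: probability that pixel (x, y) lies near the keypoint.
    offsets[y][x]: (dx, dy) vector from pixel (x, y) toward the keypoint.
    Returns ((x, y), vote_mass) for the location with the most votes.
    """
    h, w = len(heatmap), len(heatmap[0])
    votes = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = offsets[y][x]
            tx, ty = round(x + dx), round(y + dy)
            # Each pixel casts a vote, weighted by its heatmap probability,
            # at the location its offset points to.
            if 0 <= tx < w and 0 <= ty < h:
                votes[ty][tx] += heatmap[y][x]
    mass, bx, by = max(
        (votes[y][x], x, y) for y in range(h) for x in range(w)
    )
    return (bx, by), mass
```

The point of such a scheme is that many pixels around the keypoint agree on a single location, so the accumulated map is sharper than the raw heatmap.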
Activity-conditioned continuous human pose estimation for performance analysis of athletes using the example of swimming
In this paper we consider the problem of human pose estimation in real-world
videos of swimmers. Swimming channels allow filming swimmers simultaneously
above and below the water surface with a single stationary camera. These
recordings can be used to quantitatively assess the athletes' performance. The
quantitative evaluation, so far, requires manual annotations of body parts in
each video frame. We therefore apply the concept of CNNs in order to
automatically infer the required pose information. Starting with an
off-the-shelf architecture, we develop extensions to leverage activity
information - in our case the swimming style of an athlete - and the continuous
nature of the video recordings. Our main contributions are threefold: (a) We
apply and evaluate a fine-tuned Convolutional Pose Machine architecture as a
baseline in our very challenging aquatic environment and discuss its error
modes, (b) we propose an extension to input swimming style information into the
fully convolutional architecture and (c) modify the architecture for continuous
pose estimation in videos. With these additions we achieve reliable pose
estimates with up to 16% more correct body joint detections compared to the
baseline architecture.
Comment: 10 pages, 9 figures, accepted at WACV 201