Multi-Person Pose Estimation with Local Joint-to-Person Associations
Despite the recent success of neural networks for human pose estimation,
current approaches are limited to pose estimation of a single person and cannot
handle humans in groups or crowds. In this work, we propose a method that
estimates the poses of multiple persons in an image, where a person may be
occluded by another person or truncated. To this end, we consider
multi-person pose estimation as a joint-to-person association problem. We
construct a fully connected graph from a set of detected joint candidates in an
image and resolve the joint-to-person association and outlier detection using
integer linear programming. Since solving joint-to-person association jointly
for all persons in an image is an NP-hard problem and even approximations are
expensive, we solve the problem locally for each person. On the challenging
MPII Human Pose Dataset for multiple persons, our approach matches the
accuracy of a state-of-the-art method while being 6,000 to 19,000 times faster.
Comment: Accepted to European Conference on Computer Vision (ECCV) Workshops, Crowd Understanding, 2016
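To make the association step concrete, here is a minimal sketch of the local, per-person variant with unary confidence terms only (the paper additionally uses pairwise terms); the function name, PuLP as the ILP solver, and the score-matrix layout are illustrative assumptions, not the authors' code.

```python
import numpy as np
import pulp

def associate_joints(scores):
    """Assign joint candidates to the body joints of a single person.

    scores: (D, J) array; scores[d, j] is the confidence that detected
    candidate d is body joint j of this person. Candidates left unassigned
    are treated as outliers (e.g. joints of another, occluding person).
    """
    D, J = scores.shape
    prob = pulp.LpProblem("joint_to_person", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("x", (range(D), range(J)), cat="Binary")
    # Maximize the total confidence of the chosen assignment.
    prob += pulp.lpSum(scores[d, j] * x[d][j] for d in range(D) for j in range(J))
    # Each body joint is explained by at most one candidate (it may stay
    # empty under occlusion or truncation) ...
    for j in range(J):
        prob += pulp.lpSum(x[d][j] for d in range(D)) <= 1
    # ... and each candidate accounts for at most one body joint.
    for d in range(D):
        prob += pulp.lpSum(x[d][j] for j in range(J)) <= 1
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return {j: d for d in range(D) for j in range(J) if x[d][j].value() > 0.5}
```

Solving this small problem once per person detection, rather than one global ILP over all persons, is what keeps inference tractable.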
Self Adversarial Training for Human Pose Estimation
This paper presents a deep learning based approach to the problem of human
pose estimation. We employ generative adversarial networks as our learning
paradigm in which we set up two stacked hourglass networks with the same
architecture, one as the generator and the other as the discriminator. The
generator is used as a human pose estimator after the training is done. The
discriminator distinguishes ground-truth heatmaps from generated ones, and
back-propagates the adversarial loss to the generator. This process enables the
generator to learn plausible human body configurations and is shown to be
useful for improving the prediction accuracy.
Comment: CVPR 2017 Workshop on Visual Understanding of Humans in Crowd Scene and the 1st Look Into Person (LIP) Challenge
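A minimal sketch of this adversarial setup, assuming PyTorch, is shown below; StackedHourglass is a stand-in single-layer module, and the optimizer settings and adversarial weight lambda_adv are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StackedHourglass(nn.Module):
    """Placeholder: a real stacked hourglass has repeated encoder-decoder
    modules with intermediate supervision; one conv keeps the sketch runnable."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.net = nn.Conv2d(in_ch, out_ch, 3, padding=1)
    def forward(self, x):
        return self.net(x)

G = StackedHourglass(3, 16)   # generator: image -> 16 joint heatmaps
D = StackedHourglass(16, 1)   # discriminator: heatmaps -> per-pixel real/fake score
opt_g = torch.optim.Adam(G.parameters(), lr=2.5e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2.5e-4)
lambda_adv = 0.01             # assumed weight of the adversarial term

def train_step(img, gt_heatmaps):
    # Discriminator step: ground-truth heatmaps are "real", generated are "fake".
    fake = G(img).detach()
    d_real, d_fake = D(gt_heatmaps), D(fake)
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator step: heatmap regression plus the adversarial loss that is
    # back-propagated from the discriminator, pushing predictions toward
    # plausible body configurations.
    pred = G(img)
    d_pred = D(pred)
    g_loss = (F.mse_loss(pred, gt_heatmaps)
              + lambda_adv * F.binary_cross_entropy_with_logits(d_pred, torch.ones_like(d_pred)))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

After training, only G is kept and used as the pose estimator.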
Holistic, Instance-Level Human Parsing
Object parsing -- the task of decomposing an object into its semantic parts
-- has traditionally been formulated as a category-level segmentation problem.
Consequently, when there are multiple objects in an image, current methods
cannot count the number of objects in the scene, nor can they determine which
part belongs to which object. We address this problem by segmenting the parts
of objects at an instance-level, such that each pixel in the image is assigned
a part label, as well as the identity of the object it belongs to. Moreover, we
show how this approach also yields segmentations at coarser granularities. Our
proposed network is trained end-to-end given
detections, and begins with a category-level segmentation module. Thereafter, a
differentiable Conditional Random Field, defined over a variable number of
instances for every input image, reasons about the identity of each part by
associating it with a human detection. In contrast to other approaches, our
method can handle the varying number of people in each image and our holistic
network produces state-of-the-art results in instance-level part and human
segmentation, together with competitive results in category-level part
segmentation, all achieved by a single forward pass through our neural network.
Comment: Poster at BMVC 2017
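The instance-association idea can be sketched as follows, with a plain per-pixel argmax over detection unaries standing in for the paper's differentiable CRF inference; the function name and the box-based unary term are illustrative assumptions.

```python
import numpy as np

def assign_instances(part_labels, boxes, det_scores):
    """part_labels: (H, W) ints, 0 = background, >0 = part category.
    boxes: (N, 4) person detections (x1, y1, x2, y2); det_scores: (N,)."""
    H, W = part_labels.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Unary term: a detection can only explain pixels inside its box,
    # weighted by its detection score.
    inside = ((xs[None] >= boxes[:, 0, None, None]) &
              (ys[None] >= boxes[:, 1, None, None]) &
              (xs[None] <= boxes[:, 2, None, None]) &
              (ys[None] <= boxes[:, 3, None, None]))
    unary = inside * det_scores[:, None, None]
    instance = unary.argmax(axis=0) + 1      # 1-based person identity per pixel
    instance[~inside.any(axis=0)] = 0        # pixels covered by no detection
    instance[part_labels == 0] = 0           # background keeps no identity
    return instance                          # (H, W) instance map
```

Because the number of detections N varies per image, the output naturally handles a varying number of people, which is the property the CRF formulation preserves end to end.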
DeeperCut: A Deeper, Stronger, and Faster Multi-Person Pose Estimation Model
The goal of this paper is to advance the state-of-the-art of articulated pose
estimation in scenes with multiple people. To that end we contribute on three
fronts. We propose (1) improved body part detectors that generate effective
bottom-up proposals for body parts; (2) novel image-conditioned pairwise terms
that allow the proposals to be assembled into a variable number of consistent body
part configurations; and (3) an incremental optimization strategy that explores
the search space more efficiently, thus leading both to better performance and
significant speed-up factors. Evaluation is done on two single-person and two
multi-person pose estimation benchmarks. The proposed approach significantly
outperforms best known multi-person pose estimation results while demonstrating
competitive performance on the task of single person pose estimation. Models
and code are available at http://pose.mpi-inf.mpg.de
Comment: ECCV'16. High-res version at https://www.d2.mpi-inf.mpg.de/sites/default/files/insafutdinov16arxiv.pdf
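The incremental optimization strategy (contribution 3) can be outlined roughly as below; solve_subproblem is a placeholder for the paper's graph-partitioning/ILP solver, and the staging into part subsets is an assumed example, not the authors' exact schedule.

```python
def incremental_inference(detections, stages, solve_subproblem):
    """detections: joint candidates, each with a .part_type attribute.
    stages: ordered list of sets of part types, e.g.
        [{"head", "neck"}, {"shoulder", "elbow"}, {"wrist", "hip", "knee", "ankle"}]
    solve_subproblem: placeholder for a partitioning/ILP solver that takes the
        active candidates and the frozen solution from earlier stages."""
    partial_solution = None
    active_parts = set()
    for stage_parts in stages:
        active_parts |= stage_parts
        candidates = [d for d in detections if d.part_type in active_parts]
        # Decisions from earlier stages stay fixed, so each stage only resolves
        # the newly added variables; this shrinking of the per-stage search
        # space is where the reported speed-ups come from.
        partial_solution = solve_subproblem(candidates, fixed=partial_solution)
    return partial_solution
```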
MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network
In this paper, we present MultiPoseNet, a novel bottom-up multi-person pose
estimation architecture that combines a multi-task model with a novel
assignment method. MultiPoseNet can jointly handle person detection, keypoint
detection, person segmentation and pose estimation problems. The novel
assignment method is implemented by the Pose Residual Network (PRN) which
receives keypoint and person detections, and produces accurate poses by
assigning keypoints to person instances. On the COCO keypoints dataset, our
pose estimation method outperforms all previous bottom-up methods both in
accuracy (+4-point mAP over previous best result) and speed; it also performs
on par with the best top-down methods while being at least 4x faster. Our
method is the fastest real-time system, running at 23 frames/sec. Source code is
available at: https://github.com/mkocabas/pose-residual-network
Comment: to appear in ECCV 2018
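A rough sketch of a PRN-style assignment module, assuming PyTorch, is given below; the crop size, hidden width, and number of joints are assumptions, not necessarily the paper's configuration.

```python
import torch
import torch.nn as nn

class PoseResidualNet(nn.Module):
    """Refines keypoint heatmaps cropped to one person's bounding box so that
    only the keypoints belonging to that person survive."""
    def __init__(self, num_joints=17, height=36, width=56, hidden=1024):
        super().__init__()
        n = num_joints * height * width
        self.mlp = nn.Sequential(nn.Linear(n, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n))
    def forward(self, cropped):              # (B, J, H, W) cropped heatmaps
        B, J, H, W = cropped.shape
        x = cropped.flatten(1)
        out = x + self.mlp(x)                # residual refinement
        # Per-joint spatial softmax: one dominant peak per joint inside the box.
        return out.view(B, J, H * W).softmax(-1).view(B, J, H, W)
```

Running this tiny network once per detected person is cheap, which is consistent with the speed advantage over methods that run a full pose network per detection.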
JPPF: Multi-task Fusion for Consistent Panoptic-Part Segmentation
Part-aware panoptic segmentation is a problem of computer vision that aims to
provide a semantic understanding of the scene at multiple levels of
granularity. More precisely, semantic areas, object instances, and semantic
parts are predicted simultaneously. In this paper, we present our Joint
Panoptic Part Fusion (JPPF) that combines the three individual segmentations
effectively to obtain a panoptic-part segmentation. Two aspects are of utmost
importance for this: First, a unified model for the three problems is desired,
allowing for mutually improved and consistent representation learning.
Second, the combination must be balanced so that equal importance is given to
all individual results during fusion. Our proposed JPPF is parameter-free and
dynamically balances its input. The method is evaluated and compared on the
Cityscapes Panoptic Parts (CPP) and Pascal Panoptic Parts (PPP) datasets in
terms of PartPQ and Part-Whole Quality (PWQ). In extensive experiments, we
verify the importance of our fair fusion, highlight its most significant impact
for areas that can be further segmented into parts, and demonstrate the
generalization capabilities of our design, without any fine-tuning, on 5
additional datasets.
Comment: Accepted for Springer Nature Computer Science. arXiv admin note: substantial text overlap with arXiv:2212.0767
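To illustrate the flavor of a parameter-free, equally weighted fusion, here is a hedged sketch; it assumes the three branches score a shared label space, which simplifies away the semantics/instance/part label mapping the actual method must handle.

```python
import numpy as np

def fuse(p_semantic, p_instance, p_part):
    """Each input: (C, H, W) per-pixel scores over a shared label space (an
    assumption of this sketch). The fusion itself has no learned parameters."""
    def normalize(p):
        # Shift to non-negative and rescale to a per-pixel distribution so
        # that no branch dominates purely through its output magnitude.
        p = p - p.min(axis=0, keepdims=True)
        return p / (p.sum(axis=0, keepdims=True) + 1e-8)
    fused = (normalize(p_semantic) + normalize(p_instance) + normalize(p_part)) / 3.0
    return fused.argmax(axis=0)  # (H, W) fused panoptic-part labels
```

The per-pixel normalization is what "dynamically balances" the inputs: each branch contributes a distribution rather than raw scores, so the three predictions enter the average with equal weight.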