3,063 research outputs found

VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera

Author: Casas Dan
Mehta Dushyant
Rhodin Helge
Seidel Hans-Peter
Shafiei Mohammad
Sotnychenko Oleksandr
Sridhar Srinath
Theobalt Christian
Xu Weipeng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017

We present the first real-time method to capture the full global 3D skeletal pose of a human in a stable, temporally consistent manner using a single RGB camera. Our method combines a new convolutional neural network (CNN) based pose regressor with kinematic skeleton fitting. Our novel fully-convolutional pose formulation regresses 2D and 3D joint positions jointly in real time and does not require tightly cropped input frames. A real-time kinematic skeleton fitting method uses the CNN output to yield temporally stable 3D global pose reconstructions on the basis of a coherent kinematic skeleton. This makes our approach the first monocular RGB method usable in real-time applications such as 3D character control; thus far, the only monocular methods for such applications employed specialized RGB-D cameras. Our method's accuracy is quantitatively on par with the best offline 3D monocular RGB pose estimation methods. Our results are qualitatively comparable to, and sometimes better than, results from monocular RGB-D approaches, such as the Kinect. However, we show that our approach is more broadly applicable than RGB-D solutions, i.e., it works for outdoor scenes, community videos, and low-quality commodity RGB cameras. Comment: Accepted to SIGGRAPH 201

arXiv.org e-Print Archive

MPG.PuRe

A fast algorithm for vision-based hand gesture recognition for robot control

Author: Çetin Müjdat
Malima Asanterabi Kighoma
Özgür Erol
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

We propose a fast algorithm for automatically recognizing a limited set of gestures from hand images for a robot control application. Hand gesture recognition is a challenging problem in its general form. We consider a fixed set of manual commands and a reasonably structured environment, and develop a simple, yet effective, procedure for gesture recognition. Our approach contains steps for segmenting the hand region, locating the fingers, and finally classifying the gesture. The algorithm is invariant to translation, rotation, and scale of the hand. We demonstrate the effectiveness of the technique on real imagery.

CiteSeerX

Sabanci University Research Database

Bidirectional Conditional Generative Adversarial Networks

Author: AbdAlmageed Wael
Jaiswal Ayush
Natarajan Premkumar
Wu Yue
Publication date: 03/11/2018

Conditional Generative Adversarial Networks (cGANs) are generative models that can produce data samples (x) conditioned on both latent variables (z) and known auxiliary information (c). We propose the Bidirectional cGAN (BiCoGAN), which effectively disentangles z and c in the generation process and provides an encoder that learns inverse mappings from x to both z and c, trained jointly with the generator and the discriminator. We present crucial techniques for training BiCoGANs, which involve an extrinsic factor loss along with an associated dynamically-tuned importance weight. As compared to other encoder-based cGANs, BiCoGANs encode c more accurately, and utilize z and c more effectively and in a more disentangled way to generate samples. Comment: To appear in Proceedings of ACCV 201

arXiv.org e-Print Archive

Crossref

Human Perambulation as a Self Calibrating Biometric

Author: Carter John
Goffredo Michela
Nixon Mark
Pearce Daniel
Spencer Nicholas
Publication venue: Springer Berlin / Heidelberg
Publication date: 01/01/2007

This paper introduces a novel method of single-camera gait reconstruction which is independent of the walking direction and of the camera parameters. Recognizing people by gait has unique advantages with respect to other biometric techniques: the identification of the walking subject is completely unobtrusive and can be achieved at a distance. Recently, much research has been conducted into the recognition of fronto-parallel gait. The proposed method relies on the very nature of walking to achieve independence from walking direction. Three major assumptions have been made: human gait is cyclic; the distances between the bone joints are invariant during the execution of the movement; and the articulated leg motion is approximately planar, since almost all of the perceived motion is contained within a single limb swing plane. The method has been tested on several subjects walking freely along six different directions in a small enclosed area. The results show that recognition can be achieved without calibration and without dependence on view direction. The obtained results are particularly encouraging for future system development and for its application in real surveillance scenarios.

Southampton (e-Prints Soton)

Archivio della Ricerca - Università di Roma 3

Convolutional Networks for Object Category and 3D Pose Estimation from 2D Images

Author: J Wu
K He
M Everingham
Y Wang
Publication date: 20/07/2018

Current CNN-based algorithms for recovering the 3D pose of an object in an image assume knowledge about both the object category and its 2D localization in the image. In this paper, we relax one of these constraints and propose to solve the task of joint object category and 3D pose estimation from an image assuming known 2D localization. We design a new architecture for this task composed of a feature network that is shared between subtasks, an object categorization network built on top of the feature network, and a collection of category-dependent pose regression networks. We also introduce suitable loss functions and a training method for the new architecture. Experiments on the challenging PASCAL3D+ dataset show state-of-the-art performance in the joint categorization and pose estimation task. Moreover, our performance on the joint task is comparable to the performance of state-of-the-art methods on the simpler task of 3D pose estimation with known object category.

arXiv.org e-Print Archive

Crossref

Everybody Dance Now

Author: Chan Caroline
Efros Alexei A.
Ginosar Shiry
Zhou Tinghui
Publication date: 27/08/2019

This paper presents a simple method for "do as I do" motion transfer: given a source video of a person dancing, we can transfer that performance to a novel (amateur) target after only a few minutes of the target subject performing standard moves. We approach this problem as video-to-video translation using pose as an intermediate representation. To transfer the motion, we extract poses from the source subject and apply the learned pose-to-appearance mapping to generate the target subject. We predict two consecutive frames for temporally coherent video results and introduce a separate pipeline for realistic face synthesis. Although our method is quite simple, it produces surprisingly compelling results (see video). This motivates us to also provide a forensics tool for reliable synthetic content detection, which is able to distinguish videos synthesized by our system from real data. In addition, we release a first-of-its-kind open-source dataset of videos that can be legally used for training and motion transfer. Comment: In ICCV 201

arXiv.org e-Print Archive

Crossref