Markerless Motion Capture in the Crowd
This work uses crowdsourcing to obtain motion capture data from video
recordings. The data is obtained by information workers who click repeatedly to
indicate body configurations in the frames of a video, resulting in a model of
2D structure over time. We discuss techniques to optimize the tracking task and
strategies for maximizing accuracy and efficiency. We show visualizations of a
variety of motions captured with our pipeline, then apply reconstruction
techniques to derive 3D structure.
Comment: Presented at Collective Intelligence conference, 2012 (arXiv:1204.2991)
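The core aggregation step (many workers clicking the same joint in the same frame) can be sketched as a coordinate-wise median consensus. This is a minimal illustration, not the paper's implementation; all function names here are hypothetical.

```python
# Hypothetical sketch: aggregate crowd workers' clicks into a 2D joint
# trajectory. Each worker clicks an (x, y) position for a joint in each
# frame; the per-coordinate median is a simple consensus that is robust
# to a minority of careless clicks. Names are illustrative only.
from statistics import median

def aggregate_clicks(clicks):
    """clicks: list of (x, y) tuples from different workers for one joint
    in one frame. Returns the coordinate-wise median."""
    xs = [c[0] for c in clicks]
    ys = [c[1] for c in clicks]
    return (median(xs), median(ys))

def track_joint(frames_of_clicks):
    """frames_of_clicks: list over frames, each a list of worker clicks.
    Returns the consensus 2D trajectory of one joint over time."""
    return [aggregate_clicks(clicks) for clicks in frames_of_clicks]

# One wild outlier click barely moves the median estimate.
frames = [
    [(10, 20), (11, 19), (50, 90)],   # frame 0: third worker misclicked
    [(12, 21), (13, 22), (12, 20)],   # frame 1
]
trajectory = track_joint(frames)
```

Running the per-frame median over all joints yields the "model of 2D structure over time" the abstract refers to.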
GANerated Hands for Real-time 3D Hand Tracking from Monocular RGB
We address the highly challenging problem of real-time 3D hand tracking based
on a monocular RGB-only sequence. Our tracking method combines a convolutional
neural network with a kinematic 3D hand model, such that it generalizes well to
unseen data, is robust to occlusions and varying camera viewpoints, and leads
to anatomically plausible as well as temporally smooth hand motions. For
training our CNN we propose a novel approach for the synthetic generation of
training data that is based on a geometrically consistent image-to-image
translation network. More specifically, we use a neural network that
translates synthetic images into "real" images, so that the generated
images follow the same statistical distribution as real-world hand images. For
training this translation network we combine an adversarial loss and a
cycle-consistency loss with a geometric consistency loss in order to preserve
geometric properties (such as hand pose) during translation. We demonstrate
that our hand tracking system outperforms the current state-of-the-art on
challenging RGB-only footage.
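The composite objective described above (adversarial + cycle-consistency + geometric-consistency terms) can be sketched numerically. This is a toy stand-in: plain lists replace images and keypoints, and the weights and function names are assumptions, not values from the paper.

```python
# Illustrative sketch of the combined translation-network objective the
# abstract describes: an adversarial term, a cycle-consistency term, and
# a geometric-consistency term that penalizes hand-pose drift during the
# synthetic-to-real translation. All names and weights are assumptions.

def l1(a, b):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_loss(real, reconstructed):
    # || G_back(G_fwd(real)) - real ||_1 : translating there and back
    # should reproduce the input.
    return l1(real, reconstructed)

def geo_loss(pose_before, pose_after):
    # Keypoint positions should survive the image translation.
    return l1(pose_before, pose_after)

def total_loss(adv, real, reconstructed, pose_before, pose_after,
               w_cyc=10.0, w_geo=1.0):
    """Weighted sum of the three terms (weights are illustrative)."""
    return (adv + w_cyc * cycle_loss(real, reconstructed)
                + w_geo * geo_loss(pose_before, pose_after))

loss = total_loss(adv=0.5,
                  real=[1.0, 2.0], reconstructed=[1.0, 2.0],  # perfect cycle
                  pose_before=[0.0, 1.0], pose_after=[0.0, 1.5])
```

With a perfect cycle, the residual loss comes only from the adversarial term and the pose drift, which is exactly the behaviour the geometric-consistency term is meant to expose.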
A decision forest based feature selection framework for action recognition from RGB-Depth cameras
In this paper, we present an action recognition framework
leveraging data mining capabilities of random decision forests trained on
kinematic features. We describe human motion via a rich collection of
kinematic feature time-series computed from the skeletal representation
of the body in motion. We discriminatively optimize a random decision
forest model over this collection to identify the most effective subset
of features, localized both in time and space. Later, we train a support
vector machine classifier on the selected features. This approach improves
upon the baseline performance obtained using the whole feature set while
using significantly fewer features (one tenth of the original). On the
MSRC-12 dataset (12 classes), our method achieves 94% accuracy. On
the WorkoutSU-10 dataset, collected by our group (10 physical exercise
classes), the accuracy is 98%. The approach can also be used to provide
insights into the spatiotemporal dynamics of human actions.
Hand Keypoint Detection in Single Images using Multiview Bootstrapping
We present an approach that uses a multi-camera system to train fine-grained
detectors for keypoints that are prone to occlusion, such as the joints of a
hand. We call this procedure multiview bootstrapping: first, an initial
keypoint detector is used to produce noisy labels in multiple views of the
hand. The noisy detections are then triangulated in 3D using multiview geometry
or marked as outliers. Finally, the reprojected triangulations are used as new
labeled training data to improve the detector. We repeat this process,
generating more labeled data in each iteration. We derive a result analytically
relating the minimum number of views to achieve target true and false positive
rates for a given detector. The method is used to train a hand keypoint
detector for single images. The resulting keypoint detector runs in realtime on
RGB images and has accuracy comparable to methods that use depth sensors. The
single view detector, triangulated over multiple views, enables 3D markerless
hand motion capture with complex object interactions.
Comment: CVPR 201
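One bootstrapping round (triangulate noisy detections, mark disagreeing views as outliers, keep the rest as new labels) can be sketched under a deliberately simplified camera model: three orthographic views that each observe two of the three coordinates of a keypoint. Real multiview bootstrapping uses projective triangulation; the names, views, and threshold below are illustrative assumptions.

```python
# Toy sketch of one multiview-bootstrapping round. Each orthographic
# "camera" observes two axes of a 3D keypoint; we triangulate from every
# pair of views (a RANSAC-like sweep), keep the hypothesis with the most
# inlier views, and those inliers become new training labels.
from itertools import combinations

VIEWS = {              # view name -> which 3D axes it observes
    "front": (0, 1),   # sees (x, y)
    "top":   (0, 2),   # sees (x, z)
    "side":  (1, 2),   # sees (y, z)
}

def triangulate(detections):
    """Least-squares 3D point: average every observation of each axis."""
    sums, counts = [0.0] * 3, [0] * 3
    for view, (u, v) in detections.items():
        for axis, obs in zip(VIEWS[view], (u, v)):
            sums[axis] += obs
            counts[axis] += 1
    return tuple(s / c for s, c in zip(sums, counts))

def reproject_error(view, det, point):
    a, b = VIEWS[view]
    return max(abs(det[0] - point[a]), abs(det[1] - point[b]))

def bootstrap_labels(detections, threshold=0.5):
    """Try each view pair; return the 3D point whose reprojection agrees
    with the most views, plus those inlier detections as new labels."""
    best = (None, {})
    for pair in combinations(detections, 2):
        point = triangulate({v: detections[v] for v in pair})
        inliers = {v: d for v, d in detections.items()
                   if reproject_error(v, d, point) <= threshold}
        if len(inliers) > len(best[1]):
            best = (point, inliers)
    return best

# "side" produced a bad detection; it should be rejected as an outlier.
dets = {"front": (1.0, 2.0), "top": (1.0, 3.0), "side": (9.0, 9.0)}
point, inliers = bootstrap_labels(dets)
```

Retraining the detector on the inlier reprojections and repeating is what generates progressively more labeled data per iteration.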
Real-Time Human Motion Capture with Multiple Depth Cameras
Commonly used human motion capture systems require intrusive attachment of
markers that are visually tracked with multiple cameras. In this work we
present an efficient and inexpensive solution to markerless motion capture
using only a few Kinect sensors. Unlike previous work on 3D pose estimation
using a single depth camera, we relax constraints on the camera location and do
not assume a co-operative user. We apply recent image segmentation techniques
to depth images and use curriculum learning to train our system on purely
synthetic data. Our method accurately localizes body parts without requiring an
explicit shape model. The body joint locations are then recovered by combining
evidence from multiple views in real-time. We also introduce a dataset of ~6
million synthetic depth frames for pose estimation from multiple cameras and
exceed state-of-the-art results on the Berkeley MHAD dataset.
Comment: Accepted to Computer Robot Vision 201
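The final step, combining per-view evidence into a single joint location, can be sketched as a confidence-weighted average of the 3D hypotheses from each depth camera. This is one simple fusion rule offered as an assumption; the paper does not prescribe this exact formula, and the names are illustrative.

```python
# Minimal sketch of multi-view evidence combination: each depth camera
# contributes a 3D joint hypothesis with a confidence score, and the
# fused estimate is the confidence-weighted average. Illustrative only.

def fuse_joint(hypotheses):
    """hypotheses: list of ((x, y, z), confidence) pairs, one per camera.
    Returns the confidence-weighted mean 3D joint position."""
    total = sum(w for _, w in hypotheses)
    return tuple(sum(p[i] * w for p, w in hypotheses) / total
                 for i in range(3))

views = [((0.0, 1.0, 2.0), 0.9),   # camera with a clear view
         ((0.2, 1.0, 2.2), 0.1)]   # heavily occluded camera
joint = fuse_joint(views)
```

Down-weighting occluded cameras is what lets a few unconstrained, non-cooperative viewpoints still yield stable joint locations in real time.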
Markerless Vision-Based Skeleton Tracking in Therapy of Gross Motor Skill Disorders in Children
This chapter presents research toward the implementation of a computer vision system for markerless skeleton tracking in therapy of gross motor skill disorders in children suffering from mild cognitive impairment. The proposed system is based on a low-cost 3D sensor and skeleton-tracking software. The envisioned architecture is scalable in the sense that the system may be used as a stand-alone assistive tool for tracking the effects of therapy, or it may be integrated with an advanced autonomous conversational agent to maintain the spatial attention of the child and to increase her motivation to undergo a long-term therapy.