Learning to Reconstruct People in Clothing from a Single RGB Camera
We present a learning-based model to infer the personalized 3D shape of people from a few frames (1-8) of a monocular video in which the person is moving, in less than 10 seconds with a reconstruction accuracy of 5mm. Our model learns to predict the parameters of a statistical body model and instance displacements that add clothing and hair to the shape. The model achieves fast and accurate predictions based on two key design choices. First, by predicting shape in a canonical T-pose space, the network learns to encode the images of the person into pose-invariant latent codes, where the information is fused. Second, based on the observation that feed-forward predictions are fast but do not always align with the input images, we predict using both bottom-up and top-down streams (one per view), allowing information to flow in both directions. Learning relies only on synthetic 3D data. Once learned, the model can take a variable number of frames as input, and is able to reconstruct shapes even from a single image with an accuracy of 6mm. Results on 3 different datasets demonstrate the efficacy and accuracy of our approach.
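A minimal sketch of one design idea above: per-frame images are encoded into pose-invariant latent codes, and a variable number of codes (1-8 frames) is fused before decoding. The encoder here is a stand-in linear map with made-up dimensions, not the paper's network; only the fuse-by-averaging pattern is the point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder: maps a flattened frame to a pose-invariant latent
# code. Dimensions and the linear map are illustrative assumptions.
D_IMG, D_LATENT = 64, 16
W_enc = rng.normal(size=(D_LATENT, D_IMG)) * 0.1

def encode(frame):
    """Stand-in for the image encoder: frame -> latent code."""
    return W_enc @ frame

def fuse(codes):
    """Fuse a variable number of per-frame codes by averaging; because the
    codes live in a pose-invariant canonical space, any number of frames
    can be combined the same way."""
    return np.mean(codes, axis=0)

frames = [rng.normal(size=D_IMG) for _ in range(5)]  # e.g. 5 input frames
fused = fuse([encode(f) for f in frames])
print(fused.shape)  # (16,)
```

Averaging makes the model agnostic to the frame count, which is consistent with the abstract's claim that it works from a single image up to several frames.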
Unsupervised 3D Pose Estimation with Geometric Self-Supervision
We present an unsupervised learning approach to recover 3D human pose from 2D
skeletal joints extracted from a single image. Our method does not require any
multi-view image data, 3D skeletons, correspondences between 2D and 3D points,
or previously learned 3D priors during training. A lifting network accepts 2D
landmarks as inputs and generates a corresponding 3D skeleton estimate. During
training, the recovered 3D skeleton is reprojected on random camera viewpoints
to generate new "synthetic" 2D poses. By lifting the synthetic 2D poses back to
3D and re-projecting them into the original camera view, we can define a
self-consistency loss in both 3D and 2D. The training can thus be
self-supervised by exploiting the geometric self-consistency of the
lift-reproject-lift process. We show that self-consistency alone is not
sufficient to generate realistic skeletons; however, adding a 2D pose
discriminator enables the lifter to output valid 3D poses. Additionally, to
learn from 2D poses "in the wild", we train an unsupervised 2D domain adapter
network to allow for an expansion of 2D data. This improves results and
demonstrates the usefulness of 2D pose data for unsupervised 3D lifting.
Results on Human3.6M dataset for 3D human pose estimation demonstrate that our
approach improves upon the previous unsupervised methods by 30% and outperforms
many weakly supervised approaches that explicitly use 3D data.
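The lift-reproject-lift cycle described above can be sketched with toy geometry. The real lifter is a neural network; here a fixed "append a constant depth" map and an orthographic camera stand in for it, so only the structure of the 3D and 2D self-consistency losses is illustrated, not the method itself.

```python
import numpy as np

rng = np.random.default_rng(1)
N_JOINTS = 17

def lift(pose2d):
    """Hypothetical lifting network: 2D joints (N,2) -> 3D joints (N,3).
    Here we simply append a fixed depth so the geometry is well defined."""
    depth = np.full((pose2d.shape[0], 1), 5.0)
    return np.concatenate([pose2d, depth], axis=1)

def project(pose3d):
    """Orthographic projection onto the x-y image plane."""
    return pose3d[:, :2]

def rotation_y(angle):
    """Rotation about the vertical axis, used as the random camera."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

pose2d = rng.normal(size=(N_JOINTS, 2))

# lift -> rotate to a random view -> project ("synthetic" 2D pose)
skel3d = lift(pose2d)
R = rotation_y(rng.uniform(0.0, 2.0 * np.pi))
synth2d = project(skel3d @ R.T)

# lift the synthetic pose back to 3D and undo the rotation
relift3d = lift(synth2d) @ R

loss3d = np.mean((relift3d - skel3d) ** 2)           # 3D self-consistency
loss2d = np.mean((project(relift3d) - pose2d) ** 2)  # 2D self-consistency
```

With a trainable lifter, minimizing these two losses (plus the 2D discriminator mentioned above) is what lets training proceed without any 3D supervision.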
Unwind: Interactive Fish Straightening
The ScanAllFish project is a large-scale effort to scan all the world's
33,100 known species of fishes. It has already generated thousands of
volumetric CT scans of fish species which are available on open access
platforms such as the Open Science Framework. To achieve a scanning rate
required for a project of this magnitude, many specimens are grouped together
into a single tube and scanned all at once. The resulting data contain many
fish which are often bent and twisted to fit into the scanner. Our system,
Unwind, is a novel interactive visualization and processing tool which
extracts, unbends, and untwists volumetric images of fish with minimal user
interaction. Our approach enables scientists to interactively unwarp these
volumes to remove the undesired torque and bending using a piecewise-linear
skeleton extracted by averaging isosurfaces of a harmonic function connecting
the head and tail of each fish. The result is a volumetric dataset of an
individual, straight fish in a canonical pose defined by the expert
marine-biologist user. We have developed Unwind in collaboration with a team
of marine
biologists: Our system has been deployed in their labs, and is presently being
used for dataset construction, biomechanical analysis, and the generation of
figures for scientific publication.
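The skeleton-extraction idea above can be illustrated in miniature: solve for a harmonic function with the "head" fixed at 0 and the "tail" at 1, then take the centroid of each isovalue band as a vertex of a piecewise-linear centerline. This toy works on a small 2D grid with Jacobi relaxation; Unwind itself operates on 3D CT volumes, so everything here is a simplified stand-in.

```python
import numpy as np

# Harmonic function on a 2D grid: u = 0 at the "head" (left edge),
# u = 1 at the "tail" (right edge); interior relaxed toward harmonicity.
H, W = 20, 40
u = np.zeros((H, W))
u[:, -1] = 1.0

for _ in range(2000):  # Jacobi iterations of the discrete Laplace equation
    u[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                            u[1:-1, :-2] + u[1:-1, 2:])
    u[:, 0] = 0.0      # re-impose boundary conditions
    u[:, -1] = 1.0

# Piecewise-linear centerline: centroid of each isovalue band
# (the 2D analogue of averaging isosurfaces of the harmonic function).
levels = np.linspace(0.05, 0.95, 10)
ys, xs = np.mgrid[0:H, 0:W]
centerline = []
for lo, hi in zip(levels[:-1], levels[1:]):
    mask = (u >= lo) & (u < hi)
    if mask.any():
        centerline.append((ys[mask].mean(), xs[mask].mean()))
```

Connecting consecutive centroids gives the kind of skeleton along which a bent specimen could then be unwarped into a straight canonical pose.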
Capturing natural-colour 3D models of insects for species discovery
Collections of biological specimens are fundamental to scientific
understanding and characterization of natural diversity. This paper presents a
system for liberating useful information from physical collections by bringing
specimens into the digital domain so they can be more readily shared, analyzed,
annotated and compared. It focuses on insects and is strongly motivated by the
desire to accelerate and augment current practices in insect taxonomy which
predominantly use text, 2D diagrams and images to describe and characterize
species. While these traditional kinds of descriptions are informative and
useful, they cannot cover insect specimens "from all angles" and precious
specimens are still exchanged between researchers and collections for this
reason. Furthermore, insects can be complex in structure and pose many
challenges to computer vision systems. We present a new prototype for a
practical, cost-effective system of off-the-shelf components to acquire
natural-colour 3D models of insects from around 3mm to 30mm in length. Colour
images are captured from different angles and focal depths using a digital
single lens reflex (DSLR) camera rig and two-axis turntable. These 2D images
are processed into 3D reconstructions using software based on a visual hull
algorithm. The resulting models are compact (around 10 megabytes), afford
excellent optical resolution, and can be readily embedded into documents and
web pages, as well as viewed on mobile devices. The system is portable, safe,
relatively affordable, and complements the sort of volumetric data that can be
acquired by computed tomography. This system provides a new way to augment the
description and documentation of insect species holotypes, reducing the need to
handle or ship specimens. It opens up new opportunities to collect data for
research, education, art, entertainment, biodiversity assessment and
biosecurity control.
Comment: 24 pages, 17 figures, PLOS ONE journal
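The visual-hull idea behind the reconstruction step can be shown with toy voxel carving: a voxel survives only if it projects inside the silhouette in every view. The two orthographic "camera" views and circular silhouettes below are assumptions for brevity; the actual system uses many calibrated DSLR images and focal depths.

```python
import numpy as np

# A voxel grid covering [-1, 1]^3.
N = 32
ax = np.linspace(-1.0, 1.0, N)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")

def silhouette_front(x, y):
    """Silhouette seen along the z axis: a disc in the x-y plane."""
    return x**2 + y**2 <= 0.5**2

def silhouette_side(y, z):
    """Silhouette seen along the x axis: a disc in the y-z plane."""
    return y**2 + z**2 <= 0.5**2

# Carving: keep only voxels consistent with every silhouette. Their
# intersection is the visual hull, a conservative superset of the object.
hull = silhouette_front(X, Y) & silhouette_side(Y, Z)
```

With more viewpoints the hull tightens around the specimen, which is why the turntable captures images from many angles before texturing the model.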