Straight to Shapes: Real-time Detection of Encoded Shapes
Current object detection approaches predict bounding boxes, but these provide
little instance-specific information beyond location, scale and aspect ratio.
In this work, we propose to directly regress to objects' shapes in addition to
their bounding boxes and categories. It is crucial to find an appropriate shape
representation that is compact and decodable, and in which objects can be
compared for higher-order concepts such as view similarity, pose variation and
occlusion. To achieve this, we use a denoising convolutional auto-encoder to
establish an embedding space, and place the decoder after a fast end-to-end
network trained to regress directly to the encoded shape vectors. This yields
what to the best of our knowledge is the first real-time shape prediction
network, running at ~35 FPS on a high-end desktop. With higher-order shape
reasoning well-integrated into the network pipeline, the network shows the
useful practical quality of generalising to unseen categories similar to the
ones in the training set, something that most existing approaches fail to
handle.
Comment: 16 pages including appendix; Published at CVPR 201
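The pipeline described above hinges on a denoising auto-encoder defining a compact, decodable embedding space to which the detector can regress. A minimal NumPy sketch of that first ingredient — toy 64-D binary "shape masks" driven by a few latent factors, an 8-D embedding, and illustrative dimensions throughout; this is not the paper's convolutional architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for shape masks: 64-D binary vectors with 4 latent factors,
# so an 8-D embedding has real structure to capture.
C = rng.normal(size=(256, 4))
M = rng.normal(size=(4, 64))
X = (C @ M > 0).astype(np.float64)

D_in, D_h = 64, 8
W_enc = rng.normal(0, 0.1, (D_in, D_h))
W_dec = rng.normal(0, 0.1, (D_h, D_in))

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

lr, losses = 0.5, []
for step in range(500):
    # Denoising objective: corrupt the input, reconstruct the clean target.
    noisy = X + rng.normal(0, 0.3, X.shape)
    Z = np.tanh(noisy @ W_enc)        # encoder -> compact embedding
    Y = sigmoid(Z @ W_dec)            # decoder -> reconstruction
    losses.append(np.mean((Y - X) ** 2))
    # Full-batch gradient descent on the reconstruction MSE.
    dA2 = 2 * (Y - X) * Y * (1 - Y) / len(X)
    dA1 = (dA2 @ W_dec.T) * (1 - Z ** 2)
    W_dec -= lr * (Z.T @ dA2)
    W_enc -= lr * (noisy.T @ dA1)

print(f"reconstruction MSE: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

In the paper's setting the decoder half of such an auto-encoder is then bolted onto a fast detection network that regresses directly to vectors in the learned embedding space.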
Visualization of AE's Training on Credit Card Transactions with Persistent Homology
Auto-encoders are among the most popular neural network architectures for
dimension reduction. They are composed of two parts: the encoder, which maps the
model distribution to a latent manifold, and the decoder, which maps the latent
manifold to a reconstructed distribution. However, auto-encoders are known to
produce chaotically scattered data distributions in the latent manifold,
resulting in an incomplete reconstructed distribution. Current distance
measures fail to detect this problem because they are not able to acknowledge
the shape of the data manifolds, i.e. their topological features, and the scale
at which the manifolds should be analyzed. We propose Persistent Homology for
Wasserstein Auto-Encoders, called PHom-WAE, a new methodology to assess and
measure the data distribution of a generative model. PHom-WAE minimizes the
Wasserstein distance between the true distribution and the reconstructed
distribution and uses persistent homology, the study of the topological
features of a space at different spatial resolutions, to compare the nature of
the latent manifold and the reconstructed distribution. Our experiments
underline the potential of persistent homology for Wasserstein Auto-Encoders in
comparison to Variational Auto-Encoders, another type of generative model. The
experiments are conducted on a real-world data set particularly challenging for
traditional distance measures and auto-encoders. PHom-WAE is the first
methodology to propose a topological distance measure, the bottleneck distance,
for Wasserstein Auto-Encoders, used to compare high-quality decoded samples in
the context of credit card transactions.
Comment: arXiv admin note: substantial text overlap with arXiv:1905.0989
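Persistent homology in dimension 0 has a compact illustration: sweep a distance threshold upward and record the scale at which connected components merge. The sketch below — a toy Vietoris-Rips H0 computation via union-find, purely an illustration of the underlying idea and not the PHom-WAE pipeline itself — produces the death times that make up a 0-dimensional persistence diagram:

```python
import numpy as np
from itertools import combinations

def h0_persistence(points):
    """Death times of H0 features under a Vietoris-Rips filtration.

    Every point is born at scale 0; when two components merge at
    distance d, one of them dies at d. The component that survives
    forever (infinite persistence) is omitted.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Process all pairwise edges in order of increasing distance.
    edges = sorted(
        (float(np.linalg.norm(points[i] - points[j])), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)  # a component dies at this scale
    return sorted(deaths)

# Two well-separated pairs of points on a line: the two tight pairs
# merge early (scale 1.0); the clusters only join at scale 9.0.
pts = np.array([[0.0], [1.0], [10.0], [11.0]])
print(h0_persistence(pts))  # -> [1.0, 1.0, 9.0]
```

Comparing two such diagrams (e.g. latent samples vs. reconstructed samples) with the bottleneck distance is what lets a topological measure detect scattering that ordinary pointwise distances miss.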
Learning to Dress 3D People in Generative Clothing
Three-dimensional human body models are widely used in the analysis of human
pose and motion. Existing models, however, are learned from minimally-clothed
3D scans and thus do not generalize to the complexity of dressed people in
common images and videos. Additionally, current models lack the expressive
power needed to represent the complex non-linear geometry of pose-dependent
clothing shapes. To address this, we learn a generative 3D mesh model of
clothed people from 3D scans with varying pose and clothing. Specifically, we
train a conditional Mesh-VAE-GAN to learn the clothing deformation from the
SMPL body model, making clothing an additional term in SMPL. Our model is
conditioned on both pose and clothing type, giving the ability to draw samples
of clothing to dress different body shapes in a variety of styles and poses. To
preserve wrinkle detail, our Mesh-VAE-GAN extends patchwise discriminators to
3D meshes. Our model, named CAPE, represents global shape and fine local
structure, effectively extending the SMPL body model to clothing. To our
knowledge, this is the first generative model that directly dresses 3D human
body meshes and generalizes to different poses. The model, code and data are
available for research purposes at https://cape.is.tue.mpg.de.
Comment: CVPR-2020 camera ready. Code and data are available at https://cape.is.tue.mpg.d
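The conditioning mechanism described above — drawing clothing samples for a given pose and clothing type, applied as an additive term on the body mesh — can be caricatured as a decoder fed a latent sample concatenated with the condition codes. All names, weights, and dimensions below are hypothetical; this sketches conditional sampling of per-vertex offsets, not CAPE's trained Mesh-VAE-GAN:

```python
import numpy as np

rng = np.random.default_rng(42)

N_VERTS = 100        # toy stand-in for the body mesh resolution
Z_DIM, POSE_DIM, N_TYPES = 16, 9, 4

# Hypothetical decoder weights; in CAPE this role is played by a
# trained mesh decoder with patchwise discriminators.
W = rng.normal(0, 0.1, (Z_DIM + POSE_DIM + N_TYPES, N_VERTS * 3))

def sample_clothing_offsets(pose, clothing_type):
    """Draw a per-vertex displacement field conditioned on pose and type.

    The clothing term is an additive offset on the body mesh, mirroring
    the idea of making clothing "an additional term" on the body model.
    """
    z = rng.normal(size=Z_DIM)                     # latent sample
    onehot = np.eye(N_TYPES)[clothing_type]        # clothing-type code
    cond = np.concatenate([z, pose, onehot])       # conditioning vector
    return np.tanh(cond @ W).reshape(N_VERTS, 3)   # bounded vertex offsets

body = rng.normal(size=(N_VERTS, 3))               # toy body mesh
dressed = body + sample_clothing_offsets(np.zeros(POSE_DIM), clothing_type=2)
print(dressed.shape)  # -> (100, 3)
```

Resampling `z` with the same pose and clothing type yields different drapings of the same garment class — the "draw samples of clothing" behavior the abstract describes.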