1,428 research outputs found
Variational Autoencoders for Deforming 3D Mesh Models
3D geometric contents are becoming increasingly popular. In this paper, we
study the problem of analyzing deforming 3D meshes using deep neural networks.
Deforming 3D meshes are flexible to represent 3D animation sequences as well as
collections of objects of the same category, allowing diverse shapes with
large-scale non-linear deformations. We propose a novel framework which we call
mesh variational autoencoders (mesh VAE), to explore the probabilistic latent
space of 3D surfaces. The framework is easy to train, and requires very few
training examples. We also propose an extended model which allows flexibly
adjusting the significance of different latent variables by altering the prior
distribution. Extensive experiments demonstrate that our general framework is
able to learn a reasonable representation for a collection of deformable
shapes, and produce competitive results for a variety of applications,
including shape generation, shape interpolation, shape space embedding and
shape exploration, outperforming state-of-the-art methods.Comment: CVPR 201
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning
Learning Representations for Face Recognition: A Review from Holistic to Deep Learning
For decades, researchers have investigated how to recognize facial images. This study reviews the development of different face recognition (FR) methods, namely, holistic learning, handcrafted local feature learning, shallow learning, and deep learning (DL). With the development of methods, the accuracy of recognizing faces in the labeled faces in the wild (LFW) database has been increased. The accuracy of holistic learning is 60%, that of handcrafted local feature learning increases to 70%, and that of shallow learning is 86%. Finally, DL achieves human-level performance (97% accuracy). This enhanced accuracy is caused by large datasets and graphics processing units (GPUs) with massively parallel processing capabilities. Furthermore, FR challenges and current research studies are discussed to understand future research directions. The results of this study show that presently the database of labeled faces in the wild has reached 99.85% accuracy
Interpretable Transformations with Encoder-Decoder Networks
Deep feature spaces have the capacity to encode complex transformations of
their input data. However, understanding the relative feature-space
relationship between two transformed encoded images is difficult. For instance,
what is the relative feature space relationship between two rotated images?
What is decoded when we interpolate in feature space? Ideally, we want to
disentangle confounding factors, such as pose, appearance, and illumination,
from object identity. Disentangling these is difficult because they interact in
very nonlinear ways. We propose a simple method to construct a deep feature
space, with explicitly disentangled representations of several known
transformations. A person or algorithm can then manipulate the disentangled
representation, for example, to re-render an image with explicit control over
parameterized degrees of freedom. The feature space is constructed using a
transforming encoder-decoder network with a custom feature transform layer,
acting on the hidden representations. We demonstrate the advantages of explicit
disentangling on a variety of datasets and transformations, and as an aid for
traditional tasks, such as classification.Comment: Accepted at ICCV 201
From neural PCA to deep unsupervised learning
A network supporting deep unsupervised learning is presented. The network is
an autoencoder with lateral shortcut connections from the encoder to decoder at
each level of the hierarchy. The lateral shortcut connections allow the higher
levels of the hierarchy to focus on abstract invariant features. While standard
autoencoders are analogous to latent variable models with a single layer of
stochastic variables, the proposed network is analogous to hierarchical latent
variables models. Learning combines denoising autoencoder and denoising sources
separation frameworks. Each layer of the network contributes to the cost
function a term which measures the distance of the representations produced by
the encoder and the decoder. Since training signals originate from all levels
of the network, all layers can learn efficiently even in deep networks. The
speedup offered by cost terms from higher levels of the hierarchy and the
ability to learn invariant features are demonstrated in experiments.Comment: A revised version of an article that has been accepted for
publication in Advances in Independent Component Analysis and Learning
Machines (2015), edited by Ella Bingham, Samuel Kaski, Jorma Laaksonen and
Jouko Lampine
Optimal Transport for Domain Adaptation
Domain adaptation from one data space (or domain) to another is one of the
most challenging tasks of modern data analytics. If the adaptation is done
correctly, models built on a specific data space become more robust when
confronted to data depicting the same semantic concepts (the classes), but
observed by another observation system with its own specificities. Among the
many strategies proposed to adapt a domain to another, finding a common
representation has shown excellent properties: by finding a common
representation for both domains, a single classifier can be effective in both
and use labelled samples from the source domain to predict the unlabelled
samples of the target domain. In this paper, we propose a regularized
unsupervised optimal transportation model to perform the alignment of the
representations in the source and target domains. We learn a transportation
plan matching both PDFs, which constrains labelled samples in the source domain
to remain close during transport. This way, we exploit at the same time the few
labeled information in the source and the unlabelled distributions observed in
both domains. Experiments in toy and challenging real visual adaptation
examples show the interest of the method, that consistently outperforms state
of the art approaches
- …