GRASS: Generative Recursive Autoencoders for Shape Structures
We introduce a novel neural network architecture for encoding and synthesis
of 3D shapes, particularly their structures. Our key insight is that 3D shapes
are effectively characterized by their hierarchical organization of parts,
which reflects fundamental intra-shape relationships such as adjacency and
symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a
flat, unlabeled, arbitrary part layout to a compact code. The code effectively
captures hierarchical structures of man-made 3D objects of varying structural
complexities despite being fixed-dimensional: an associated decoder maps a code
back to a full hierarchy. The learned bidirectional mapping is further tuned
using an adversarial setup to yield a generative model of plausible structures,
from which novel structures can be sampled. Finally, our structure synthesis
framework is augmented by a second trained module that produces fine-grained
part geometry, conditioned on global and local structural context, leading to a
full generative pipeline for 3D shapes. We demonstrate that without
supervision, our network learns meaningful structural hierarchies adhering to
perceptual grouping principles, produces compact codes which enable
applications such as shape classification and partial matching, and supports
shape synthesis and interpolation with significant variations in topology and
geometry.
Comment: Corresponding author: Kai Xu
Deep Learning on Lie Groups for Skeleton-based Action Recognition
In recent years, skeleton-based action recognition has become a popular 3D
classification problem. State-of-the-art methods typically first represent each
motion sequence as a high-dimensional trajectory on a Lie group with an
additional dynamic time warping, and then shallowly learn favorable Lie group
features. In this paper we incorporate the Lie group structure into a deep
network architecture to learn more appropriate Lie group features for 3D action
recognition. Within the network structure, we design rotation mapping layers to
transform the input Lie group features into desirable ones, which are aligned
better in the temporal domain. To reduce the high feature dimensionality, the
architecture is equipped with rotation pooling layers for the elements on the
Lie group. Furthermore, we propose a logarithm mapping layer to map the
resulting manifold data into a tangent space that facilitates the application
of regular output layers for the final classification. Evaluations of the
proposed network for standard 3D human action recognition datasets clearly
demonstrate its superiority over existing shallow Lie group feature learning
methods as well as most conventional deep learning methods.
Comment: Accepted to CVPR 201
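The abstract above mentions a logarithm mapping layer that sends manifold data into a tangent space so that regular output layers can be applied. As a rough illustration of that idea only, the following sketch computes the standard logarithm map on SO(3), turning a 3x3 rotation matrix into an axis-angle vector. It is not the paper's layer: the function names `so3_log` and `rot_z` are hypothetical, the input is assumed to be a valid rotation matrix given as nested lists, and the formula is numerically unreliable for rotation angles close to pi.

```python
import math


def so3_log(R, eps=1e-8):
    """Map a 3x3 rotation matrix (nested lists) to an axis-angle vector in so(3).

    Illustrative sketch only; unstable when the rotation angle is near pi.
    """
    trace = R[0][0] + R[1][1] + R[2][2]
    # Clamp so rounding error cannot push the cosine outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    # Components of the skew-symmetric part R - R^T, read off as a 3-vector.
    v = (R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1])
    if theta < eps:
        scale = 0.5  # limit of theta / (2 sin theta) as theta -> 0
    else:
        scale = theta / (2.0 * math.sin(theta))
    return [scale * x for x in v]


def rot_z(angle):
    """Rotation by `angle` radians about the z-axis (helper for the example)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


w = so3_log(rot_z(0.5))  # approximately a rotation of 0.5 rad about z
```

The resulting vector lives in an ordinary Euclidean space, which is why a step of this kind allows standard fully connected and softmax layers to follow it.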
Practical recommendations for gradient-based training of deep architectures
Learning algorithms related to artificial neural networks and in particular
for Deep Learning may seem to involve many bells and whistles, called
hyper-parameters. This chapter is meant as a practical guide with
recommendations for some of the most commonly used hyper-parameters, in
particular in the context of learning algorithms based on back-propagated
gradient and gradient-based optimization. It also discusses how to deal with
the fact that more interesting results can be obtained when allowing one to
adjust many hyper-parameters. Overall, it describes elements of the practice
used to successfully and efficiently train and debug large-scale and often deep
multi-layer neural networks. It closes with open questions about the training
difficulties observed with deeper architectures.