3,337 research outputs found
Semantic Graph Convolutional Networks for 3D Human Pose Regression
In this paper, we study the problem of learning Graph Convolutional Networks
(GCNs) for regression. Current architectures of GCNs are limited to the small
receptive field of convolution filters and shared transformation matrix for
each node. To address these limitations, we propose Semantic Graph
Convolutional Networks (SemGCN), a novel neural network architecture that
operates on regression tasks with graph-structured data. SemGCN learns to
capture semantic information such as local and global node relationships, which
is not explicitly represented in the graph. These semantic relationships can be
learned through end-to-end training from the ground truth without additional
supervision or hand-crafted rules. We further investigate applying SemGCN to 3D
human pose regression. Our formulation is intuitive and sufficient since both
2D and 3D human poses can be represented as a structured graph encoding the
relationships between joints in the skeleton of a human body. We carry out
comprehensive studies to validate our method. The results prove that SemGCN
outperforms state of the art while using 90% fewer parameters.Comment: In CVPR 2019 (13 pages including supplementary material). The code
can be found at https://github.com/garyzhao/SemGC
Wasserstein Introspective Neural Networks
We present Wasserstein introspective neural networks (WINN) that are both a
generator and a discriminator within a single model. WINN provides a
significant improvement over the recent introspective neural networks (INN)
method by enhancing INN's generative modeling capability. WINN has three
interesting properties: (1) A mathematical connection between the formulation
of the INN algorithm and that of Wasserstein generative adversarial networks
(WGAN) is made. (2) The explicit adoption of the Wasserstein distance into INN
results in a large enhancement to INN, achieving compelling results even with a
single classifier --- e.g., providing nearly a 20 times reduction in model size
over INN for unsupervised generative modeling. (3) When applied to supervised
classification, WINN also gives rise to improved robustness against adversarial
examples in terms of the error reduction. In the experiments, we report
encouraging results on unsupervised learning problems including texture, face,
and object modeling, as well as a supervised classification task against
adversarial attacks.Comment: Accepted to CVPR 2018 (Oral
Articulated motion and deformable objects
This guest editorial introduces the twenty two papers accepted for this Special Issue on Articulated Motion and Deformable Objects (AMDO). They are grouped into four main categories within the field of AMDO: human motion analysis (action/gesture), human pose estimation, deformable shape segmentation, and face analysis. For each of the four topics, a survey of the recent developments in the field is presented. The accepted papers are briefly introduced in the context of this survey. They contribute novel methods, algorithms with improved performance as measured on benchmarking datasets, as well as two new datasets for hand action detection and human posture analysis. The special issue should be of high relevance to the reader interested in AMDO recognition and promote future research directions in the field
DeepPermNet: Visual Permutation Learning
We present a principled approach to uncover the structure of visual data by
solving a novel deep learning task coined visual permutation learning. The goal
of this task is to find the permutation that recovers the structure of data
from shuffled versions of it. In the case of natural images, this task boils
down to recovering the original image from patches shuffled by an unknown
permutation matrix. Unfortunately, permutation matrices are discrete, thereby
posing difficulties for gradient-based methods. To this end, we resort to a
continuous approximation of these matrices using doubly-stochastic matrices
which we generate from standard CNN predictions using Sinkhorn iterations.
Unrolling these iterations in a Sinkhorn network layer, we propose DeepPermNet,
an end-to-end CNN model for this task. The utility of DeepPermNet is
demonstrated on two challenging computer vision problems, namely, (i) relative
attributes learning and (ii) self-supervised representation learning. Our
results show state-of-the-art performance on the Public Figures and OSR
benchmarks for (i) and on the classification and segmentation tasks on the
PASCAL VOC dataset for (ii).Comment: Accepted in IEEE International Conference on Computer Vision and
Pattern Recognition CVPR 201
- …