Network Sketching: Exploiting Binary Structure in Deep CNNs
Convolutional neural networks (CNNs) with deep architectures have
substantially advanced the state-of-the-art in computer vision tasks. However,
deep networks are typically resource-intensive and thus difficult to deploy
on mobile devices. Recently, CNNs with binary weights have shown
compelling efficiency to the community, whereas the accuracy of such models is
usually unsatisfactory in practice. In this paper, we introduce network
sketching as a novel technique for pursuing binary-weight CNNs, targeting
more faithful inference and a better trade-off for practical applications. Our
basic idea is to exploit binary structure directly in pre-trained filter banks
and produce binary-weight models via tensor expansion. The whole process can be
treated as a coarse-to-fine model approximation, akin to the pencil drawing
steps of outlining and shading. To further speed up the generated models, namely
the sketches, we also propose an associative implementation of binary tensor
convolutions. Experimental results demonstrate that a proper sketch of AlexNet
(or ResNet) outperforms the existing binary-weight models by large margins on
the ImageNet large-scale classification task, while the memory committed to
network parameters is only slightly larger. Comment: To appear in CVPR201
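The coarse-to-fine idea in this abstract can be illustrated with a greedy residual expansion: each step takes the sign pattern of what is still unexplained as a binary tensor and fits one scalar scale to it. This is a minimal sketch under assumptions, not the paper's exact algorithm; the function names, the random stand-in for a pre-trained filter bank, and the choice to expand the whole bank at once are all hypothetical.

```python
import numpy as np

def sketch_binary(weights, num_terms):
    """Greedy expansion: weights ~= sum_j scales[j] * bases[j], bases in {-1, +1}."""
    residual = weights.astype(np.float64).copy()
    bases, scales = [], []
    for _ in range(num_terms):
        basis = np.where(residual >= 0, 1.0, -1.0)  # binary tensor from the sign pattern
        scale = np.abs(residual).mean()             # least-squares scale for this basis
        bases.append(basis)
        scales.append(scale)
        residual = residual - scale * basis         # keep refining what is left
    return np.array(scales), np.stack(bases)

def reconstruct(scales, bases):
    """Sum the scaled binary tensors back into a real-valued approximation."""
    return np.tensordot(scales, bases, axes=1)

rng = np.random.default_rng(0)
filters = rng.normal(size=(4, 3, 3, 3))  # stand-in for a pre-trained filter bank (assumption)

errors = []
for num_terms in (1, 2, 3):
    scales, bases = sketch_binary(filters, num_terms)
    errors.append(np.linalg.norm(filters - reconstruct(scales, bases)))
print(errors)  # the approximation error shrinks as binary terms are added
```

Each added term costs one more bit per weight plus one scalar, which is the accuracy-versus-memory trade-off the abstract refers to.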
DeepSketch2Face: A Deep Learning Based Sketching System for 3D Face and Caricature Modeling
Face modeling has received much attention in the field of visual computing.
There exist many scenarios, including cartoon characters, avatars for social
media, 3D face caricatures as well as face-related art and design, where
low-cost interactive face modeling is a popular approach especially among
amateur users. In this paper, we propose a deep learning based sketching system
for 3D face and caricature modeling. This system has a labor-efficient
sketching interface that allows the user to draw freehand, imprecise yet
expressive 2D lines representing the contours of facial features. A novel
CNN-based deep regression network is designed for inferring 3D face models from 2D
sketches. Our network fuses both CNN and shape-based features of the input
sketch, and has two independent branches of fully connected layers generating
independent subsets of coefficients for a bilinear face representation. Our
system also supports gesture based interactions for users to further manipulate
initial face models. Both user studies and numerical results indicate that our
sketching system can help users create face models quickly and effectively. A
significantly expanded face database with diverse identities, expressions and
levels of exaggeration is constructed to promote further research and
evaluation of face modeling techniques. Comment: 12 pages, 16 figures, to appear in SIGGRAPH 201
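To make the bilinear face representation concrete, the sketch below contracts a core tensor with two coefficient vectors, one per network branch. It assumes the two coefficient subsets are identity and expression weights, which the abstract does not state; the tensor sizes and all names are hypothetical, and random values stand in for a learned face model.

```python
import numpy as np

def bilinear_face(core, identity_coeffs, expression_coeffs):
    """Contract a (3 * V, I, E) core tensor with identity and expression weights."""
    flat = np.einsum("vie,i,e->v", core, identity_coeffs, expression_coeffs)
    return flat.reshape(-1, 3)  # one (x, y, z) row per vertex

rng = np.random.default_rng(0)
num_vertices, num_id, num_expr = 5, 4, 3  # toy sizes (assumption)
core = rng.normal(size=(3 * num_vertices, num_id, num_expr))

w_id = rng.normal(size=num_id)      # would be predicted by one branch
w_expr = rng.normal(size=num_expr)  # would be predicted by the other branch
mesh = bilinear_face(core, w_id, w_expr)
print(mesh.shape)
```

Because the model is linear in each coefficient vector separately, the two branches can regress their subsets independently, and scaling one vector scales the resulting mesh by the same factor.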
Training Input-Output Recurrent Neural Networks through Spectral Methods
We consider the problem of training input-output recurrent neural networks
(RNN) for sequence labeling tasks. We propose a novel spectral approach for
learning the network parameters. It is based on decomposition of the
cross-moment tensor between the output and a non-linear transformation of the
input, based on score functions. We guarantee consistent learning with
polynomial sample and computational complexity under transparent conditions
such as non-degeneracy of model parameters, polynomial activations for the
neurons, and a Markovian evolution of the input sequence. We also extend our
results to bidirectional RNNs, which use both previous and future information to
output the label at each time point and are employed in many NLP tasks such as
POS tagging.
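The role of score functions in such cross-moments can be seen in a much simpler setting than the paper's. The sketch below assumes i.i.d. standard Gaussian inputs (not a Markovian sequence), a single neuron with a cubic activation, and only the first-order score function, for which Stein's identity gives E[y S(x)] = E[∇y]. It is an illustration of that identity, not the proposed training method, and all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, num_samples = 3, 200_000
w = np.array([0.6, -0.3, 0.5])  # hypothetical neuron weights

x = rng.standard_normal((num_samples, dim))  # i.i.d. N(0, I) inputs (assumption)
y = (x @ w) ** 3                             # cubic, i.e. polynomial, activation

score = x  # first-order score function of N(0, I): -grad log p(x) = x
cross_moment = (y[:, None] * score).mean(axis=0)  # empirical E[y * S(x)]

# Stein's identity: E[y * S(x)] = E[grad_x y] = 3 * ||w||^2 * w for this model.
expected = 3.0 * (w @ w) * w
direction = cross_moment / np.linalg.norm(cross_moment)
print(cross_moment, expected)
```

The empirical cross-moment points along `w`, so the weight direction is recoverable from input-output statistics alone; higher-order score functions yield tensors whose decomposition plays the analogous role when there are several neurons.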
Stochastically Rank-Regularized Tensor Regression Networks
Over-parametrization of deep neural networks has recently been shown to be key to their successful training. However, it also renders them prone to overfitting and makes them expensive to store and train. Tensor regression networks significantly reduce the number of effective parameters in deep neural networks while retaining accuracy and the ease of training. They replace the flattening and fully-connected layers with a tensor regression layer, where the regression weights are expressed through the factors of a low-rank tensor decomposition. In this paper, to further improve tensor regression networks, we propose a novel stochastic rank-regularization. It consists of a novel randomized tensor sketching method to approximate the weights of tensor regression layers. We theoretically and empirically establish the link between our proposed stochastic rank-regularization and the dropout on low-rank tensor regression. Extensive experimental results with both synthetic data and real-world datasets (i.e., CIFAR-100 and the UK Biobank brain MRI dataset) support that the proposed approach i) improves performance in both classification and regression tasks, ii) decreases overfitting, iii) leads to more stable training and iv) improves robustness to adversarial attacks and random noise.
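A tensor regression layer with low-rank weights, and a dropout-like perturbation of its rank, can be sketched as follows. The example assumes a Tucker decomposition and Bernoulli masks over rank components; the abstract does not specify the decomposition or the sampling scheme, so the masks, the keep probability, and all names are hypothetical, and no rescaling of kept components is applied.

```python
import numpy as np

def tucker_weights(core, factors, masks=None):
    """Rebuild regression weights from a Tucker core and three factor matrices.

    masks, if given, holds one 0/1 vector per mode that switches rank
    components off, mimicking a stochastic reduction of the rank.
    """
    if masks is not None:
        core = (core * masks[0][:, None, None]
                     * masks[1][None, :, None]
                     * masks[2][None, None, :])
    return np.einsum("abc,ia,jb,kc->ijk", core, *factors)

def regress(activations, weights, bias=0.0):
    """Inner product of each activation tensor with the weight tensor."""
    return np.einsum("nijk,ijk->n", activations, weights) + bias

rng = np.random.default_rng(0)
dims, ranks = (6, 5, 4), (3, 3, 2)  # toy activation shape and Tucker ranks
core = rng.normal(size=ranks)
factors = [rng.normal(size=(d, r)) for d, r in zip(dims, ranks)]

keep_prob = 0.8  # hypothetical keep probability
masks = [(rng.random(r) < keep_prob).astype(float) for r in ranks]

batch = rng.normal(size=(2, *dims))  # two activation tensors, no flattening
full_out = regress(batch, tucker_weights(core, factors))
dropped_out = regress(batch, tucker_weights(core, factors, masks))
print(full_out, dropped_out)
```

In this toy setting the factorized weights use 18 + 18 + 15 + 8 = 59 parameters instead of the 120 of a full 6 × 5 × 4 weight tensor, and masking rank components is equivalent to regressing with a randomly chosen lower-rank sub-decomposition.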
Robust Deep Networks with Randomized Tensor Regression Layers
In this paper, we propose a novel randomized tensor decomposition for tensor regression. It allows the weights of tensor regression layers to be stochastically approximated by randomly sampling in the low-rank subspace. We theoretically and empirically establish the link between our proposed stochastic rank-regularization and the dropout on low-rank tensor regression. This acts as an additional stochastic regularization on the regression weight, which, combined with the deterministic regularization imposed by the low-rank constraint, improves both the performance and robustness of neural networks augmented with it. In particular, it makes the model more robust to adversarial attacks and random noise, without requiring any adversarial training. We perform a thorough study of our method on synthetic data, object classification on the CIFAR100 and ImageNet datasets, and large-scale brain-age prediction on the UK Biobank brain MRI dataset. We demonstrate superior performance in all cases, as well as significant improvement in robustness to adversarial attacks and random noise.