Predicting Parameters in Deep Learning

Authors: Nando de Freitas, Misha Denil, Laurent Dinh, Marc'Aurelio Ranzato, Babak Shakibi
Publication date: 01/01/2013

We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature, it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all. We train several different architectures by learning only a small number of weights and predicting the rest. In the best case we are able to predict more than 95% of the weights of a network without any drop in accuracy.

arXiv.org e-Print Archive

CiteSeerX

Oxford University Research Archive
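
The claim in the abstract above, that a feature's remaining weights can be predicted from a few known values, can be illustrated with a small sketch. It assumes the weights of one feature vary smoothly with position and uses kernel ridge regression with a squared-exponential kernel. The function name `predict_weights`, the kernel choice, and the values of `length_scale` and `ridge` are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

def predict_weights(observed_idx, observed_vals, n, length_scale=3.0, ridge=1e-6):
    """Predict all n weights of one feature from a few observed entries.

    Assumption: weights vary smoothly with their index, so a
    squared-exponential kernel over index positions is a reasonable prior.
    """
    all_idx = np.arange(n, dtype=float)
    obs = np.asarray(observed_idx, dtype=float)
    vals = np.asarray(observed_vals, dtype=float)

    def kernel(a, b):
        # Squared-exponential similarity between index positions.
        d = a[:, None] - b[None, :]
        return np.exp(-0.5 * (d / length_scale) ** 2)

    # Kernel ridge regression: solve on the observed entries,
    # then evaluate the fitted function at every index.
    k_obs = kernel(obs, obs) + ridge * np.eye(len(obs))
    alpha = np.linalg.solve(k_obs, vals)
    return kernel(all_idx, obs) @ alpha

# Toy "feature": a smooth weight vector of length 64 (hypothetical data).
n = 64
true_w = np.sin(np.linspace(0.0, 2.0 * np.pi, n))

# Observe every fourth weight plus the last one (17 of 64 values).
observed_idx = np.unique(np.append(np.arange(0, n, 4), n - 1))
pred = predict_weights(observed_idx, true_w[observed_idx], n)

# Relative reconstruction error over the whole weight vector.
rel_err = np.linalg.norm(pred - true_w) / np.linalg.norm(true_w)
print(f"observed {len(observed_idx)} of {n} weights, relative error {rel_err:.4f}")
```

The sketch only works because the toy weight vector is smooth; for weights without such structure, a handful of observed values would not determine the rest.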

Building high-level features using large scale unsupervised learning

Authors: Kai Chen, Greg S. Corrado, Jeff Dean, Matthieu Devin, Quoc V. Le, Rajat Monga, Andrew Y. Ng, Marc'Aurelio Ranzato
Publication date: 01/01/2012

We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a 70% relative improvement over the previous state-of-the-art.

arXiv.org e-Print Archive

CiteSeerX

Score Function Features for Discriminative Learning

Authors: Anima Anandkumar, Majid Janzamin, Hanie Sedghi
Publication date: 19/12/2014

Feature learning forms the cornerstone for tackling challenging learning problems in domains such as speech, computer vision and natural language processing. In this paper, we consider a novel class of matrix and tensor-valued features, which can be pre-trained using unlabeled samples. We present efficient algorithms for extracting discriminative information, given these pre-trained features and labeled samples for any related task. Our class of features is based on higher-order score functions, which capture local variations in the probability density function of the input. We establish a theoretical framework to characterize the nature of discriminative information that can be extracted from score-function features, when used in conjunction with labeled samples. We employ efficient spectral decomposition algorithms (on matrices and tensors) for extracting discriminative components. The advantage of employing tensor-valued features is that we can extract richer discriminative information in the form of overcomplete representations. Thus, we present a novel framework for employing generative models of the input for discriminative learning.

arXiv.org e-Print Archive

eScholarship - University of California

Caltech Authors
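
The abstract of "Score Function Features for Discriminative Learning" refers to higher-order score functions without defining them. As a reading aid, one common definition is sketched below; the notation ($p$ for the input density, $\mathcal{S}_m$ for the order-$m$ score function) is an assumption here and is not quoted from the abstract.

```latex
% First-order score function: a vector field over the input space.
\mathcal{S}_1(x) = -\nabla_x \log p(x)

% Order-m score function: an order-m tensor, built from the
% m-th order derivative of the density.
\mathcal{S}_m(x) = (-1)^m \, \frac{\nabla_x^{(m)} p(x)}{p(x)}
```

For $m = 1$ the second expression reduces to the first, since $\nabla_x \log p(x) = \nabla_x p(x) / p(x)$. For $m = 2$ the feature is matrix-valued and for $m \ge 3$ it is tensor-valued, which matches the "matrix and tensor-valued features" the abstract describes.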