3,517 research outputs found
Deep learning systems as complex networks
Thanks to the availability of large scale digital datasets and massive
amounts of computational power, deep learning algorithms can learn
representations of data by exploiting multiple levels of abstraction. These
machine learning methods have greatly improved the state-of-the-art in many
challenging cognitive tasks, such as visual object recognition, speech
processing, natural language understanding and automatic translation. In
particular, one class of deep learning models, known as deep belief networks,
can discover intricate statistical structure in large data sets in a completely
unsupervised fashion, by learning a generative model of the data using
Hebbian-like learning mechanisms. Although these self-organizing systems can be
conveniently formalized within the framework of statistical mechanics, their
internal functioning remains opaque, because their emergent dynamics cannot be
solved analytically. In this article we propose to study deep belief networks
using techniques commonly employed in the study of complex networks, in order
to gain some insights into the structural and functional properties of the
computational graph resulting from the learning process.Comment: 20 pages, 9 figure
BinaryConnect: Training Deep Neural Networks with binary weights during propagations
Deep Neural Networks (DNN) have achieved state-of-the-art results in a wide
range of tasks, with the best results obtained with large training sets and
large models. In the past, GPUs enabled these breakthroughs because of their
greater computational speed. In the future, faster computation at both training
and test time is likely to be crucial for further progress and for consumer
applications on low-power devices. As a result, there is much interest in
research and development of dedicated hardware for Deep Learning (DL). Binary
weights, i.e., weights which are constrained to only two possible values (e.g.
-1 or 1), would bring great benefits to specialized DL hardware by replacing
many multiply-accumulate operations by simple accumulations, as multipliers are
the most space and power-hungry components of the digital implementation of
neural networks. We introduce BinaryConnect, a method which consists in
training a DNN with binary weights during the forward and backward
propagations, while retaining precision of the stored weights in which
gradients are accumulated. Like other dropout schemes, we show that
BinaryConnect acts as regularizer and we obtain near state-of-the-art results
with BinaryConnect on the permutation-invariant MNIST, CIFAR-10 and SVHN.Comment: Accepted at NIPS 2015, 9 pages, 3 figure
Building high-level features using large scale unsupervised learning
We consider the problem of building high-level, class-specific feature
detectors from only unlabeled data. For example, is it possible to learn a face
detector using only unlabeled images? To answer this, we train a 9-layered
locally connected sparse autoencoder with pooling and local contrast
normalization on a large dataset of images (the model has 1 billion
connections, the dataset has 10 million 200x200 pixel images downloaded from
the Internet). We train this network using model parallelism and asynchronous
SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to
what appears to be a widely-held intuition, our experimental results reveal
that it is possible to train a face detector without having to label images as
containing a face or not. Control experiments show that this feature detector
is robust not only to translation but also to scaling and out-of-plane
rotation. We also find that the same network is sensitive to other high-level
concepts such as cat faces and human bodies. Starting with these learned
features, we trained our network to obtain 15.8% accuracy in recognizing 20,000
object categories from ImageNet, a leap of 70% relative improvement over the
previous state-of-the-art
- …