Error correcting code using tree-like multilayer perceptron
An error correcting code using a tree-like multilayer perceptron is proposed.
An original message $\boldsymbol{s}^0$ is encoded into a codeword $\boldsymbol{y}_0$
using a tree-like committee machine (committee tree) or a tree-like parity
machine (parity tree). Based on these architectures, several schemes featuring
monotonic or non-monotonic units are introduced. The codeword $\boldsymbol{y}_0$ is
then transmitted via a Binary Asymmetric Channel (BAC) where it is corrupted by
noise. The analytical performance of these schemes is investigated using the
replica method of statistical mechanics. Under some specific conditions, some
of the proposed schemes are shown to saturate the Shannon bound at the infinite
codeword length limit. The influence of the monotonicity of the units on the
performance is also discussed.
Comment: 23 pages, 3 figures. Content has been extended and revised.
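The encoding and channel steps described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes ±1 symbols, a message split into `k` equal blocks (one per hidden unit), fixed ±1 input vectors shared by sender and receiver, and monotonic sign units only. The function names and the flip-probability parameters `p_plus_to_minus` and `p_minus_to_plus` are hypothetical.

```python
import random


def sign(x):
    # Monotonic unit; a zero field is mapped to +1 (tie-breaking assumption).
    return 1 if x >= 0 else -1


def hidden_signs(message, x, k):
    # Each hidden unit sees only its own disjoint block of the message (tree structure).
    block = len(message) // k
    return [
        sign(sum(message[i] * x[i] for i in range(l * block, (l + 1) * block)))
        for l in range(k)
    ]


def parity_tree_encode(message, inputs, k):
    # One codeword symbol per input vector: the product of the hidden-unit signs.
    codeword = []
    for x in inputs:
        y = 1
        for s in hidden_signs(message, x, k):
            y *= s
        codeword.append(y)
    return codeword


def committee_tree_encode(message, inputs, k):
    # One codeword symbol per input vector: the majority vote of the hidden units.
    return [sign(sum(hidden_signs(message, x, k))) for x in inputs]


def binary_asymmetric_channel(codeword, p_plus_to_minus, p_minus_to_plus, rng=None):
    # Flip each symbol with a probability that depends on the transmitted value.
    rng = rng if rng is not None else random.Random()
    received = []
    for y in codeword:
        p = p_plus_to_minus if y == 1 else p_minus_to_plus
        received.append(-y if rng.random() < p else y)
    return received
```

Setting both flip probabilities to the same value recovers the binary symmetric channel as a special case.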
Compressing Deep Neural Networks via Knowledge Distillation
There has been a continuous evolution in deep neural network architectures since Alex Krizhevsky proposed AlexNet in 2012. Part of this has been due to the increased complexity of the data and the easier availability of datasets, and part of it has been due to the increased complexity of applications. These two factors form a self-sustaining cycle and have thereby pushed the boundaries of deep learning to new domains in recent years.
Many datasets have been proposed for different tasks. In computer vision, notable datasets like ImageNet, CIFAR-10, CIFAR-100 and MS-COCO provide large amounts of training data for tasks like classification, segmentation and object localization. Interdisciplinary datasets like the Visual Genome Dataset connect computer vision to tasks like natural language processing. All of these have fuelled the advent of architectures like AlexNet, VGG-Net and ResNet, which achieve better predictive performance on these datasets. In object detection, networks like YOLO, SSD and Faster-RCNN have made great strides in achieving state-of-the-art performance.
However, amidst the growth of neural networks, one aspect that has been neglected is the problem of deploying them on devices which can support the computational and memory requirements of Deep Neural Networks (DNNs). Modern technology is only as good as the number of platforms it can support. Many applications like face detection, person classification and pedestrian detection require real-time execution, with devices mounted on cameras. These devices are low-powered and do not have the computational resources to run the data through a DNN and get instantaneous results. A natural solution to this problem is to make the DNN smaller through compression. However, unlike file compression, DNN compression has the goal of not significantly impacting the overall accuracy of the network.
In this thesis we consider the problem of model compression and present our end-to-end training algorithm for training a smaller model under the influence of a collection of expert models. The smaller model can then be deployed on resource-constrained hardware independently of the expert models. We call this approach a form of compression since, by deploying a smaller model, we save the memory which would have been consumed by one or more expert models. We additionally introduce memory-efficient architectures, built on key ideas from the literature, that occupy very little memory, and show the results of training them using our approach.
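The abstract does not spell out the training objective, so the following is only a sketch of how a small model might be trained under the influence of several expert models. It assumes the standard temperature-softened distillation loss with the experts' soft predictions averaged, which may differ from the thesis's actual algorithm. The names `ensemble_soft_targets`, `distillation_loss`, `temperature` and `alpha` are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    # Numerically stable softmax over temperature-scaled logits.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(teacher_logits, temperature):
    # Average the softened class probabilities of all expert (teacher) models.
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    # Soft term: cross-entropy between averaged teacher targets and the student.
    targets = ensemble_soft_targets(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft_ce = -sum(t * math.log(max(s, 1e-12))
                   for t, s in zip(targets, student_soft))
    # Hard term: ordinary cross-entropy against the ground-truth label.
    hard_ce = -math.log(max(softmax(student_logits)[label], 1e-12))
    # The temperature**2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * soft_ce + (1 - alpha) * hard_ce
```

Only the student is needed at inference time, which is what makes the memory saving possible: the experts appear in the loss during training and nowhere else.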
Machine learning in spectral domain
Deep neural networks are usually trained in the space of the nodes, by
adjusting the weights of existing links via suitable optimization protocols. We
here propose a radically new approach which anchors the learning process to
reciprocal space. Specifically, the training acts on the spectral domain and
seeks to modify the eigenvectors and eigenvalues of transfer operators in
direct space. The proposed method is flexible and can be tailored to return
either linear or nonlinear classifiers. The performance is competitive with
standard schemes, while allowing for a significant reduction of the learning
parameter space. Spectral learning restricted to eigenvalues could also be
employed for pre-training of the deep neural network, in conjunction with
conventional machine-learning schemes. Further, it is surmised that the nested
indentation of eigenvectors that defines the core idea of spectral learning
could help explain why deep networks work as well as they do.
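To make the idea of training on eigenvalues and eigenvectors concrete, here is a minimal sketch for a single linear transfer between two layers. It assumes, purely for illustration, that the transfer operator is written as A = Φ Λ Φ⁻¹ with Φ equal to the identity plus a lower-left block `phi` coupling input nodes to output nodes. Under that assumption the inter-layer weights are w[i][j] = (lam_in[j] - lam_out[i]) * phi[i][j]. The block structure and all names below are assumptions, not taken from the abstract.

```python
def spectral_weights(phi, lam_in, lam_out):
    # Weights from input node j to output node i, expressed through the
    # eigenvalues (lam_in, lam_out) and the eigenvector entries phi[i][j].
    return [
        [(lam_in[j] - lam_out[i]) * phi[i][j] for j in range(len(lam_in))]
        for i in range(len(lam_out))
    ]


def matmul(a, b):
    # Plain dense matrix product on lists of lists.
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transfer_operator(phi, lam_in, lam_out):
    # Build A = Phi * Lambda * Phi^{-1} explicitly to check the closed form above.
    n_in, n_out = len(lam_in), len(lam_out)
    n = n_in + n_out
    big_phi = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    big_phi_inv = [row[:] for row in big_phi]
    for i in range(n_out):
        for j in range(n_in):
            big_phi[n_in + i][j] = phi[i][j]
            # For this block form the inverse only flips the sign of the block.
            big_phi_inv[n_in + i][j] = -phi[i][j]
    lam = list(lam_in) + list(lam_out)
    big_lam = [[lam[r] if r == c else 0.0 for c in range(n)] for r in range(n)]
    return matmul(matmul(big_phi, big_lam), big_phi_inv)
```

In this parametrization, freezing `phi` and training only `lam_in` and `lam_out` leaves far fewer free parameters than training every weight directly, which is one way to read the reduction of the parameter space mentioned above.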
Artificial neural networks in geospatial analysis
Artificial neural networks are computational models widely used in geospatial analysis for data classification, change detection, clustering, function approximation, and forecasting or prediction. There are many types of neural networks based on learning paradigm and network architectures. Their use is expected to grow with increasing availability of massive data from remote sensing and mobile platforms