2,797 research outputs found

    Error correcting code using tree-like multilayer perceptron

    An error-correcting code using a tree-like multilayer perceptron is proposed. An original message \boldsymbol{s}^0 is encoded into a codeword \boldsymbol{y}_0 using a tree-like committee machine (committee tree) or a tree-like parity machine (parity tree). Based on these architectures, several schemes featuring monotonic or non-monotonic units are introduced. The codeword \boldsymbol{y}_0 is then transmitted through a binary asymmetric channel (BAC), where it is corrupted by noise. The analytical performance of these schemes is investigated using the replica method of statistical mechanics. Under specific conditions, some of the proposed schemes are shown to saturate the Shannon bound in the infinite-codeword-length limit. The influence of the monotonicity of the units on performance is also discussed.

    Comment: 23 pages, 3 figures. Content has been extended and revised.
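    To make the encoding step concrete, here is a minimal sketch, not the paper's exact construction: a parity tree with K hidden branches acting on disjoint subsets of the message, one fixed random tree per codeword bit, followed by corruption through the BAC. All sizes, weights, and channel parameters (p, q) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def parity_tree_encode(s, W):
    """Encode a +/-1 message s of length N into one codeword bit per
    weight set in W using a tree-like parity machine: the inputs are
    split into K disjoint branches, each hidden unit sees only its own
    branch, and the output bit is the parity (product) of the hidden
    units' signs."""
    M, K, L = W.shape                       # M trees, K branches, L inputs per branch
    branches = s.reshape(K, L)              # disjoint receptive fields
    hidden = np.sign(np.einsum('mkl,kl->mk', W, branches))
    return np.prod(hidden, axis=1).astype(int)

# illustrative sizes: N = 12 message bits, K = 3 branches, M = 24 codeword bits
N, K, M = 12, 3, 24
s0 = rng.choice([-1, 1], size=N)            # original message s^0
W = rng.standard_normal((M, K, N // K))     # fixed random tree weights
y0 = parity_tree_encode(s0, W)              # codeword y_0

# binary asymmetric channel: +1 flips with prob p, -1 flips with prob q
p, q = 0.10, 0.02
flips = np.where(y0 == 1, rng.random(M) < p, rng.random(M) < q)
y_noisy = np.where(flips, -y0, y0)
print("bits corrupted by the BAC:", int((y_noisy != y0).sum()))
```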

    Compressing Deep Neural Networks via Knowledge Distillation

    Deep neural network architectures have evolved continuously since Alex Krizhevsky proposed AlexNet in 2012. This has been driven partly by the increased complexity of data and the easier availability of datasets, and partly by the increased complexity of applications. These two factors form a self-sustaining cycle and have thereby pushed the boundaries of deep learning into new domains in recent years. Many datasets have been proposed for different tasks. In computer vision, notable datasets like ImageNet, CIFAR-10/100, and MS-COCO provide large training data for tasks like classification, segmentation, and object localization. Interdisciplinary datasets like the Visual Genome Dataset connect computer vision to tasks like natural language processing. All of these have fuelled the advent of architectures like AlexNet, VGG-Net, and ResNet that achieve better predictive performance on these datasets. In object detection, networks like YOLO, SSD, and Faster-RCNN have made great strides toward state-of-the-art performance.

    However, amid this growth one aspect has been neglected: deploying neural networks on devices that can support the computational and memory requirements of Deep Neural Networks (DNNs). Modern technology is only as good as the number of platforms it can support. Many applications, such as face detection, person classification, and pedestrian detection, require real-time execution on devices mounted on cameras. These devices are low-powered and lack the computational resources to run data through a DNN and get instantaneous results. A natural solution to this problem is to make the DNN smaller through compression. Unlike file compression, however, DNN compression has the goal of not significantly impacting the overall accuracy of the network.

    In this thesis we consider the problem of model compression and present our end-to-end training algorithm for training a smaller model under the influence of a collection of expert models. The smaller model can then be deployed on resource-constrained hardware independently of the expert models. We call this approach a form of compression, since deploying a smaller model saves the memory that would otherwise be consumed by one or more expert models. We additionally introduce memory-efficient architectures, building on key ideas from the literature, that occupy very little memory, and we show the results of training them using our approach.
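    The abstract does not spell out the training objective, but a minimal PyTorch sketch of the standard distillation recipe it builds on might look as follows; `student`, `experts`, `loader`, the temperature `T`, and the mixing weight `alpha` are illustrative assumptions, not the thesis's actual settings.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, expert_logits_list, labels, T=4.0, alpha=0.7):
    """Hard-label cross-entropy mixed with a soft-target term that pulls
    the student's temperature-softened distribution toward the averaged
    distribution of the frozen expert ensemble."""
    soft_targets = torch.stack(
        [F.softmax(t / T, dim=1) for t in expert_logits_list]
    ).mean(dim=0)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        soft_targets,
        reduction="batchmean",
    ) * (T * T)                             # T^2 restores the gradient scale
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# one hypothetical training step (student, experts, loader, optimizer assumed):
# for x, y in loader:
#     with torch.no_grad():                 # experts stay frozen
#         expert_logits = [e(x) for e in experts]
#     loss = distillation_loss(student(x), expert_logits, y)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
```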

    Machine learning in spectral domain

    Deep neural networks are usually trained in the space of the nodes, by adjusting the weights of existing links via suitable optimization protocols. We propose here a radically new approach that anchors the learning process to reciprocal space. Specifically, training acts on the spectral domain and seeks to modify the eigenvectors and eigenvalues of transfer operators in direct space. The proposed method is ductile and can be tailored to return either linear or nonlinear classifiers. Its performance is competitive with standard schemes, while allowing for a significant reduction of the learning parameter space. Spectral learning restricted to the eigenvalues could also be employed for pre-training deep neural networks, in conjunction with conventional machine-learning schemes. Further, it is surmised that the nested indentation of eigenvectors, the core idea of spectral learning, could help explain why deep networks work as well as they do.
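    As a rough illustration of eigenvalue-only spectral learning (a sketch under assumptions, not the authors' implementation): a square linear layer can be parameterized as W = Phi diag(lam) Phi^{-1}, with the basis Phi frozen and only the eigenvalues trained. The class name, layer sizes, and the choice of a random orthonormal basis are assumptions for illustration.

```python
import torch
import torch.nn as nn

class SpectralLinear(nn.Module):
    """Square linear layer parameterized in the spectral domain:
    W = Phi diag(lam) Phi^{-1}. The eigenvector basis Phi is kept
    fixed (random, orthonormal) and only the n eigenvalues lam are
    trained, shrinking the trainable parameters from n^2 to n."""
    def __init__(self, n, train_eigenvectors=False):
        super().__init__()
        phi, _ = torch.linalg.qr(torch.randn(n, n))   # well-conditioned random basis
        self.phi = nn.Parameter(phi, requires_grad=train_eigenvectors)
        self.lam = nn.Parameter(torch.randn(n))       # trainable eigenvalues
    def forward(self, x):
        w = self.phi @ torch.diag(self.lam) @ torch.linalg.inv(self.phi)
        return x @ w.T

# eigenvalue-only training can serve as a cheap pre-training stage; passing
# train_eigenvectors=True afterwards recovers the full parameter space
model = nn.Sequential(SpectralLinear(64), nn.ReLU(), nn.Linear(64, 10))
```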

    Artificial neural networks in geospatial analysis

    Artificial neural networks are computational models widely used in geospatial analysis for data classification, change detection, clustering, function approximation, and forecasting or prediction. There are many types of neural networks, distinguished by learning paradigm and network architecture. Their use is expected to grow with the increasing availability of massive data from remote sensing and mobile platforms.
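    As a hedged illustration of the classification use case, the sketch below trains a small multilayer perceptron on synthetic per-pixel "spectral band" features standing in for remote-sensing data; the feature model, class structure, and network sizes are invented for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# synthetic stand-in for per-pixel spectra: 6 bands, 3 land-cover classes;
# real work would use bands extracted from e.g. Landsat or Sentinel imagery
y = rng.integers(0, 3, size=3000)
X = rng.normal(size=(3000, 6)) + 0.8 * y[:, None]   # class-dependent mean shift

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```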