From Maxout to Channel-Out: Encoding Information on Sparse Pathways
Motivated by an important insight from neural science, we propose a new
framework for understanding the success of the recently proposed "maxout"
networks. The framework is based on encoding information on sparse pathways and
recognizing the correct pathway at inference time. Elaborating further on this
insight, we propose a novel deep network architecture, called "channel-out"
network, which takes much better advantage of sparse pathway encoding. In
channel-out networks, pathways are not only formed a posteriori, but they are
also actively selected according to the inference outputs from the lower
layers. From a mathematical perspective, channel-out networks can represent a
wider class of piece-wise continuous functions, thereby endowing the network
with more expressive power than that of maxout networks. We test our
channel-out networks on several well-known image classification benchmarks,
setting new state-of-the-art performance on CIFAR-100 and STL-10, which
represent some of the "harder" image classification benchmarks. Comment: 10 pages including the appendix, 9 figures
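The difference between the two unit types described in this abstract can be illustrated with a minimal sketch. It assumes the common reading of the two architectures: a maxout group outputs only its maximum, while a channel-out group keeps the winning activation in its original position and zeroes the others, so the chosen pathway remains visible to the next layer. The function names and the sample values are hypothetical, and the selection rule (plain argmax) is an assumption rather than the paper's exact formulation.

```python
def maxout(group):
    """Maxout: collapse a group of candidate activations to its maximum."""
    return max(group)


def channel_out(group):
    """Channel-out (as sketched here): keep the winner in place, zero the rest."""
    # Assumption: the selection function is a plain argmax over the group.
    winner = max(range(len(group)), key=lambda i: group[i])
    return [v if i == winner else 0.0 for i, v in enumerate(group)]


group = [0.2, -1.0, 0.7]   # hypothetical pre-activations of one group
print(maxout(group))       # a single value; the winner's index is discarded
print(channel_out(group))  # same width as the input; the winner's position survives
```

Because the channel-out output keeps the group's width, the position of the non-zero entry tells later layers which pathway was selected; the maxout output does not carry that information.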
Improving EEG-based driver fatigue classification using sparse-deep belief networks
© 2017 Chai, Ling, San, Naik, Nguyen, Tran, Craig and Nguyen. This paper presents an improvement in electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states, using data collected from 43 participants. The system employs autoregressive (AR) modeling as the feature extraction algorithm and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi-supervised learning method that combines unsupervised learning for modeling features in the pre-training layer with supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes deviation of the expected activation of hidden units from a fixed low level; this prevents the network from overfitting and lets it learn low-level as well as high-level structures. For comparison, artificial neural network (ANN), Bayesian neural network (BNN), and original deep belief network (DBN) classifiers are used. The classification results show that the AR feature extractor with the DBN classifier achieves a sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94, compared to the ANN classifier (sensitivity of 80.8%, specificity of 77.8%, accuracy of 79.3%, AUROC of 0.83) and the BNN classifier (sensitivity of 84.3%, specificity of 83%, accuracy of 83.6%, AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further, with a sensitivity of 93.9%, a specificity of 92.3%, an accuracy of 93.1%, and an AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5 percentage points over the ANN, BNN, and DBN classifiers, respectively.
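For readers who want to check the reported figures, the following sketch shows how sensitivity, specificity, and accuracy follow from confusion-matrix counts, and recomputes the accuracy gains from the accuracies quoted in the abstract. The confusion counts are hypothetical and chosen only for illustration; the accuracy values are the ones stated above, and the function name is an assumption.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from binary confusion counts."""
    sensitivity = tp / (tp + fn)                # true positive rate
    specificity = tn / (tn + fp)                # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all cases classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts, not taken from the paper.
print(confusion_metrics(tp=45, fn=5, tn=40, fp=10))

# Accuracies (%) as reported in the abstract.
reported_accuracy = {"ANN": 79.3, "BNN": 83.6, "DBN": 90.6, "sparse-DBN": 93.1}

# Gain of sparse-DBN over each baseline, in percentage points.
gains = {
    name: round(reported_accuracy["sparse-DBN"] - acc, 1)
    for name, acc in reported_accuracy.items()
    if name != "sparse-DBN"
}
print(gains)
```

The recomputed gains match the 13.8, 9.5, and 2.5 quoted in the abstract, which confirms that they are absolute differences in accuracy rather than relative improvements.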
A linear approach for sparse coding by a two-layer neural network
Many approaches to transform classification problems from non-linear to
linear by feature transformation have been recently presented in the
literature. These notably include sparse coding methods and deep neural
networks. However, many of these approaches require the repeated application of
a learning process upon the presentation of unseen data input vectors, or else
involve the use of large numbers of parameters and hyper-parameters, which must
be chosen through cross-validation, thus increasing running time dramatically.
In this paper, we propose and experimentally investigate a new approach for the
purpose of overcoming limitations of both kinds. The proposed approach makes
use of a linear auto-associative network (called SCNN) with just one hidden
layer. The combination of this architecture with a specific error function to
be minimized enables one to learn a linear encoder computing a sparse code
which turns out to be as similar as possible to the sparse coding that one
obtains by re-training the neural network. Importantly, the linearity of SCNN
and the choice of the error function allow one to achieve reduced running time
in the learning phase. The proposed architecture is evaluated on the basis of
two standard machine learning tasks. Its performance is compared with that
of recently proposed non-linear auto-associative neural networks. The overall
results suggest that linear encoders can be profitably used to obtain sparse
data representations in the context of machine learning problems, provided that
an appropriate error function is used during the learning phase.
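The practical point of a linear encoder is that coding an unseen input costs one matrix-vector product rather than a fresh optimization. The sketch below illustrates this with a one-hidden-layer linear auto-associative network. The abstract does not state the error function, so the loss used here (squared reconstruction error plus an L1 penalty on the code) is an assumption chosen as a familiar sparse-coding objective, not the SCNN objective. All names and weights are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def encode(w_enc, x):
    """Linear encoder: the code for a new input is a single matrix-vector product."""
    return matvec(w_enc, x)


def sparse_ae_loss(w_enc, w_dec, x, lam):
    """Assumed objective: squared reconstruction error + lam * L1 norm of the code."""
    code = encode(w_enc, x)
    recon = matvec(w_dec, code)
    recon_err = sum((r - xi) ** 2 for r, xi in zip(recon, x))
    l1_penalty = sum(abs(c) for c in code)
    return recon_err + lam * l1_penalty


# Hypothetical weights: 2-dimensional input, 3-dimensional code.
w_enc = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
w_dec = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
x = [2.0, 0.0]

print(encode(w_enc, x))                         # code with a single non-zero entry
print(sparse_ae_loss(w_enc, w_dec, x, lam=0.1))
```

In this toy case the reconstruction is exact, so the loss consists only of the sparsity penalty on the code.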
A Unified Approximation Framework for Compressing and Accelerating Deep Neural Networks
Deep neural networks (DNNs) have achieved significant success in a variety of
real-world applications, e.g., image classification. However, the large number
of parameters in these networks restricts their efficiency due to
the large model size and the intensive computation. To address this issue,
various approximation techniques have been investigated, which seek a
lightweight network with little performance degradation in exchange for smaller
model size or faster inference. Both low-rankness and sparsity are appealing
properties for the network approximation. In this paper we propose a unified
framework to compress the convolutional neural networks (CNNs) by combining
these two properties, while taking the nonlinear activation into consideration.
Each layer in the network is approximated by the sum of a structured sparse
component and a low-rank component, which is formulated as an optimization
problem. Then, an extended version of the alternating direction method of
multipliers (ADMM) with guaranteed convergence is presented to solve the
relaxed optimization problem. Experiments are carried out on VGG-16, AlexNet
and GoogLeNet with large image classification datasets. The results outperform
previous work in terms of accuracy degradation, compression rate and speedup
ratio. The proposed method compresses the model substantially (with up to a
4.9x reduction in parameters) with little or no loss of
accuracy. Comment: 8 pages, 5 figures, 6 tables
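The decomposition described in this abstract, a layer approximated by a sparse component plus a low-rank component, can be sketched as follows. This is only an illustration of the storage argument: it assumes a weight matrix W ≈ S + U Vᵀ with hypothetical values, treats S as an unstructured sparse matrix (the paper uses a structured one), and does not implement the ADMM solver or the nonlinear activation handling. All function names are assumptions.

```python
def low_rank(u, v):
    """Return U @ V^T for U (m x r) and V (n x r), both given as lists of rows."""
    return [[sum(a * b for a, b in zip(u_row, v_row)) for v_row in v] for u_row in u]


def approx_layer(s, u, v):
    """Approximate weights as sparse component S plus low-rank component U V^T."""
    l = low_rank(u, v)
    return [[si + li for si, li in zip(s_row, l_row)] for s_row, l_row in zip(s, l)]


def stored_params(s, u, v):
    """Count stored values: non-zeros of S plus all entries of the factors U and V."""
    nnz = sum(1 for row in s for value in row if value != 0)
    return nnz + sum(len(row) for row in u) + sum(len(row) for row in v)


# Hypothetical 4 x 4 layer: rank-1 component plus two sparse corrections.
u = [[1.0], [2.0], [3.0], [4.0]]
v = [[1.0], [0.0], [1.0], [0.0]]
s = [[0.0, 5.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, -2.0]]

w_hat = approx_layer(s, u, v)
dense_params = len(w_hat) * len(w_hat[0])
print(w_hat[0])
print(stored_params(s, u, v), "stored values versus", dense_params, "dense values")
```

The count ignores the index overhead needed to store the positions of the sparse entries, so real savings depend on the storage format.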