
83,110 research outputs found

Scalable Compression of Deep Neural Networks

Author: Chen W.
Denton E. L.
Han S.
Kim Y.-D.
Krizhevsky A.
Taubman D.
Publication date: 26/08/2016

Deep neural networks generally involve some layers with millions of parameters, making them difficult to deploy and update on devices with limited resources such as mobile phones and other smart embedded systems. In this paper, we propose a scalable representation of the network parameters, so that different applications can select the most suitable bit rate of the network based on their own storage constraints. Moreover, when a device needs to upgrade to a high-rate network, the existing low-rate network can be reused, and only some incremental data need to be downloaded. We first hierarchically quantize the weights of a pre-trained deep neural network to enforce weight sharing. Next, we adaptively select the bits assigned to each layer given the total bit budget. After that, we retrain the network to fine-tune the quantized centroids. Experimental results show that our method can achieve scalable compression with graceful degradation in performance. Comment: 5 pages, 4 figures, ACM Multimedia 201
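The idea of reusing a low-rate network and downloading only incremental data can be illustrated with a minimal sketch. It is not the authors' method: it assumes plain uniform scalar quantization followed by quantization of the residual, and it omits the per-layer bit allocation and the retraining step. The function names and the bit widths are hypothetical choices made for this example.

```python
import numpy as np

def uniform_quantize(x, bits):
    """Map x to 2**bits evenly spaced levels spanning [x.min(), x.max()]."""
    levels = 2 ** bits
    lo, hi = float(x.min()), float(x.max())
    if hi == lo:
        # Degenerate case: every value is identical.
        return np.zeros(x.shape, dtype=np.int64), lo, 0.0
    step = (hi - lo) / (levels - 1)
    idx = np.rint((x - lo) / step).astype(np.int64)
    return idx, lo, step

def dequantize(idx, lo, step):
    """Reconstruct approximate values from level indices."""
    return lo + idx * step

def scalable_encode(weights, base_bits, extra_bits):
    """Return a coarse base layer plus a refinement layer for the residual."""
    base = uniform_quantize(weights, base_bits)
    residual = weights - dequantize(*base)
    refinement = uniform_quantize(residual, extra_bits)
    return base, refinement

def scalable_decode(base, refinement=None):
    """Decode the base layer alone, or the base layer plus the refinement."""
    w = dequantize(*base)
    if refinement is not None:
        w = w + dequantize(*refinement)
    return w

# Synthetic stand-in for one layer's weights (an assumption, not real data).
rng = np.random.default_rng(0)
weights = rng.normal(size=1000)

base, refinement = scalable_encode(weights, base_bits=2, extra_bits=3)
low_rate_error = np.mean((weights - scalable_decode(base)) ** 2)
high_rate_error = np.mean((weights - scalable_decode(base, refinement)) ** 2)
print(low_rate_error, high_rate_error)
```

A device holding only `base` can run the low-rate network; upgrading requires downloading only `refinement`, which is the incremental-data property the abstract describes.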

arXiv.org e-Print Archive

Crossref

Decoding the Encoding of Functional Brain Networks: an fMRI Classification Comparison of Non-negative Matrix Factorization (NMF), Independent Component Analysis (ICA), and Sparse Coding Algorithms

Author: Anderson Ariana E.
Brody Arthur L.
Douglas Pamela K.
Wu Ying Nian
Xie Jianwen
Publication date: 01/07/2016

Brain networks in fMRI are typically identified using spatial independent component analysis (ICA), yet mathematical constraints such as sparse coding and positivity both provide alternate biologically-plausible frameworks for generating brain networks. Non-negative Matrix Factorization (NMF) would suppress negative BOLD signal by enforcing positivity. Spatial sparse coding algorithms (L1 Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking, where the total observed activity in a single voxel originates from a restricted number of possible brain networks. The assumptions of independence, positivity, and sparsity to encode task-related brain networks are compared; the resulting brain networks for different constraints are used as basis functions to encode the observed functional activity at a given time point. These encodings are decoded using machine learning to compare both the algorithms and their assumptions, using the time series weights to predict whether a subject is viewing a video, listening to an audio cue, or at rest, in 304 fMRI scans from 51 subjects. For classifying cognitive activity, the sparse coding algorithm of L1 Regularized Learning consistently outperformed 4 variations of ICA across different numbers of networks and noise levels (p < 0.001). The NMF algorithms, which suppressed negative BOLD signal, had the poorest accuracy. Within each algorithm, encodings using sparser spatial networks (containing more zero-valued voxels) had higher classification accuracy (p < 0.001). The success of sparse coding algorithms may suggest that algorithms which enforce sparse coding, discourage multitasking, and promote local specialization may capture better the underlying source processes than those which allow inexhaustible local processes such as ICA.
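The encoding step, where spatial networks act as basis functions for the activity observed at one time point, can be sketched as follows. This is an illustration under stated assumptions, not the paper's pipeline: the data are synthetic, the networks are made sparse by a hypothetical threshold, and ordinary least squares stands in for whichever encoder the authors used. The resulting weights are the kind of features a classifier would then decode.

```python
import numpy as np

def encode(networks, volume):
    """Least-squares weights w minimizing ||networks @ w - volume||."""
    weights, *_ = np.linalg.lstsq(networks, volume, rcond=None)
    return weights

def spatial_sparsity(networks):
    """Fraction of zero-valued voxels in each network (one per column)."""
    return np.mean(networks == 0, axis=0)

# Synthetic stand-ins (assumptions): 200 voxels, 5 spatial networks.
rng = np.random.default_rng(0)
n_voxels, n_networks = 200, 5
networks = rng.normal(size=(n_voxels, n_networks))
networks[np.abs(networks) < 1.0] = 0.0  # hypothetical thresholding for sparsity

# One observed volume built from known weights, so the encoding can be checked.
true_weights = np.array([1.0, 0.0, -2.0, 0.5, 0.0])
volume = networks @ true_weights

weights = encode(networks, volume)
sparsity = spatial_sparsity(networks)
print(weights, sparsity)
```

Repeating `encode` for every time point yields a time series of weights per network, and `spatial_sparsity` gives the zero-voxel fraction that the abstract relates to classification accuracy.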

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Demystifying the Scaling Laws of Dense Wireless Networks: No Linear Scaling in Practice

Author: Caire Giuseppe
Hong Song-Nam
Publication date: 25/04/2014

We optimize the hierarchical cooperation protocol of Ozgur, Leveque and Tse, which is supposed to yield almost linear scaling of the capacity of a dense wireless network with the number of users $n$. Exploiting recent results on the optimality of "treating interference as noise" in Gaussian interference channels, we are able to optimize the achievable average per-link rate and not just its scaling law. Our optimized hierarchical cooperation protocol significantly outperforms the originally proposed scheme. On the negative side, we show that even for very large $n$, the rate scaling is far from linear, and the optimal number of stages $t$ is less than 4, instead of $t \rightarrow \infty$ as required for almost linear scaling. Combining our results and the fact that, beyond a certain user density, the network capacity is fundamentally limited by Maxwell's laws, as shown by Franceschetti, Migliore and Minero, we argue that there is indeed no intermediate regime of linear scaling for dense networks in practice. Comment: 5 pages, 6 figures, ISIT 2014. arXiv admin note: substantial text overlap with arXiv:1402.181
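To see why a small number of stages falls short of linear scaling, the following worked expression may help. It relies on an assumption not stated in the abstract: the commonly cited result for hierarchical cooperation that $t$ stages give an aggregate throughput growing as $n^{t/(t+1)}$. The symbol $C_t(n)$ is introduced here only for illustration.

```latex
% Assumed scaling law for t-stage hierarchical cooperation (not from the abstract):
C_t(n) = \Theta\!\left(n^{\frac{t}{t+1}}\right)

% Exponents for small t, compared with the linear-scaling limit:
t = 1:\ n^{1/2}, \qquad
t = 2:\ n^{2/3}, \qquad
t = 3:\ n^{3/4}, \qquad
t \rightarrow \infty:\ n^{1}
```

Under this assumption, an optimal $t$ below 4 caps the exponent at $3/4$, which is consistent with the abstract's claim that the rate scaling stays far from linear.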

arXiv.org e-Print Archive

Crossref