179 research outputs found
Tensor Decomposition for Model Reduction in Neural Networks: A Review
Modern neural networks have revolutionized the fields of computer vision (CV)
and natural language processing (NLP). They are widely used for solving complex
CV tasks and NLP tasks such as image classification, image generation, and
machine translation. Most state-of-the-art neural networks are
over-parameterized and incur a high computational cost. One straightforward
solution is to replace the layers of the networks with their low-rank tensor
approximations using different tensor decomposition methods. This paper reviews
six tensor decomposition methods and illustrates their ability to compress
model parameters of convolutional neural networks (CNNs), recurrent neural
networks (RNNs) and Transformers. The accuracy of some compressed models can be
higher than that of the original versions. Evaluations indicate that tensor
decompositions can achieve significant reductions in model size, run-time and
energy consumption, and are well suited for implementing neural networks on
edge devices.
Comment: IEEE Circuits and Systems Magazine, 202
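As a rough illustration of replacing a layer with a low-rank approximation, the sketch below factorizes a single dense weight matrix with a truncated SVD. This is only the simplest matrix case and is not one of the six methods the review covers; the layer sizes, the rank, and the helper name `low_rank_factorize` are assumptions made for the example.

```python
import numpy as np

def low_rank_factorize(weight, rank):
    """Split an (out, in) weight matrix into two thin factors via truncated SVD."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (out, rank), singular values folded in
    b = vt[:rank, :]            # shape (rank, in)
    return a, b

rng = np.random.default_rng(0)
# Hypothetical layer: 256 outputs, 512 inputs, constructed to be close to rank 16.
weight = rng.standard_normal((256, 16)) @ rng.standard_normal((16, 512))
weight += 0.01 * rng.standard_normal(weight.shape)

a, b = low_rank_factorize(weight, rank=16)

original_params = weight.size        # 256 * 512
compressed_params = a.size + b.size  # 256 * 16 + 16 * 512
rel_error = np.linalg.norm(weight - a @ b) / np.linalg.norm(weight)

print(original_params, compressed_params, rel_error)
```

In this toy setting the two factors hold roughly a tenth of the original parameters. The small reconstruction error follows from the matrix being built to be nearly low rank; trained weights may not compress this well.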
Enhancing Deep Learning Models through Tensorization: A Comprehensive Survey and Framework
The burgeoning growth of public domain data and the increasing complexity of
deep learning model architectures have underscored the need for more efficient
data representation and analysis techniques. This paper is motivated by the
work of Helal (2023) and aims to present a comprehensive overview of
tensorization. This transformative approach bridges the gap between the
inherently multidimensional nature of data and the simplified 2-dimensional
matrices commonly used in linear algebra-based machine learning algorithms.
This paper explores the steps involved in tensorization, multidimensional data
sources, various multiway analysis methods employed, and the benefits of these
approaches. A small example of Blind Source Separation (BSS) is presented
comparing 2-dimensional algorithms and a multiway algorithm in Python. Results
indicate that multiway analysis is more expressive. Contrary to the intuition
of the dimensionality curse, utilising multidimensional datasets in their
native form and applying multiway analysis methods grounded in multilinear
algebra reveal a profound capacity to capture intricate interrelationships
among various dimensions while, surprisingly, reducing the number of model
parameters and accelerating processing. A survey of multiway analysis
methods and their integration with various deep neural network models is
presented using case studies in different application domains.
Comment: 34 pages, 8 figures, 4 tables
Low Rank Optimization for Efficient Deep Learning: Making A Balance between Compact Architecture and Fast Training
Deep neural networks have achieved great success in many data processing
applications. However, the high computational complexity and storage cost make
deep learning hard to use on resource-constrained devices, and its high power
consumption is not environmentally friendly. In this paper, we focus on
low-rank optimization for efficient deep learning techniques. In the space
domain, deep neural networks are compressed by low rank approximation of the
network parameters, which directly reduces the storage requirement with a
smaller number of network parameters. In the time domain, the network
parameters can be trained in a few subspaces, which enables efficient training
for fast convergence. Model compression in the space domain is grouped into
three categories: pre-train, pre-set, and compression-aware methods. Together
with a series of techniques that can be integrated, such as sparse pruning,
quantization, and entropy coding, these methods can be combined in an
integrated framework with lower computational complexity and storage cost.
Besides summarizing recent technical advances, we report two findings to
motivate future work: one is that the effective rank outperforms other sparsity
measures for network compression; the other is a spatial and temporal balance
for tensorized neural networks.
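For readers unfamiliar with the term, the sketch below computes one common definition of effective rank: the exponential of the Shannon entropy of the normalized singular values. Whether the paper uses exactly this definition is an assumption; the function name `effective_rank` and the test matrices are illustrative only.

```python
import numpy as np

def effective_rank(matrix, eps=1e-12):
    """Entropy-based effective rank (assumed definition, see lead-in)."""
    s = np.linalg.svd(matrix, compute_uv=False)
    p = s / (s.sum() + eps)  # normalize singular values to a distribution
    p = p[p > eps]           # drop numerically zero entries before taking logs
    entropy = -(p * np.log(p)).sum()
    return float(np.exp(entropy))

# All singular values equal: effective rank is close to the full rank of 8.
full = effective_rank(np.eye(8))

# A single dominant direction: effective rank is close to 1.
single = effective_rank(np.outer(np.arange(1, 9), np.ones(8)))

print(full, single)
```

Unlike the exact matrix rank, this measure varies smoothly with the singular values, which is one reason it can serve as a measure of how compressible a weight matrix is.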
- …