
    New Cellular Methods of Matrix Multiplication

    No full text
    The paper proposes two new cellular methods of matrix multiplication, which allow obtaining cellular analogs of the well-known matrix multiplication algorithms with reduced computational complexity, as compared with the analogs derived on the basis of the well-known cellular methods of matrix multiplication. The new fast cellular method reduces by 15% the multiplicative, additive, and overall complexities of the mentioned algorithms. The new mixed cellular method combines the Laderman method with the proposed fast cellular method. The interaction of these methods reduces by 28% the multiplicative, additive, and overall complexities of the matrix multiplication algorithms. The computational complexity of these methods is estimated using the example of obtaining cellular analogs of the traditional matrix multiplication algorithm.
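The 15% and 28% reductions above are relative to the operation counts of the traditional algorithm. As a point of reference, those baseline counts can be tabulated as follows (a minimal sketch; the function name is illustrative, not from the paper):

```python
# Operation counts for the traditional (schoolbook) n x n matrix
# multiplication: n^3 multiplications and n^2 * (n - 1) additions.
# These are the baseline figures that cellular analogs aim to reduce.

def traditional_counts(n: int) -> tuple[int, int, int]:
    """Return (multiplicative, additive, overall) complexity."""
    mults = n ** 3
    adds = n ** 2 * (n - 1)
    return mults, adds, mults + adds

for n in (2, 3, 8):
    m, a, total = traditional_counts(n)
    print(f"n={n}: {m} mults, {a} adds, {total} total")
```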

    Recovery from Linear Measurements with Complexity-Matching Universal Signal Estimation

    Full text link
    We study the compressed sensing (CS) signal estimation problem where an input signal is measured via a linear matrix multiplication under additive noise. While this setup usually assumes sparsity or compressibility in the input signal during recovery, the signal structure that can be leveraged is often not known a priori. In this paper, we consider universal CS recovery, where the statistics of a stationary ergodic signal source are estimated simultaneously with the signal itself. Inspired by Kolmogorov complexity and minimum description length, we focus on a maximum a posteriori (MAP) estimation framework that leverages universal priors to match the complexity of the source. Our framework can also be applied to general linear inverse problems where more measurements than in CS might be needed. We provide theoretical results that support the algorithmic feasibility of universal MAP estimation using a Markov chain Monte Carlo implementation, which is computationally challenging. We incorporate some techniques to accelerate the algorithm while providing comparable and in many cases better reconstruction quality than existing algorithms. Experimental results show the promise of universality in CS, particularly for low-complexity sources that do not exhibit standard sparsity or compressibility. (Comment: 29 pages, 8 figures)
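The measurement model the abstract describes can be sketched with NumPy. This is only an illustration of "an input signal measured via a linear matrix multiplication under additive noise"; the variable names and dimensions are assumptions, not the paper's notation:

```python
import numpy as np

# Sketch of the CS measurement model: a length-n input signal x is
# observed through y = A @ x + noise, with m < n measurements.

rng = np.random.default_rng(0)
n, m = 256, 64                    # signal length, number of measurements
x = np.zeros(n)
idx = rng.choice(n, size=8, replace=False)
x[idx] = rng.standard_normal(8)   # a sparse source, for illustration
A = rng.standard_normal((m, n)) / np.sqrt(m)  # random measurement matrix
y = A @ x + 0.01 * rng.standard_normal(m)     # linear measurements + additive noise
print(y.shape)
```

Universal recovery then estimates x from (y, A) without assuming the sparsity structure used above to generate the source.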

    SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications

    Full text link
    Self-attention has become a de facto choice for capturing global context in various vision applications. However, its quadratic computational complexity with respect to image resolution limits its use in real-time applications, especially for deployment on resource-constrained mobile devices. Although hybrid approaches have been proposed to combine the advantages of convolutions and self-attention for a better speed-accuracy trade-off, the expensive matrix multiplication operations in self-attention remain a bottleneck. In this work, we introduce a novel efficient additive attention mechanism that effectively replaces the quadratic matrix multiplication operations with linear element-wise multiplications. Our design shows that the key-value interaction can be replaced with a linear layer without sacrificing any accuracy. Unlike previous state-of-the-art methods, our efficient formulation of self-attention enables its usage at all stages of the network. Using our proposed efficient additive attention, we build a series of models called "SwiftFormer" which achieves state-of-the-art performance in terms of both accuracy and mobile inference speed. Our small variant achieves 78.5% top-1 ImageNet-1K accuracy with only 0.8 ms latency on iPhone 14, which is more accurate and 2x faster compared to MobileViT-v2. Code: https://github.com/Amshaker/SwiftFormer (Comment: Technical report)
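The core idea of replacing quadratic attention with linear element-wise operations can be sketched as follows. This is a simplified additive-attention-style illustration under assumed shapes and a made-up scoring vector `w`, not the exact SwiftFormer layer (see the linked repository for that):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Global context is pooled with a scoring vector, then mixed into the
# keys element-wise, so the cost is O(n * d) rather than the O(n^2 * d)
# of full self-attention over n tokens with d channels.

rng = np.random.default_rng(0)
n, d = 196, 64                       # tokens, channels (illustrative)
q = rng.standard_normal((n, d))      # queries
k = rng.standard_normal((n, d))      # keys
w = rng.standard_normal(d)           # learned query-scoring vector (assumed)

alpha = softmax(q @ w / np.sqrt(d))  # per-token weights, shape (n,)
g = alpha @ q                        # global query vector, shape (d,)
out = k * g                          # element-wise interaction: O(n * d)
print(out.shape)
```

No n-by-n attention matrix is ever formed, which is the source of the linear complexity.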

    Time for dithering: fast and quantized random embeddings via the restricted isometry property

    Full text link
    Recently, many works have focused on the characterization of non-linear dimensionality reduction methods obtained by quantizing linear embeddings, e.g., to reach fast processing time, efficient data compression procedures, novel geometry-preserving embeddings or to estimate the information/bits stored in this reduced data representation. In this work, we prove that many linear maps known to respect the restricted isometry property (RIP) can induce a quantized random embedding with controllable multiplicative and additive distortions with respect to the pairwise distances of the data points being considered. In other words, linear matrices having fast matrix-vector multiplication algorithms (e.g., based on partial Fourier ensembles or on the adjacency matrix of unbalanced expanders) can be readily used in the definition of fast quantized embeddings with small distortions. This implication is made possible by applying right after the linear map an additive and random "dither" that stabilizes the impact of the uniform scalar quantization operator applied afterwards. For different categories of RIP matrices, i.e., for different linear embeddings of a metric space (\mathcal K \subset \mathbb R^n, \ell_q) in (\mathbb R^m, \ell_p) with p, q \geq 1, we derive upper bounds on the additive distortion induced by quantization, showing that it decays either when the embedding dimension m increases or when the distance of a pair of embedded vectors in \mathcal K decreases. Finally, we develop a novel "bi-dithered" quantization scheme, which allows for a reduced distortion that decreases when the embedding dimension grows and independently of the considered pair of vectors. (Comment: Keywords: random projections, non-linear embeddings, quantization, dither, restricted isometry property, dimensionality reduction, compressive sensing, low-complexity signal models, fast and structured sensing matrices, quantized rank-one projections; 31 pages)
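The pipeline the abstract describes — linear map, then additive random dither, then uniform scalar quantization — can be sketched directly. The step size, dimensions, and the choice of a Gaussian map are illustrative assumptions (the paper covers broader RIP families such as partial Fourier ensembles):

```python
import numpy as np

# Sketch of a dithered quantized embedding:
#   x  ->  Q(A @ x + xi),  Q(t) = delta * floor(t / delta)
# The random dither xi ~ Uniform[0, delta) stabilizes the effect of the
# uniform scalar quantizer applied after the linear map.

rng = np.random.default_rng(0)
n, m = 128, 512                  # ambient and embedding dimensions (assumed)
delta = 0.5                      # quantization step (illustrative value)
A = rng.standard_normal((m, n)) / np.sqrt(m)  # random map, RIP w.h.p.
xi = rng.uniform(0, delta, size=m)            # random dither

def quantized_embedding(x):
    return delta * np.floor((A @ x + xi) / delta)

x = rng.standard_normal(n)
print(quantized_embedding(x).shape)
```

Pairwise distances between such quantized embeddings then approximate the original distances up to the multiplicative and additive distortions bounded in the paper.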

    New Fast Hybrid Matrix Multiplication Algorithms

    No full text
    New hybrid algorithms are proposed for multiplying (n x n)-matrices. They are based on Laderman's algorithm for multiplying (3 x 3)-matrices. As compared with the well-known hybrid matrix multiplication algorithms, the new algorithms are characterized by minimized computational complexity. The multiplicative, additive, and overall complexities of the presented algorithms are estimated.
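Laderman's algorithm multiplies (3 x 3)-matrices with 23 scalar multiplications instead of the schoolbook 27, and applying it recursively to n = 3^k gives an O(n^log3(23)) multiplication count. The comparison below is a sketch of that standard recursion argument, not of the paper's specific hybrid constructions:

```python
import math

# Pure recursive use of a 3x3 base algorithm with 23 multiplications:
# M(3^k) = 23^k, versus (3^k)^3 = 27^k for the schoolbook method.

def recursive_mults(k: int, base_mults: int = 23) -> int:
    """Multiplications used on (3^k x 3^k)-matrices by the recursion."""
    return base_mults ** k

for k in (1, 2, 3):
    n = 3 ** k
    print(f"n={n}: Laderman-recursive {recursive_mults(k)} vs schoolbook {n**3}")

print(f"exponent: {math.log(23, 3):.3f}")   # below the schoolbook exponent 3
```

Hybrid schemes mix such a fast base case with other algorithms to trim the additive cost as well.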

    Composite Cyclotomic Fourier Transforms with Reduced Complexities

    Full text link
    Discrete Fourier transforms (DFTs) over finite fields have widespread applications in digital communication and storage systems. Hence, reducing the computational complexities of DFTs is of great significance. Recently proposed cyclotomic fast Fourier transforms (CFFTs) are promising due to their low multiplicative complexities. Unfortunately, there are two issues with CFFTs: (1) they rely on efficient short cyclic convolution algorithms, which have not been investigated thoroughly yet, and (2) they have very high additive complexities when directly implemented. In this paper, we address both issues. One of the main contributions of this paper is efficient bilinear 11-point cyclic convolution algorithms, which allow us to construct CFFTs over GF(2^{11}). The other main contribution of this paper is that we propose composite cyclotomic Fourier transforms (CCFTs). In comparison to previously proposed fast Fourier transforms, our CCFTs achieve lower overall complexities for moderate to long lengths, and the improvement significantly increases as the length grows. Our 2047-point and 4095-point CCFTs are also the first efficient DFTs of such lengths to the best of our knowledge. Finally, our CCFTs are also advantageous for hardware implementations due to their regular and modular structure. (Comment: submitted to IEEE Transactions on Signal Processing)
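CFFTs hinge on short cyclic convolution algorithms. As a reference point for what the bilinear algorithms improve on, the naive n-point cyclic convolution below uses n^2 multiplications (a generic sketch over the integers; the paper works over finite fields):

```python
# Naive n-point cyclic convolution: c[k] = sum_i a[i] * b[(k - i) mod n].
# Efficient bilinear algorithms (like the 11-point ones in the paper)
# achieve the same result with far fewer multiplications.

def cyclic_convolution(a, b):
    n = len(a)
    return [sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)]

print(cyclic_convolution([1, 2, 3], [4, 5, 6]))   # [31, 31, 28]
```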

    On fast multiplication of a matrix by its transpose

    Get PDF
    We present a non-commutative algorithm for the multiplication of a 2x2-block-matrix by its transpose using 5 block products (3 recursive calls and 2 general products) over C or any finite field. We use geometric considerations on the space of bilinear forms describing 2x2 matrix products to obtain this algorithm, and we show how to reduce the number of involved additions. The resulting algorithm for arbitrary dimensions is a reduction of multiplication of a matrix by its transpose to general matrix product, improving by a constant factor previously known reductions. Finally, we propose schedules with low memory footprint that support a fast and memory-efficient practical implementation over a finite field. To conclude, we show how to use our result in LDL^T factorization. (Comment: ISSAC 2020, Jul 2020, Kalamata, Greece)
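For context, the straightforward 2x2-block computation of C = A A^T already exploits the symmetry of the output (C12 = C21^T), but uses 6 block products. The paper's non-commutative scheme brings this down to 5 (3 recursive calls and 2 general products); the formulas below are the naive baseline, not the paper's optimized scheme:

```python
import numpy as np

# Baseline block computation of C = A @ A.T with a 2x2 block partition.
# Only three of the four output blocks are formed, since C12 = C21.T.

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]

C11 = A11 @ A11.T + A12 @ A12.T   # symmetric block (same shape as the recursion)
C21 = A21 @ A11.T + A22 @ A12.T   # general block products
C22 = A21 @ A21.T + A22 @ A22.T   # symmetric block

C = np.block([[C11, C21.T], [C21, C22]])
print(np.allclose(C, A @ A.T))    # True
```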