
    New Cellular Methods of Matrix Multiplication

    No full text
    The paper proposes two new cellular methods of matrix multiplication, which allow obtaining cellular analogs of the well-known matrix multiplication algorithms with reduced computational complexity, as compared with the analogs derived on the basis of the well-known cellular methods of matrix multiplication. The new fast cellular method reduces by 15% the multiplicative, additive, and overall complexities of the mentioned algorithms. The new mixed cellular method combines the Laderman method with the proposed fast cellular method. The interaction of these methods reduces by 28% the multiplicative, additive, and overall complexities of the matrix multiplication algorithms. The computational complexity of these methods is estimated using the example of obtaining cellular analogs of the traditional matrix multiplication algorithm.
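The 15% and 28% reductions above are relative to the operation counts of the traditional algorithm. As a point of reference, those baseline counts can be tabulated as follows (a minimal sketch; the function name is illustrative, not from the paper):

```python
# Operation counts for the traditional (schoolbook) n x n matrix
# multiplication: n^3 multiplications and n^2 * (n - 1) additions.
# These are the baseline figures that cellular analogs aim to reduce.

def traditional_counts(n: int) -> tuple[int, int, int]:
    """Return (multiplicative, additive, overall) complexity."""
    mults = n ** 3
    adds = n ** 2 * (n - 1)
    return mults, adds, mults + adds

for n in (2, 3, 8):
    m, a, total = traditional_counts(n)
    print(f"n={n}: {m} mults, {a} adds, {total} total")
```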

    Recovery from Linear Measurements with Complexity-Matching Universal Signal Estimation

    Full text link
    We study the compressed sensing (CS) signal estimation problem where an input signal is measured via a linear matrix multiplication under additive noise. While this setup usually assumes sparsity or compressibility in the input signal during recovery, the signal structure that can be leveraged is often not known a priori. In this paper, we consider universal CS recovery, where the statistics of a stationary ergodic signal source are estimated simultaneously with the signal itself. Inspired by Kolmogorov complexity and minimum description length, we focus on a maximum a posteriori (MAP) estimation framework that leverages universal priors to match the complexity of the source. Our framework can also be applied to general linear inverse problems where more measurements than in CS might be needed. We provide theoretical results that support the algorithmic feasibility of universal MAP estimation using a Markov chain Monte Carlo implementation, which is computationally challenging. We incorporate some techniques to accelerate the algorithm while providing comparable and in many cases better reconstruction quality than existing algorithms. Experimental results show the promise of universality in CS, particularly for low-complexity sources that do not exhibit standard sparsity or compressibility. (Comment: 29 pages, 8 figures)
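The measurement model the abstract describes can be sketched with NumPy. This is only an illustration of "an input signal measured via a linear matrix multiplication under additive noise"; the variable names and dimensions are assumptions, not the paper's notation:

```python
import numpy as np

# Sketch of the CS measurement model: a length-n input signal x is
# observed through y = A @ x + noise, with m < n measurements.

rng = np.random.default_rng(0)
n, m = 256, 64                    # signal length, number of measurements
x = np.zeros(n)
idx = rng.choice(n, size=8, replace=False)
x[idx] = rng.standard_normal(8)   # a sparse source, for illustration
A = rng.standard_normal((m, n)) / np.sqrt(m)  # random measurement matrix
y = A @ x + 0.01 * rng.standard_normal(m)     # linear measurements + additive noise
print(y.shape)
```

Universal recovery then estimates x from (y, A) without assuming the sparsity structure used above to generate the source.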

    SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications

    Full text link
    Self-attention has become a de facto choice for capturing global context in various vision applications. However, its quadratic computational complexity with respect to image resolution limits its use in real-time applications, especially for deployment on resource-constrained mobile devices. Although hybrid approaches have been proposed to combine the advantages of convolutions and self-attention for a better speed-accuracy trade-off, the expensive matrix multiplication operations in self-attention remain a bottleneck. In this work, we introduce a novel efficient additive attention mechanism that effectively replaces the quadratic matrix multiplication operations with linear element-wise multiplications. Our design shows that the key-value interaction can be replaced with a linear layer without sacrificing any accuracy. Unlike previous state-of-the-art methods, our efficient formulation of self-attention enables its usage at all stages of the network. Using our proposed efficient additive attention, we build a series of models called "SwiftFormer" which achieves state-of-the-art performance in terms of both accuracy and mobile inference speed. Our small variant achieves 78.5% top-1 ImageNet-1K accuracy with only 0.8 ms latency on iPhone 14, which is more accurate and 2x faster compared to MobileViT-v2. Code: https://github.com/Amshaker/SwiftFormer (Comment: Technical report)
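The core idea of replacing quadratic attention with linear element-wise operations can be sketched as follows. This is a simplified additive-attention-style illustration under assumed shapes and a made-up scoring vector `w`, not the exact SwiftFormer layer (see the linked repository for that):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Global context is pooled with a scoring vector, then mixed into the
# keys element-wise, so the cost is O(n * d) rather than the O(n^2 * d)
# of full self-attention over n tokens with d channels.

rng = np.random.default_rng(0)
n, d = 196, 64                       # tokens, channels (illustrative)
q = rng.standard_normal((n, d))      # queries
k = rng.standard_normal((n, d))      # keys
w = rng.standard_normal(d)           # learned query-scoring vector (assumed)

alpha = softmax(q @ w / np.sqrt(d))  # per-token weights, shape (n,)
g = alpha @ q                        # global query vector, shape (d,)
out = k * g                          # element-wise interaction: O(n * d)
print(out.shape)
```

No n-by-n attention matrix is ever formed, which is the source of the linear complexity.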

    Time for dithering: fast and quantized random embeddings via the restricted isometry property

    Full text link
    Recently, many works have focused on the characterization of non-linear dimensionality reduction methods obtained by quantizing linear embeddings, e.g., to reach fast processing time, efficient data compression procedures, novel geometry-preserving embeddings or to estimate the information/bits stored in this reduced data representation. In this work, we prove that many linear maps known to respect the restricted isometry property (RIP) can induce a quantized random embedding with controllable multiplicative and additive distortions with respect to the pairwise distances of the data points being considered. In other words, linear matrices having fast matrix-vector multiplication algorithms (e.g., based on partial Fourier ensembles or on the adjacency matrix of unbalanced expanders) can be readily used in the definition of fast quantized embeddings with small distortions. This implication is made possible by applying right after the linear map an additive and random "dither" that stabilizes the impact of the uniform scalar quantization operator applied afterwards. For different categories of RIP matrices, i.e., for different linear embeddings of a metric space (\mathcal K \subset \mathbb R^n, \ell_q) in (\mathbb R^m, \ell_p) with p, q \geq 1, we derive upper bounds on the additive distortion induced by quantization, showing that it decays either when the embedding dimension m increases or when the distance of a pair of embedded vectors in \mathcal K decreases. Finally, we develop a novel "bi-dithered" quantization scheme, which allows for a reduced distortion that decreases when the embedding dimension grows and independently of the considered pair of vectors. (Comment: Keywords: random projections, non-linear embeddings, quantization, dither, restricted isometry property, dimensionality reduction, compressive sensing, low-complexity signal models, fast and structured sensing matrices, quantized rank-one projections; 31 pages)
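The pipeline the abstract describes — linear map, then additive random dither, then uniform scalar quantization — can be sketched directly. The step size, dimensions, and the choice of a Gaussian map are illustrative assumptions (the paper covers broader RIP families such as partial Fourier ensembles):

```python
import numpy as np

# Sketch of a dithered quantized embedding:
#   x  ->  Q(A @ x + xi),  Q(t) = delta * floor(t / delta)
# The random dither xi ~ Uniform[0, delta) stabilizes the effect of the
# uniform scalar quantizer applied after the linear map.

rng = np.random.default_rng(0)
n, m = 128, 512                  # ambient and embedding dimensions (assumed)
delta = 0.5                      # quantization step (illustrative value)
A = rng.standard_normal((m, n)) / np.sqrt(m)  # random map, RIP w.h.p.
xi = rng.uniform(0, delta, size=m)            # random dither

def quantized_embedding(x):
    return delta * np.floor((A @ x + xi) / delta)

x = rng.standard_normal(n)
print(quantized_embedding(x).shape)
```

Pairwise distances between such quantized embeddings then approximate the original distances up to the multiplicative and additive distortions bounded in the paper.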

    New Fast Hybrid Matrix Multiplication Algorithms

    No full text
    New hybrid algorithms are proposed for multiplying (n x n)-matrices. They are based on Laderman's algorithm for multiplying (3 x 3)-matrices. As compared with the well-known hybrid matrix multiplication algorithms, the new algorithms are characterized by minimized computational complexity. The multiplicative, additive, and overall complexities of the presented algorithms are estimated.
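Laderman's algorithm multiplies (3 x 3)-matrices with 23 scalar multiplications instead of the schoolbook 27, and applying it recursively to n = 3^k gives an O(n^log3(23)) multiplication count. The comparison below is a sketch of that standard recursion argument, not of the paper's specific hybrid constructions:

```python
import math

# Pure recursive use of a 3x3 base algorithm with 23 multiplications:
# M(3^k) = 23^k, versus (3^k)^3 = 27^k for the schoolbook method.

def recursive_mults(k: int, base_mults: int = 23) -> int:
    """Multiplications used on (3^k x 3^k)-matrices by the recursion."""
    return base_mults ** k

for k in (1, 2, 3):
    n = 3 ** k
    print(f"n={n}: Laderman-recursive {recursive_mults(k)} vs schoolbook {n**3}")

print(f"exponent: {math.log(23, 3):.3f}")   # below the schoolbook exponent 3
```

Hybrid schemes mix such a fast base case with other algorithms to trim the additive cost as well.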

    Composite Cyclotomic Fourier Transforms with Reduced Complexities

    Full text link
    Discrete Fourier transforms (DFTs) over finite fields have widespread applications in digital communication and storage systems. Hence, reducing the computational complexities of DFTs is of great significance. Recently proposed cyclotomic fast Fourier transforms (CFFTs) are promising due to their low multiplicative complexities. Unfortunately, there are two issues with CFFTs: (1) they rely on efficient short cyclic convolution algorithms, which have not been investigated thoroughly yet, and (2) they have very high additive complexities when directly implemented. In this paper, we address both issues. One of the main contributions of this paper is efficient bilinear 11-point cyclic convolution algorithms, which allow us to construct CFFTs over GF(2^{11}). The other main contribution of this paper is that we propose composite cyclotomic Fourier transforms (CCFTs). In comparison to previously proposed fast Fourier transforms, our CCFTs achieve lower overall complexities for moderate to long lengths, and the improvement significantly increases as the length grows. Our 2047-point and 4095-point CCFTs are also the first efficient DFTs of such lengths to the best of our knowledge. Finally, our CCFTs are also advantageous for hardware implementations due to their regular and modular structure. (Comment: submitted to IEEE Transactions on Signal Processing)
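CFFTs hinge on short cyclic convolution algorithms. As a reference point for what the bilinear algorithms improve on, the naive n-point cyclic convolution below uses n^2 multiplications (a generic sketch over the integers; the paper works over finite fields):

```python
# Naive n-point cyclic convolution: c[k] = sum_i a[i] * b[(k - i) mod n].
# Efficient bilinear algorithms (like the 11-point ones in the paper)
# achieve the same result with far fewer multiplications.

def cyclic_convolution(a, b):
    n = len(a)
    return [sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)]

print(cyclic_convolution([1, 2, 3], [4, 5, 6]))   # [31, 31, 28]
```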

    On fast multiplication of a matrix by its transpose

    Get PDF
    We present a non-commutative algorithm for the multiplication of a 2x2-block-matrix by its transpose using 5 block products (3 recursive calls and 2 general products) over C or any finite field. We use geometric considerations on the space of bilinear forms describing 2x2 matrix products to obtain this algorithm, and we show how to reduce the number of involved additions. The resulting algorithm for arbitrary dimensions is a reduction of multiplication of a matrix by its transpose to general matrix product, improving by a constant factor previously known reductions. Finally, we propose schedules with low memory footprint that support a fast and memory-efficient practical implementation over a finite field. To conclude, we show how to use our result in LDL^T factorization. (Comment: ISSAC 2020, Jul 2020, Kalamata, Greece)
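For context, the straightforward 2x2-block computation of C = A A^T already exploits the symmetry of the output (C12 = C21^T), but uses 6 block products. The paper's non-commutative scheme brings this down to 5 (3 recursive calls and 2 general products); the formulas below are the naive baseline, not the paper's optimized scheme:

```python
import numpy as np

# Baseline block computation of C = A @ A.T with a 2x2 block partition.
# Only three of the four output blocks are formed, since C12 = C21.T.

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]

C11 = A11 @ A11.T + A12 @ A12.T   # symmetric block (same shape as the recursion)
C21 = A21 @ A11.T + A22 @ A12.T   # general block products
C22 = A21 @ A21.T + A22 @ A22.T   # symmetric block

C = np.block([[C11, C21.T], [C21, C22]])
print(np.allclose(C, A @ A.T))    # True
```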