126 research outputs found

    Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations

    Fast linear transforms are ubiquitous in machine learning, including the discrete Fourier transform, discrete cosine transform, and other structured transformations such as convolutions. All of these transforms can be represented by dense matrix-vector multiplication, yet each has a specialized and highly efficient (subquadratic) algorithm. We ask to what extent hand-crafting these algorithms and implementations is necessary, what structural priors they encode, and how much knowledge is required to automatically learn a fast algorithm for a provided structured transform. Motivated by a characterization of fast matrix-vector multiplication as products of sparse matrices, we introduce a parameterization of divide-and-conquer methods that is capable of representing a large class of transforms. This generic formulation can automatically learn an efficient algorithm for many important transforms; for example, it recovers the O(N log N) Cooley-Tukey FFT algorithm to machine precision, for dimensions N up to 1024. Furthermore, our method can be incorporated as a lightweight replacement of generic matrices in machine learning pipelines to learn efficient and compressible transformations. On a standard task of compressing a single hidden-layer network, our method exceeds the classification accuracy of unconstrained matrices on CIFAR-10 by 3.9 points (the first time a structured approach has done so), with 4X faster inference speed and 40X fewer parameters.
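The "products of sparse matrices" view in the abstract above can be illustrated with the classical radix-2 Cooley-Tukey factorization. The sketch below is not the paper's learned parameterization; it only builds the fixed, textbook factorization with NumPy and checks it against `np.fft.fft`. The function names are hypothetical and N is assumed to be a power of two.

```python
import numpy as np

def bit_reversal_permutation(n):
    """Permutation matrix P with (P @ x)[i] = x[bit-reversed i]; n must be a power of two."""
    bits = n.bit_length() - 1
    idx = [int(format(i, f"0{bits}b")[::-1], 2) for i in range(n)]
    return np.eye(n)[idx]

def butterfly_factor(n, block):
    """Block-diagonal matrix holding n // block copies of the size-`block` butterfly.

    Every row has exactly two nonzeros, so applying one factor costs O(n).
    """
    half = block // 2
    twiddle = np.exp(-2j * np.pi * np.arange(half) / block)
    eye, diag = np.eye(half), np.diag(twiddle)
    butterfly = np.block([[eye, diag], [eye, -diag]])
    return np.kron(np.eye(n // block), butterfly)

def dft_via_butterflies(n):
    """Assemble the DFT matrix as B_n ... B_4 B_2 P (log2(n) sparse factors and a permutation)."""
    result = bit_reversal_permutation(n).astype(complex)
    block = 2
    while block <= n:
        result = butterfly_factor(n, block) @ result
        block *= 2
    return result

if __name__ == "__main__":
    n = 16
    dense_dft = np.fft.fft(np.eye(n))  # reference DFT matrix
    print(np.allclose(dft_via_butterflies(n), dense_dft))
```

Multiplying the factors into a dense matrix, as done here, is only for verification. A fast algorithm would apply the log2(N) sparse factors to the input vector one after another, which gives the O(N log N) cost.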

    Joint Optimization of Low-power DCT Architecture and Efficient Quantization Technique for Embedded Image Compression

    Discrete Cosine Transform (DCT)-based image compression is widely used in today's communication systems. Significant research devoted to this domain has demonstrated that optical compression methods can offer a higher speed but suffer from poor image quality and growing complexity. To meet the challenges of higher image quality and high-speed processing, in this chapter we present a joint system for DCT-based image compression that combines a VLSI architecture of the DCT algorithm with an efficient quantization technique. First, our approach is based on a new granularity method that takes advantage of the adjacent pixel correlation of the input blocks and improves the visual quality of the reconstructed image. Second, a new architecture based on the Canonical Signed Digit representation and a novel Common Subexpression Elimination technique is proposed to replace the constant multipliers. Finally, a reconfigurable quantization method is presented to effectively reduce the computational complexity. Experimental results obtained with a prototype based on an FPGA implementation, and comparisons with existing works, corroborate the validity of the proposed optimizations in terms of power reduction, speed increase, silicon area saving, and PSNR improvement.
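To make the idea of replacing constant multipliers concrete, here is a minimal sketch of the Canonical Signed Digit (CSD) recoding mentioned in the abstract above. It uses the standard non-adjacent-form recurrence for non-negative integers and then multiplies with shifts, adds, and subtracts only. It does not reproduce the chapter's architecture or its Common Subexpression Elimination step; the function names are hypothetical.

```python
def to_csd(value):
    """Return the CSD digits of a non-negative integer, least significant first.

    Digits are in {-1, 0, 1} and no two adjacent digits are both nonzero.
    """
    digits = []
    while value != 0:
        if value % 2 == 0:
            digit = 0
        else:
            # +1 when value is 1 mod 4, -1 when value is 3 mod 4
            digit = 2 - (value % 4)
        digits.append(digit)
        value = (value - digit) // 2
    return digits

def csd_multiply(x, digits):
    """Multiply integer x by the constant encoded in `digits` using only shifts and add/subtract."""
    total = 0
    for shift, digit in enumerate(digits):
        if digit == 1:
            total += x << shift
        elif digit == -1:
            total -= x << shift
    return total

if __name__ == "__main__":
    digits = to_csd(23)             # 23 = 32 - 8 - 1
    print(digits)
    print(csd_multiply(5, digits))  # same value as 5 * 23
```

The binary form of 23 (10111) has four nonzero bits, whereas its CSD form has three nonzero digits. In hardware, fewer nonzero digits means fewer adders per constant multiplication, which is the motivation for using CSD in multiplierless designs.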

    Orthonormal and biorthonormal filter banks as convolvers, and convolutional coding gain

    Convolution theorems for filter bank transformers are introduced. Both uniform and nonuniform decimation ratios are considered, and orthonormal as well as biorthonormal cases are addressed. All the theorems are such that the original convolution reduces to a sum of shorter, decoupled convolutions in the subbands. That is, there is no need to have cross convolution between subbands. For the orthonormal case, expressions for optimal bit allocation and the optimized coding gain are derived. The contribution to coding gain comes partly from the nonuniformity of the signal spectrum and partly from nonuniformity of the filter spectrum. With one of the convolved sequences taken to be the unit pulse function, the coding gain expressions reduce to those for traditional subband and transform coding. The filter-bank convolver has about the same computational complexity as a traditional convolver, if the analysis bank has small complexity compared to the convolution itself.
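For orientation, the traditional transform coding gain that the abstract above refers to is commonly written as the ratio of the arithmetic mean to the geometric mean of the subband variances (orthonormal transform, uniform decimation, optimal bit allocation). The sketch below computes only that textbook quantity; the paper's generalized convolutional coding gain is not reproduced here, and the function name is hypothetical.

```python
import numpy as np

def transform_coding_gain(variances):
    """Arithmetic mean over geometric mean of strictly positive subband variances."""
    v = np.asarray(variances, dtype=float)
    arithmetic_mean = v.mean()
    geometric_mean = np.exp(np.log(v).mean())
    return arithmetic_mean / geometric_mean

if __name__ == "__main__":
    print(transform_coding_gain([1.0, 1.0, 1.0, 1.0]))  # flat spectrum: no gain
    print(transform_coding_gain([4.0, 1.0]))            # uneven variances: gain above 1
```

The gain equals 1 when all subband variances are equal and grows as the variances become more uneven, which matches the abstract's remark that part of the gain comes from the nonuniformity of the signal spectrum.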

    Advanced digital and analog error correction codes


    Matrix Transform Imager Architecture for On-Chip Low-Power Image Processing

    Camera-on-a-chip systems have tried to include carefully chosen signal processing units for better functionality and performance, and to broaden the applications they can be used for. Image processing sensors have been made possible by advances in CMOS active pixel sensors (APS) and neuromorphic focal plane imagers. Some of the advantages of these systems are compact size, high speed and parallelism, low power dissipation, and dense system integration. One can envision using these chips for portable and inexpensive video cameras on hand-held devices like personal digital assistants (PDA) or cell phones. In neuromorphic modeling of the retina, it would be desirable to have processing capabilities at the focal plane while retaining the density of typical APS imager designs. Unfortunately, these two goals have been mostly incompatible. We introduce our MAtrix Transform Imager Architecture (MATIA), which uses analog floating-gate devices to make it possible to have computational imagers with high pixel densities. The core imager performs computations at the pixel plane, but still has a fill-factor of 46 percent, comparable to the high fill-factors of APS imagers. The processing is performed continuously on the image via programmable matrix operations that can operate on the entire image or blocks within the image. The resulting data-flow architecture can directly perform all kinds of block matrix image transforms. Since the imager operates in the subthreshold region and thus has low power consumption, this architecture can be used as a low-power front end for any system that utilizes these computations. Various compression algorithms (e.g., JPEG) that use block matrix transforms can be implemented using this architecture. Since MATIA can be used for gradient computations, cheap image tracking devices can be implemented using this architecture. Other applications of this architecture can range from stand-alone universal transform imager systems to systems that can compute stereoscopic depth.
    Ph.D. Committee Chair: Hasler, Paul; Committee Member: David Anderson; Committee Member: DeWeerth, Steve; Committee Member: Jackson, Joel; Committee Member: Smith, Mar
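The block matrix transforms described in the MATIA abstract above can be written as Y = A X A^T applied to each image block. The following is a software sketch of that operation only, using an orthonormal 8x8 DCT-II matrix as the example transform (the kind used by JPEG). It says nothing about the analog floating-gate implementation; the function names are hypothetical, and the image dimensions are assumed to be multiples of the block size.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    m = np.arange(n)[None, :]  # sample index (columns)
    a = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    a[0, :] = np.sqrt(1.0 / n)
    return a

def block_transform(image, a):
    """Apply Y = A X A^T to every non-overlapping block of the image."""
    n = a.shape[0]
    height, width = image.shape
    out = np.empty_like(image, dtype=float)
    for r in range(0, height, n):
        for c in range(0, width, n):
            block = image[r:r + n, c:c + n]
            out[r:r + n, c:c + n] = a @ block @ a.T
    return out

if __name__ == "__main__":
    a = dct_matrix(8)
    image = np.arange(256, dtype=float).reshape(16, 16)
    coeffs = block_transform(image, a)
    restored = block_transform(coeffs, a.T)  # A is orthonormal, so A^T inverts it
    print(np.allclose(restored, image))
```

Swapping `a` for another matrix changes the transform without changing the data flow, which mirrors the abstract's point that the matrix operations are programmable.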