
    A Many-Core Overlay for High-Performance Embedded Computing on FPGAs

    In this work, we propose a configurable many-core overlay for high-performance embedded computing. The size of internal memory, supported operations and number of ports can be configured independently for each core of the overlay. The overlay was evaluated with matrix multiplication, LU decomposition and the fast Fourier transform (FFT) on a ZYNQ-7020 FPGA platform. The results show that using a system-level many-core overlay avoids complex hardware design and still provides good performance. Comment: Presented at First International Workshop on FPGAs for Software Programmers (FSP 2014) (arXiv:1408.4423)

    Fast Quantum Fourier Transforms for a Class of Non-abelian Groups

    An algorithm is presented that allows the construction of fast Fourier transforms for any solvable group on a classical computer. The special structure of the recursion formula at the core of this algorithm makes it a good starting point for systematically obtaining fast Fourier transforms for solvable groups on a quantum computer. The inherent structure of the Hilbert space imposed by the qubit architecture suggests considering groups of order 2^n first (where n is the number of qubits). As an example, fast quantum Fourier transforms for all 4 classes of non-abelian 2-groups with cyclic normal subgroup of index 2 are explicitly constructed in terms of quantum circuits. The (quantum) complexity of the Fourier transform for these groups of size 2^n is O(n^2) in all cases. Comment: 16 pages, LaTeX2

    Algebraic Signal Processing Theory: Cooley-Tukey Type Algorithms for Polynomial Transforms Based on Induction

    A polynomial transform is the multiplication of an input vector $x\in\mathbb{C}^n$ by a matrix $\mathcal{P}_{b,\alpha}\in\mathbb{C}^{n\times n}$, whose $(k,\ell)$-th element is defined as $p_\ell(\alpha_k)$ for polynomials $p_\ell(x)\in\mathbb{C}[x]$ from a list $b=\{p_0(x),\dots,p_{n-1}(x)\}$ and sample points $\alpha_k\in\mathbb{C}$ from a list $\alpha=\{\alpha_0,\dots,\alpha_{n-1}\}$. Such transforms find applications in the areas of signal processing, data compression, and function interpolation. Important examples include the discrete Fourier and cosine transforms. In this paper we introduce a novel technique to derive fast algorithms for polynomial transforms. The technique uses the relationship between polynomial transforms and the representation theory of polynomial algebras. Specifically, we derive algorithms by decomposing the regular modules of these algebras as a stepwise induction. As an application, we derive novel $O(n\log n)$ general-radix algorithms for the discrete Fourier transform and the discrete cosine transform of type 4. Comment: 19 pages. Submitted to SIAM Journal on Matrix Analysis and Application
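The definition in this abstract can be illustrated with a minimal sketch. It builds the transform directly from a list of polynomials and sample points in O(n^2) operations and recovers the discrete Fourier transform as a special case. It is not the fast algorithm the paper derives. The function names are hypothetical, and the choice of basis p_l(x) = x^l with sample points alpha_k = exp(-2*pi*i*k/n) is the standard textbook one for the DFT, assumed here rather than taken from the paper.

```python
import cmath


def polynomial_transform(polys, alphas, x):
    """Multiply x by the matrix whose (k, l) entry is p_l(alpha_k).

    polys  -- list of callables p_0, ..., p_{n-1}
    alphas -- list of sample points alpha_0, ..., alpha_{n-1}
    x      -- input vector of length n
    This is the direct O(n^2) evaluation, not a fast algorithm.
    """
    return [sum(p(a) * xl for p, xl in zip(polys, x)) for a in alphas]


def dft_as_polynomial_transform(x):
    """DFT obtained as a special case (assumed basis and sample points)."""
    n = len(x)
    # Monomial basis p_l(z) = z**l; l=l binds the loop variable per lambda.
    polys = [lambda z, l=l: z ** l for l in range(n)]
    # Sample points are the n-th roots of unity.
    alphas = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]
    return polynomial_transform(polys, alphas, x)


if __name__ == "__main__":
    print(dft_as_polynomial_transform([1, 1, 1, 1]))
```

Passing other polynomial lists and sample points to `polynomial_transform` gives other members of the same family of transforms.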

    Algebraic Signal Processing Theory: Cooley-Tukey Type Algorithms for DCTs and DSTs

    This paper presents a systematic methodology based on the algebraic theory of signal processing to classify and derive fast algorithms for linear transforms. Instead of manipulating the entries of transform matrices, our approach derives the algorithms by stepwise decomposition of the associated signal models, or polynomial algebras. This decomposition is based on two generic methods or algebraic principles that generalize the well-known Cooley-Tukey FFT and make the algorithms' derivations concise and transparent. Application to the 16 discrete cosine and sine transforms yields a large class of fast algorithms, many of which have not been found before. Comment: 31 pages, more information at http://www.ece.cmu.edu/~smar

    Coherent optical implementations of the fast Fourier transform and their comparison to the optical implementation of the quantum Fourier transform

    Optical structures to implement the discrete Fourier transform (DFT) and fast Fourier transform (FFT) algorithms for discretely sampled data sets are considered. In particular, the decomposition of the FFT algorithm into basic butterfly operations is described. This decomposition allows the algorithm to be fully implemented by the successive coherent addition and subtraction of two wavefronts (the subtraction being performed after one has been appropriately phase shifted), which facilitates a simple and robust hardware implementation based on waveguided hybrid devices as employed in coherent optical detection modules. Further, a comparison is made to the optical structures proposed for the optical implementation of the quantum Fourier transform, and the two are shown to be very similar.
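The butterfly decomposition mentioned in this abstract can be sketched in software. The example below is the standard radix-2 decimation-in-time FFT, in which every output comes from adding and subtracting two values after one of them has been phase shifted. The function names are hypothetical, and how the phase shifts map onto the optical hardware is not taken from the paper. The sketch assumes the input length is a power of two.

```python
import cmath


def butterfly(a, b, twiddle):
    """Phase-shift b by the twiddle factor, then return (sum, difference)."""
    t = twiddle * b
    return a + t, a - t


def fft(x):
    """Recursive radix-2 FFT built only from butterfly operations.

    Assumes len(x) is a power of two.
    """
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)  # phase shift for this pair
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out


if __name__ == "__main__":
    print(fft([0, 1, 0, 0]))
```

Each level of the recursion applies n/2 butterflies, and there are log2(n) levels, which gives the O(n log n) operation count.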
