203 research outputs found

    New FFT structures based on the Bruun algorithm

    A Pipelined FFT Architecture for Real-Valued Signals

    Algorithmic Views of Vectorized Polynomial Multipliers for NTRU and NTRU Prime (Long Paper)

    This paper explores the design space of vector-optimized polynomial multiplications in the lattice-based key-encapsulation mechanisms NTRU and NTRU Prime. Since NTRU and NTRU Prime do not support straightforward applications of number-theoretic transforms, the state-of-the-art vector code either resorted to Toom–Cook or introduced various techniques for coefficient ring extensions. All these techniques lead to a large number of small-degree polynomial multiplications, which is the bottleneck in our experiments. For NTRU Prime, we show how to reduce the number of small-degree polynomial multiplications to nearly 1/4 of that in the previous vectorized code with the same functionality. Our transformations are based on careful choices of FFTs, including Good–Thomas, Rader’s, Schönhage’s, and Bruun’s FFTs. For NTRU, we show how to deploy Toom-5 with 3-bit losses. Furthermore, we show that the Toeplitz matrix–vector product naturally translates into efficient implementations with vector-by-scalar multiplication instructions which do not appear in all prior vector-optimized implementations. We choose the ARM Cortex-A72 CPU, which implements the Armv8-A architecture, for our experiments because of its wide use in smartphones and because its Neon vector instruction set implements vector-by-scalar multiplications, which do not appear in most other vector instruction sets such as Intel’s AVX2. Even for platforms without vector-by-scalar multiplications, we expect significant improvements compared to the state of the art, since our transformations reduce the number of multiplication instructions by a large margin. Compared to the state-of-the-art optimized implementations, we achieve 2.18× and 6.7× faster polynomial multiplications for NTRU and NTRU Prime, respectively. For full schemes, we additionally vectorize the polynomial inversions, sorting network, and encoding/decoding subroutines in NTRU and NTRU Prime. For ntruhps2048677, we achieve 7.67×, 2.48×, and 1.77× faster key generation, encapsulation, and decapsulation, respectively. For ntrulpr761, we achieve 3×, 2.87×, and 3.25× faster key generation, encapsulation, and decapsulation, respectively. For sntrup761, there are no previously optimized implementations and we significantly outperform the reference implementation.
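As a rough illustration of the Toeplitz matrix–vector view mentioned in the abstract above, the sketch below checks on toy parameters that multiplication in Z_q[x]/(x^n − 1) equals a matrix–vector product whose matrix is constant along its diagonals. The function names, the toy values n = 3 and q = 7, and the plain schoolbook loops are assumptions made for illustration; this is not the paper's vectorized Neon code.

```python
def polymul_cyclic(a, b, q):
    """Schoolbook product of a and b in Z_q[x]/(x^n - 1), with n = len(a)."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            # x^(i+j) wraps around to x^((i+j) mod n) because x^n = 1.
            c[(i + j) % n] = (c[(i + j) % n] + a[i] * b[j]) % q
    return c


def toeplitz_matvec(a, b, q):
    """Same product written as T * b with T[i][j] = a[(i - j) mod n].

    T depends only on i - j, so it is constant along diagonals
    (a circulant matrix, which is a special Toeplitz matrix).
    """
    n = len(a)
    return [sum(a[(i - j) % n] * b[j] for j in range(n)) % q for i in range(n)]


# Toy parameters (hypothetical, far smaller than any real NTRU parameter set).
a, b, q = [1, 2, 3], [4, 5, 6], 7
assert polymul_cyclic(a, b, q) == toeplitz_matvec(a, b, q)
```

In the matrix–vector form, each column of T is scaled by a single entry b[j], which is presumably the access pattern that suits vector-by-scalar multiplication instructions.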

    Design considerations for a digital audio Class D output stage with emphasis on hearing aid application

    An Optical Flow Measurement Technique based on Continuous Wavelet Transform

    Measuring the flow of an underwater oil leak is a challenging problem because of the complex flow dynamics. Oil jet flow is associated with multi-scale coherent structures in both space and time. Optical plume velocimetry (OPV), developed by Crone, McDuff, and Wilcock (2008), was the most accurate technique used for oil leak flow measurement. Despite its better estimates, OPV measured the oil flow rate with a high uncertainty of 21%. This is due to the multi-scale nature of oil flow, as well as the limited accuracy of the direct cross-correlation (DCC) typically used by OPV. This paper proposes a novel technique that accounts for the multi-scale property of turbulence in flow measurement. The proposed technique is based on the continuous wavelet transform (CWT) and estimates the flow in the following steps: decomposition of the turbulent flow signal using the CWT; estimation of correlation coefficients using the Fast Fourier Transform (FFT) algorithm; interpolation and peak detection on the estimated correlation coefficients; and, finally, estimation of the velocity field. To validate the CWT-based technique, a turbulent buoyant jet, which has a flow type similar to that of an oil jet, was simulated experimentally. The CWT-based technique was then applied to measure the jet flow, and its outcomes were compared with the experimental results. Using a smaller number of wavelet scales led to better flow measurement than using a larger number of scales. The CWT-based technique accurately estimated the jet flow rate with a standard error of 0.15 m/s and outperformed the classical FFT and DCC algorithms, which had errors of 3.65 m/s and 4.53 m/s, respectively.
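The correlation and peak-detection steps listed in the abstract above can be sketched roughly as follows. This is a minimal illustration only: it assumes 1-D signals whose length is a power of two, uses circular correlation, and omits the CWT decomposition entirely. The function names, the three-point parabolic interpolation, and the toy signals are assumptions, not the authors' implementation.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def cross_correlate(a, b):
    """Circular cross-correlation r[k] = sum_m a[m] * b[(m + k) % N] of real signals."""
    n = len(a)
    fa, fb = fft(a), fft(b)
    prod = [x.conjugate() * y for x, y in zip(fa, fb)]
    # The inverse transform above is unnormalized, so divide by n here.
    return [v.real / n for v in fft(prod, inverse=True)]


def peak_shift(r):
    """Lag of the correlation peak, refined by a three-point parabolic fit."""
    n = len(r)
    k = max(range(n), key=lambda i: r[i])
    left, mid, right = r[(k - 1) % n], r[k], r[(k + 1) % n]
    denom = left - 2 * mid + right
    offset = 0.0 if denom == 0 else 0.5 * (left - right) / denom
    return k + offset


# Toy data (hypothetical): `later` is `earlier` delayed by 3 samples.
earlier = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
later = earlier[-3:] + earlier[:-3]
shift = peak_shift(cross_correlate(earlier, later))
```

Under these assumptions, a velocity estimate would follow by multiplying the recovered shift by the sample spacing and dividing by the time between the two signals; both quantities depend on the measurement setup and are not given in the abstract.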