817 research outputs found

    A 16-bit CORDIC rotator for high-performance wireless LAN

    No full text
    In this paper we propose a novel 16-bit low power CORDIC rotator that is used for high-speed wireless LAN. The algorithm converges to the final target angle by adaptively selecting appropriate iteration steps while keeping the scale factor virtually constant. The VLSI architecture of the proposed design eliminates the entire arithmetic hardware in the angle approximation datapath and reduces the number of iterations by 50% on an average. The cell area of the processor is 0.7 mm2 and it dissipates 7 mW power at 20 MHz frequency

    On the hardware reduction of z-datapath of vectoring CORDIC

    No full text
    In this article we present a novel design of a hardware optimal vectoring CORDIC processor. We present a mathematical theory to show that using bipolar binary notation it is possible to eliminate all the arithmetic computations required along the z-datapath. Using this technique it is possible to achieve three and 1.5 times reduction in the number of registers and adder respectively compared to conventional CORDIC. Following this, a 16-bit vectoring CORDIC is designed for the application in Synchronizer for IEEE 802.11a standard. The total area and dynamic power consumption of the processor is 0.14 mm2 and 700?W respectively when synthesized in 0.18?m CMOS library which shows its effectiveness as a low-area low-power processor

    A VLSI Array Architecture for Realization of DFT, DHT, DCT and DST

    No full text
    A unified array architecture is described for computation of DFT, DHT, DCT and DST using a modified CORDIC (CoOrdinate Rotation DIgital Computer) arithmetic unit as the basic Processing Element (PE). All these four transforms can be computed by simple rearrangement of input samples. Compared to five other existing architectures, this one has the advantage in speed in terms of latency and throughput. Moreover, the simple local neighborhood interprocessor connections make it convenient for VLSI implementation. The architecture can be extended to compute transformation of longer length by judicially cascading the modules of shorter transformation length which will be suitable for Wafer Scale Integration (WSI). CORDIC is designed using Transmission Gate Logic (TGL) on sea of gates semicustom environment. Simulation results show that this architecture may be a suitable candidate for low power/low voltage applications

    FPGA Implementation of Spectral Subtraction for In-Car Speech Enhancement and Recognition

    Get PDF
    The use of speech recognition in noisy environments requires the use of speech enhancement algorithms in order to improve recognition performance. Deploying these enhancement techniques requires significant engineering to ensure algorithms are realisable in electronic hardware. This paper describes the design decisions and process to port the popular spectral subtraction algorithm to a Virtex-4 field-programmable gate array (FPGA) device. Resource analysis shows the final design uses only 13% of the total available FPGA resources. Waveforms and spectrograms presented support the validity of the proposed FPGA design

    Computational structures for robotic computations

    Get PDF
    The computational problem of inverse kinematics and inverse dynamics of robot manipulators by taking advantage of parallelism and pipelining architectures is discussed. For the computation of inverse kinematic position solution, a maximum pipelined CORDIC architecture has been designed based on a functional decomposition of the closed-form joint equations. For the inverse dynamics computation, an efficient p-fold parallel algorithm to overcome the recurrence problem of the Newton-Euler equations of motion to achieve the time lower bound of O(log sub 2 n) has also been developed

    A low-power geometric mapping co-processor for high-speed graphics application

    No full text
    In this article we present a novel design of a low-power geometric mapping co-processor that can be used for high-performance graphics system. The processor can carry out any single or a combination of transformations belonging to affine transformation family ranging from 1-D to 3-D. It allows interactive operations which can be defined either by a user (allowing it to be a stand-alone geometric transformation processor) or by a host processor (allowing it to be a co-processor to accelerate certain graphics operations). It occupies a silicon area of 6 mm2 and consumes 40 mW power when synthesized with 0.25?m technology

    DeepSecure: Scalable Provably-Secure Deep Learning

    Get PDF
    This paper proposes DeepSecure, a novel framework that enables scalable execution of the state-of-the-art Deep Learning (DL) models in a privacy-preserving setting. DeepSecure targets scenarios in which neither of the involved parties including the cloud servers that hold the DL model parameters or the delegating clients who own the data is willing to reveal their information. Our framework is the first to empower accurate and scalable DL analysis of data generated by distributed clients without sacrificing the security to maintain efficiency. The secure DL computation in DeepSecure is performed using Yao's Garbled Circuit (GC) protocol. We devise GC-optimized realization of various components used in DL. Our optimized implementation achieves more than 58-fold higher throughput per sample compared with the best-known prior solution. In addition to our optimized GC realization, we introduce a set of novel low-overhead pre-processing techniques which further reduce the GC overall runtime in the context of deep learning. Extensive evaluations of various DL applications demonstrate up to two orders-of-magnitude additional runtime improvement achieved as a result of our pre-processing methodology. This paper also provides mechanisms to securely delegate GC computations to a third party in constrained embedded settings
    • 

    corecore