Search CORE

11,933 research outputs found

New virtually scaling free adaptive CORDIC rotator

Author: Banerjee Swapna
Grass Eckhard
Maharatna Koushik
Troya Alfonso
Publication venue
Publication date: 01/11/2004
Field of study

In this article we propose a novel CORDIC rotator algorithm that eliminates the problems of scale factor compensation and limited range of convergence associated with the classical CORDIC algorithm. In our scheme, depending on the target angle or the initial coordinate of the vector, a scaling by 1 or 1/?2 is needed that can be realised with minimal hardware. The proposed CORDIC rotator adaptively selects appropriate iteration steps and converges to the final result by executing 50% less number of iterations on an average compared to that required for the classical CORDIC. Unlike classical CORDIC, the final value of the scale factor is completely independent of number of executed iterations. Based on the proposed algorithm, a 16-bit pipelined CORDIC rotator implementation has been described. The silicon area of the fabricated pipelined CORDIC rotator core is 2.73 mm2. This is equivalent to 38 k inverter gates in IHP in-house 0.25 ?m BiCMOS technology. The average dynamic power consumption of the fabricated CORDIC rotator is 17 mW @ 2.5 V supply and 20Msps throughput. Currently, this CORDIC rotator is used as a part of the baseband processor for a project that aims to design a single-chip wireless modem compliant with IEEE 802.11a and Hiperlan/2

Southampton (e-Prints Soton)

Asynchronous techniques for system-on-chip design

Author: Martin Alain J.
Nyström Mika
Publication venue
Publication date: 01/06/2006
Field of study

SoC design will require asynchronous techniques as the large parameter variations across the chip will make it impossible to control delays in clock networks and other global signals efficiently. Initially, SoCs will be globally asynchronous and locally synchronous (GALS). But the complexity of the numerous asynchronous/synchronous interfaces required in a GALS will eventually lead to entirely asynchronous solutions. This paper introduces the main design principles, methods, and building blocks for asynchronous VLSI systems, with an emphasis on communication and synchronization. Asynchronous circuits with the only delay assumption of isochronic forks are called quasi-delay-insensitive (QDI). QDI is used in the paper as the basis for asynchronous logic. The paper discusses asynchronous handshake protocols for communication and the notion of validity/neutrality tests, and completion tree. Basic building blocks for sequencing, storage, function evaluation, and buses are described, and two alternative methods for the implementation of an arbitrary computation are explained. Issues of arbitration, and synchronization play an important role in complex distributed systems and especially in GALS. The two main asynchronous/synchronous interfaces needed in GALS-one based on synchronizer, the other on stoppable clock-are described and analyzed

Caltech Authors

Scoring methods, multiple criteria, and utility analysis

Author: Kleijnen J.P.C.
Publication venue
Publication date
Field of study

decision making;computers;information and uncertainty

Research Papers in Economics

VLSI Implementation of Deep Neural Network Using Integral Stochastic Computing

Author: Ardakani Arash
Gross Warren J.
Hanyu Takahiro
Leduc-Primeau François
Onizawa Naoya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

The hardware implementation of deep neural networks (DNNs) has recently received tremendous attention: many applications in fact require high-speed operations that suit a hardware implementation. However, numerous elements and complex interconnections are usually required, leading to a large area occupation and copious power consumption. Stochastic computing has shown promising results for low-power area-efficient hardware implementations, even though existing stochastic algorithms require long streams that cause long latencies. In this paper, we propose an integer form of stochastic computation and introduce some elementary circuits. We then propose an efficient implementation of a DNN based on integral stochastic computing. The proposed architecture has been implemented on a Virtex7 FPGA, resulting in 45% and 62% average reductions in area and latency compared to the best reported architecture in literature. We also synthesize the circuits in a 65 nm CMOS technology and we show that the proposed integral stochastic architecture results in up to 21% reduction in energy consumption compared to the binary radix implementation at the same misclassification rate. Due to fault-tolerant nature of stochastic architectures, we also consider a quasi-synchronous implementation which yields 33% reduction in energy consumption w.r.t. the binary radix implementation without any compromise on performance.Comment: 11 pages, 12 figure

arXiv.org e-Print Archive

Crossref

HAL-Université de Bretagne Occidentale

HAL Descartes

PolyPublie

Hal-Diderot