Two-dimensional DCT/IDCT architecture

Author: Aggoun A
Jollah I
Publication venue: Institution of Engineering and Technology (IET)
Publication date: 01/01/2003

A fully parallel architecture for the computation of a two-dimensional (2-D) discrete cosine transform (DCT), based on row-column decomposition, is presented. It uses the same one-dimensional (1-D) DCT unit for the row and column computations and (N^2 + N) registers to perform the transposition. It possesses features of regularity and modularity, and is thus well suited for VLSI implementation. It can be used for the computation of either the forward or the inverse 2-D DCT. Each 1-D DCT unit uses N fully parallel vector inner product (VIP) units. The design of the VIP units is based on a systematic design methodology using radix-2^n arithmetic, which allows partitioning of the elements of each vector into small groups. Array multipliers without the final adder are used to produce the different partial product terms. This allows a more efficient use of 4:2 compressors for the accumulation of the products in the intermediate stages and reduces the number of accumulators from N to one. Using this procedure, the 2-D DCT architecture requires fewer than N^2 multipliers (in terms of area occupied) and only 2N adders. It can compute an N x N-point DCT at a rate of one complete transform per N cycles after an appropriate initial delay.
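
The row-column decomposition mentioned in this abstract can be illustrated in software. The following is a minimal sketch, not the authors' hardware design: it assumes the orthonormal DCT-II definition, uses floating-point arithmetic, and the names `dct_1d`, `transpose` and `dct_2d` are hypothetical. Each output coefficient of `dct_1d` is one vector inner product, which loosely mirrors the N inner-product units the abstract describes.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a length-N sequence, computed as N inner products."""
    n = len(x)
    out = []
    for k in range(n):
        # Scale factor of the orthonormal DCT-II (an assumption of this sketch).
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        # One vector inner product between the input and a cosine basis vector.
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def transpose(m):
    """Swap rows and columns of a square matrix given as a list of lists."""
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """2-D DCT by row-column decomposition: 1-D DCT on rows, then on columns."""
    row_pass = [dct_1d(row) for row in block]
    # The same 1-D routine is reused for the columns after a transposition.
    col_pass = [dct_1d(col) for col in transpose(row_pass)]
    return transpose(col_pass)


if __name__ == "__main__":
    sample = [[float((3 * r + 5 * c) % 7) for c in range(4)] for r in range(4)]
    for row in dct_2d(sample):
        print(["%.3f" % v for v in row])
```

Reusing one 1-D routine for both passes is what makes the transposition step necessary; in the abstract this role is played by the register array.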

Brunel University Research Archive

A 64-point Fourier transform chip for high-speed wireless LAN application using OFDM

Author: Grass Eckhard
Jagdhold Ulrich
Maharatna Koushik
Publication date: 01/03/2004

In this article, we present a novel fixed-point 16-bit word-width 64-point FFT/IFFT processor developed primarily for application in the OFDM-based IEEE 802.11a Wireless LAN (WLAN) baseband processor. The 64-point FFT is realized by decomposing it into a 2-D structure of 8-point FFTs. This approach reduces the number of required complex multiplications compared to the conventional radix-2 64-point FFT algorithm. The complex multiplication operations are realized using shift-and-add operations. Thus, the processor does not use any 2-input digital multiplier. It also does not need any RAM or ROM for internal storage of coefficients. The proposed 64-point FFT/IFFT processor has been fabricated and tested successfully using our in-house 0.25 μm BiCMOS technology. The core area of this chip is 6.8 mm^2. The average dynamic power consumption is 41 mW at 20 MHz operating frequency and 1.8 V supply voltage. The processor completes one parallel-to-parallel (i.e., when all input data are available in parallel and all output data are generated in parallel) 64-point FFT computation in 23 cycles. These features show that, though it has been developed primarily for application in the IEEE 802.11a standard, it can be used for any application that requires fast operation as well as low power consumption.
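
The decomposition of a 64-point transform into a 2-D structure of 8-point transforms can be sketched with the standard Cooley-Tukey index mapping n = 8*n1 + n2, k = k1 + 8*k2. This is only an illustration under stated assumptions: it uses floating-point complex arithmetic rather than the 16-bit fixed-point shift-and-add datapath of the chip, and the function names `dft` and `fft64_as_8x8` are hypothetical.

```python
import cmath


def dft(x):
    """Direct DFT of a short sequence, used here as the 8-point building block."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft64_as_8x8(x):
    """64-point DFT computed as two passes of 8-point DFTs with twiddles between."""
    if len(x) != 64:
        raise ValueError("expected exactly 64 samples")
    # Pass 1: for each n2, an 8-point DFT over n1 of x[8*n1 + n2] -> index k1.
    stage1 = [dft([x[8 * n1 + n2] for n1 in range(8)]) for n2 in range(8)]
    # Inter-stage twiddle factors W64^(n2*k1).
    twiddled = [
        [stage1[n2][k1] * cmath.exp(-2j * cmath.pi * n2 * k1 / 64) for k1 in range(8)]
        for n2 in range(8)
    ]
    # Pass 2: for each k1, an 8-point DFT over n2 -> index k2.
    out = [0j] * 64
    for k1 in range(8):
        column = dft([twiddled[n2][k1] for n2 in range(8)])
        for k2 in range(8):
            out[k1 + 8 * k2] = column[k2]
    return out


if __name__ == "__main__":
    samples = [complex(i % 5, (3 * i) % 7) for i in range(64)]
    reference = dft(samples)
    result = fft64_as_8x8(samples)
    print(max(abs(a - b) for a, b in zip(reference, result)))
```

In this form the only general complex multiplications are the inter-stage twiddles, which is one way to see why the abstract reports fewer multiplications than a radix-2 flow.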

Southampton (e-Prints Soton)

Explore Bristol Research

Architectures for block Toeplitz systems

Author: Bouras Ilias
Glentis George-Othon
Kalouptsidis Nicholas
Publication venue: Elsevier
Publication date: 01/01/1995

In this paper, efficient VLSI architectures of highly concurrent algorithms for the solution of block linear systems with Toeplitz or near-to-Toeplitz entries are presented. The main features of the proposed scheme are the use of scalar-only operations (multiplications, divisions, and additions) and local communication, which enables the development of a wavefront array architecture. Both the mean squared error and the total squared error formulations are described, and a variety of implementations are given.

CiteSeerX

University of Twente Research Information

A comparison of VLSI architectures for time and transform domain decoding of Reed-Solomon codes

Author: Deutsch L. J.
Hsu I. S.
Reed I. S.
Satorius E. H.
Truong T. K.

It is well known that the Euclidean algorithm or its equivalent, continued fractions, can be used to find the error locator polynomial needed to decode a Reed-Solomon (RS) code. It is shown that this algorithm can be used for both time and transform domain decoding by replacing its initial conditions with the Forney syndromes and the erasure locator polynomial. By this means both the errata locator polynomial and the errata evaluator polynomial can be obtained with the Euclidean algorithm. With these ideas, both time and transform domain Reed-Solomon decoders for correcting errors and erasures are simplified and compared. As a consequence, the architectures of Reed-Solomon decoders for correcting both errors and erasures can be made more modular, regular, simple, and naturally suitable for VLSI implementation.

NASA Technical Reports Server

Bit-level pipelined digit-serial array processors

Author: Aggoun A
Ashur A
Ibrahim MK
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/07/1998

A new architecture for a high-performance digit-serial vector inner product (VIP) which can be pipelined to the bit level is introduced. The design of the digit-serial vector inner product is based on a new systematic design methodology using radix-2^n arithmetic. The proposed architecture allows a high level of bit-level pipelining to increase the throughput rate with minimum initial delay and minimum area. This will give designers greater flexibility in finding the best tradeoff between hardware cost and throughput rate. It is shown that a sub-digit pipelined digit-serial structure can achieve a higher throughput rate with much less area consumption than an equivalent bit-parallel structure. A twin-pipe architecture to double the throughput rate of digit-serial multipliers, and consequently that of the digit-serial vector inner product, is also presented. The effects of the number of pipelining levels and the twin-pipe architecture on the throughput rate and hardware cost are discussed. A two's complement digit-serial architecture which can operate on both negative and positive numbers is also presented.
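
As a rough software analogy for digit-serial processing with radix-2^n arithmetic, an inner product can be accumulated one n-bit digit of the second operand at a time. This sketch is not the architecture from the paper: it ignores pipelining and the twin-pipe scheme, assumes unsigned values in the digit-split operand, and the names `digits_radix_2n` and `digit_serial_inner_product` are hypothetical.

```python
def digits_radix_2n(value, n, count):
    """Split a non-negative integer into `count` digits of n bits, LSD first."""
    mask = (1 << n) - 1
    return [(value >> (n * d)) & mask for d in range(count)]


def digit_serial_inner_product(a, b, n=4, word_bits=16):
    """Inner product of a and b, consuming b one radix-2^n digit per step.

    Assumes every element of b is an unsigned integer below 2**word_bits.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if any(v < 0 or v >= (1 << word_bits) for v in b):
        raise ValueError("elements of b must fit in word_bits unsigned bits")
    count = -(-word_bits // n)  # ceiling division: digits per word
    b_digits = [digits_radix_2n(v, n, count) for v in b]
    acc = 0
    for d in range(count):
        # Partial inner product using only the d-th digit of every b element.
        partial = sum(ai * bd[d] for ai, bd in zip(a, b_digits))
        # Weight the partial result by the digit position, 2^(n*d).
        acc += partial << (n * d)
    return acc


if __name__ == "__main__":
    a = [3, 5, 7]
    b = [100, 200, 65535]
    print(digit_serial_inner_product(a, b, n=4, word_bits=16))
    print(sum(x * y for x, y in zip(a, b)))
```

A larger digit size n means fewer steps per word but wider partial products per step, which is the kind of cost versus throughput tradeoff the abstract refers to.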

Crossref

Brunel University Research Archive

A compact multi-chip-module implementation of a multi-precision neural network classifier

Author: Bermak Amine
Martinez Dominique
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/01/2001

This paper describes a novel MCM digital implementation of a reconfigurable multi-precision neural network classifier. The design is based on a scalable systolic architecture with a user-defined topology and arithmetic precision of the neural network. Indeed, the MCM integrates 64/32/16 neurons with a corresponding accuracy of 4/8/16 bits. A prototype has been designed and successfully tested in CMOS 0.7 μm technology.

INRIA a CCSD electronic archive server

Research Online @ ECU