Search CORE

2,005 research outputs found

On the VLSI design of a pipeline Reed-Solomon decoder using systolic arrays

Author: Deutsch L. J.
Reed I. S.
Shao H. M.
Publication venue
Publication date
Field of study

A new very large scale integration (VLSI) design of a pipeline Reed-Solomon decoder is presented. The transform decoding technique used in a previous article is replaced by a time domain algorithm through a detailed comparison of their VLSI implementations. A new architecture that implements the time domain algorithm permits efficient pipeline processing with reduced circuitry. Erasure correction capability is also incorporated with little additional complexity. By using a multiplexing technique, a new implementation of Euclid's algorithm maintains the throughput rate with less circuitry. Such improvements result in both enhanced capability and significant reduction in silicon area

NASA Technical Reports Server

Arithmetic Operations in Multi-Valued Logic

Author: Gurumurthy K. S.
Patel Vasundara
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 29/03/2010
Field of study

This paper presents arithmetic operations like addition, subtraction and multiplications in Modulo-4 arithmetic, and also addition, multiplication in Galois field, using multi-valued logic (MVL). Quaternary to binary and binary to quaternary converters are designed using down literal circuits. Negation in modular arithmetic is designed with only one gate. Logic design of each operation is achieved by reducing the terms using Karnaugh diagrams, keeping minimum number of gates and depth of net in to consideration. Quaternary multiplier circuit is proposed to achieve required optimization. Simulation result of each operation is shown separately using Hspice.Comment: 12 Pages, VLSICS Journal 201

arXiv.org e-Print Archive

Crossref

Embedded FIR filter design for real-time refocusing using a standard plenoptic video camera

Author: Adelson
Coffey
Dansereau
Fatah
Isaksen
Ives
Levoy
Lippmann
Lumsdaine
Ng
Rodríguez-Ramos
Rodríguez-Ramos
Sokolov
Wimalagunarathne
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 03/02/2014
Field of study

Copyright 2014 Society of Photo-Optical Instrumentation Engineers and IS&T—The Society for Imaging Science and Technology. One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited.A novel and low-cost embedded hardware architecture for real-time refocusing based on a standard plenoptic camera is presented in this study. The proposed layout design synthesizes refocusing slices directly from micro images by omitting the process for the commonly used sub-aperture extraction. Therefore, intellectual property cores, containing switch controlled Finite Impulse Response (FIR) filters, are developed and applied to the Field Programmable Gate Array (FPGA) XC6SLX45 from Xilinx. Enabling the hardware design to work economically, the FIR filters are composed of stored product as well as upsampling and interpolation techniques in order to achieve an ideal relation between image resolution, delay time, power consumption and the demand of logic gates. The video output is transmitted via High-Definition Multimedia Interface (HDMI) with a resolution of 720p at a frame rate of 60 fps conforming to the HD ready standard. Examples of the synthesized refocusing slices are presented

Crossref

University of Bedfordshire Repository

Error Detecting Dual Basis Bit Parallel Systolic Multiplication Architecture over GF(2m)

Author: Bera A.
Mathew J.
Pradhan D.
Rahaman H.
Singh Ashutosh Kumar
Publication venue: University of Electronic Science and Technology
Publication date: 01/01/2009
Field of study

An error tolerant hardware efficient very large scale integration (VLSI) architecture for bit parallel systolic multiplication over dual base, which can be pipelined, is presented. Since this architecture has the features of regularity, modularity and unidirectional data flow, this structure is well suited to VLSI implementations. The length of the largest delay path and area of this architecture are less compared to the bit parallel systolic multiplication architectures reported earlier. The architecture is implemented using Austria Micro System's 0.35 m CMOS (complementary metal oxide semiconductor) technology. This architecture can also operate over both the dual-base and polynomial base

espace@Curtin

Group implicit concurrent algorithms in nonlinear structural dynamics

Author: Ortiz M.
Sotelino E. D.
Publication venue
Publication date
Field of study

During the 70's and 80's, considerable effort was devoted to developing efficient and reliable time stepping procedures for transient structural analysis. Mathematically, the equations governing this type of problems are generally stiff, i.e., they exhibit a wide spectrum in the linear range. The algorithms best suited to this type of applications are those which accurately integrate the low frequency content of the response without necessitating the resolution of the high frequency modes. This means that the algorithms must be unconditionally stable, which in turn rules out explicit integration. The most exciting possibility in the algorithms development area in recent years has been the advent of parallel computers with multiprocessing capabilities. So, this work is mainly concerned with the development of parallel algorithms in the area of structural dynamics. A primary objective is to devise unconditionally stable and accurate time stepping procedures which lend themselves to an efficient implementation in concurrent machines. Some features of the new computer architecture are summarized. A brief survey of current efforts in the area is presented. A new class of concurrent procedures, or Group Implicit algorithms is introduced and analyzed. The numerical simulation shows that GI algorithms hold considerable promise for application in coarse grain as well as medium grain parallel computers

NASA Technical Reports Server

A VLSI synthesis of a Reed-Solomon processor for digital communication systems

Author: Chose Philemon John
Publication venue: Memorial University of Newfoundland
Publication date: 01/01/1998
Field of study

The Reed-Solomon codes have been widely used in digital communication systems such as computer networks, satellites, VCRs, mobile communications and high- definition television (HDTV), in order to protect digital data against erasures, random and burst errors during transmission. Since the encoding and decoding algorithms for such codes are computationally intensive, special purpose hardware implementations are often required to meet the real time requirements. -- One motivation for this thesis is to investigate and introduce reconfigurable Galois field arithmetic structures which exploit the symmetric properties of available architectures. Another is to design and implement an RS encoder/decoder ASIC which can support a wide family of RS codes. -- An m-programmable Galois field multiplier which uses the standard basis representation of the elements is first introduced. It is then demonstrated that the exponentiator can be used to implement a fast inverter which outperforms the available inverters in GF(2m). Using these basic structures, an ASIC design and synthesis of a reconfigurable Reed-Solomon encoder/decoder processor which implements a large family of RS codes is proposed. The design is parameterized in terms of the block length n, Galois field symbol size m, and error correction capability t for the various RS codes. The design has been captured using the VHDL hardware description language and mapped onto CMOS standard cells available in the 0.8-µm BiCMOS design kits for Cadence and Synopsys tools. The experimental chip contains 218,206 logic gates and supports values of the Galois field symbol size m = 3,4,5,6,7,8 and error correction capability t = 1,2,3, ..., 16. Thus, the block length n is variable from 7 to 255. Error correction t and Galois field symbol size m are pin-selectable. -- Since low design complexity and high throughput are desired in the VLSI chip, the algebraic decoding technique has been investigated instead of the time or transform domain. The encoder uses a self-reciprocal generator polynomial which structures the codewords in a systematic form. At the beginning of the decoding process, received words are initially stored in the first-in-first-out (FIFO) buffer as they enter the syndrome module. The Berlekemp-Massey algorithm is used to determine both the error locator and error evaluator polynomials. The Chien Search and Forney's algorithms operate sequentially to solve for the error locations and error values respectively. The error values are exclusive or-ed with the buffered messages in order to correct the errors, as the processed data leave the chip

Memorial University Research Repository

An FPGA Implementation of a Montgomery Multiplier Over GF(2^m)

Author: Mentens Nele
Ors Siddika Berna
Preneel Bart
Vandewalle Joos
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 20/02/2012
Field of study

This paper describes an efficient FPGA implementation for modular multiplication in the finite field GF(2^m) that is suitable for implementing Elliptic Curve Cryptosystems. We have developed a systolic array implementation of a~Montgomery modular multiplication. Our solution is efficient for large finite fields (m=160-193), that offer a high security level, and it can be scaled easily to larger values of m. The clock frequency of the implementation is independent of the field size. In contrast to earlier work, the design is not restricted to field representations using irreducible trinomials, all one polynomials or equally spaced polynomials

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Highly parallel sparse Cholesky factorization

Author: Gilbert John R.
Schreiber Robert
Publication venue
Publication date
Field of study

Several fine grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and it is used to analyze the algorithms

NASA Technical Reports Server