Search CORE

358 research outputs found

A VLSI synthesis of a Reed-Solomon processor for digital communication systems

Author: Chose Philemon John
Publication venue: Memorial University of Newfoundland
Publication date: 01/01/1998
Field of study

The Reed-Solomon codes have been widely used in digital communication systems such as computer networks, satellites, VCRs, mobile communications and high- definition television (HDTV), in order to protect digital data against erasures, random and burst errors during transmission. Since the encoding and decoding algorithms for such codes are computationally intensive, special purpose hardware implementations are often required to meet the real time requirements. -- One motivation for this thesis is to investigate and introduce reconfigurable Galois field arithmetic structures which exploit the symmetric properties of available architectures. Another is to design and implement an RS encoder/decoder ASIC which can support a wide family of RS codes. -- An m-programmable Galois field multiplier which uses the standard basis representation of the elements is first introduced. It is then demonstrated that the exponentiator can be used to implement a fast inverter which outperforms the available inverters in GF(2m). Using these basic structures, an ASIC design and synthesis of a reconfigurable Reed-Solomon encoder/decoder processor which implements a large family of RS codes is proposed. The design is parameterized in terms of the block length n, Galois field symbol size m, and error correction capability t for the various RS codes. The design has been captured using the VHDL hardware description language and mapped onto CMOS standard cells available in the 0.8-µm BiCMOS design kits for Cadence and Synopsys tools. The experimental chip contains 218,206 logic gates and supports values of the Galois field symbol size m = 3,4,5,6,7,8 and error correction capability t = 1,2,3, ..., 16. Thus, the block length n is variable from 7 to 255. Error correction t and Galois field symbol size m are pin-selectable. -- Since low design complexity and high throughput are desired in the VLSI chip, the algebraic decoding technique has been investigated instead of the time or transform domain. The encoder uses a self-reciprocal generator polynomial which structures the codewords in a systematic form. At the beginning of the decoding process, received words are initially stored in the first-in-first-out (FIFO) buffer as they enter the syndrome module. The Berlekemp-Massey algorithm is used to determine both the error locator and error evaluator polynomials. The Chien Search and Forney's algorithms operate sequentially to solve for the error locations and error values respectively. The error values are exclusive or-ed with the buffered messages in order to correct the errors, as the processed data leave the chip

Memorial University Research Repository

Area-Optimized Fully-Flexible BCH Decoder for Multiple GF Dimensions

Author: Lee Youngjoo
Park Byeonggil
Park Jongsun
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Recently, there are increasing demands for fully flexible Bose Chaudhuri Hocquenghem (BCH) decoders, which can support different dimensions of Galois fields (GF) operations. As the previous BCH decoders are mainly targeting the fixed GF operations, the conventional techniques are no longer suitable for multiple GF dimensions. For the area-optimized flexible BCH decoders, in this paper, we present several optimization schemes for reducing hardware costs of multi-dimensional GF operations. In the proposed optimizations, we first reformulate the matrix operations in syndrome calculation and Chien search for sharing more common sub-expressions between GF operations having different dimensions. The cell based multi-m GF multiplier is newly introduced for the area-efficient flexible key-equation solver. As case studies, we design several prototype flexible BCH decoders for digital video broadcasting systems and NAND flash memory controllers managing different page sizes. The implementation results show that the proposed fully-flexible BCH decoder architecture remarkably enhances the area-efficiency compared with the conventional solutions.112Ysciescopu

Crossref

포항공과대학교

Towards a triple mode common operator FFT for Software Radio systems

Author: Al Ghouwayel Ali
El-Bazzal Zouhair
Haj-Ali Amin
Louët Yves
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2012
Field of study

International audienceA scenario to design a Triple Mode FFT is addressed. Based on a Dual Mode FFT structure, we present a methodology to reach a triple mode FFT operator (TMFFT) able to operate over three different fields: complex number domain C, Galois Fields GF(Ft) and GF(2m). We propose a reconfigurable Triple mode Multiplier that constitutes the core of the Butterflybased FFT. A scalable and flexible unit for the polynomial reduction needed in the GF(2m) multiplication is also proposed. An FPGA implementation of the proposed multiplier is given and the measures show a gain of 18%in terms of performance-to-cost ratio compared to a "Velcro" approach where two self-contained operators are implemented separately

Synthesis Optimization on Galois-Field Based Arithmetic Operators for Rijndael Cipher

Author: Mursanto Petrus
Publication venue: LPPM ITBis Lembah Dempo
Publication date: 01/08/2011
Field of study

A series of experiments has been conducted to show that FPGA synthesis of Galois-Field (GF) based arithmetic operators can be optimized automatically to improve Rijndael Cipher throughput. Moreover, it has been demonstrated that efficiency improvement in GF operators does not directly correspond to the system performance at application level. The experiments were motivated by so many research works that focused on improving performance of GF operators. Each of the variants has the most efficient form in either time (fastest) or space (smallest occupied area) when implemented in FPGA chips. In fact, GF operators are not utilized individually, but rather integrated one to the others to implement algorithms. Contribution of this paper is to raise issue on GF-based application performance and suggest alternative aspects that potentially affect it. Instead of focusing on GF operator efficiency, system characteristics are worth considered in optimizing application performance

CiteSeerX

Journal of ICT Research and Applications

Directory of Open Access Journals

ITB Journal

Synthesis Optimization on Galois-Field Based Arithmetic Operators for Rijndael Cipher

Author
Publication venue: 'The Institute for Research and Community Services (LPPM) ITB'
Publication date: 01/01/2011
Field of study

Crossref

Speeding up a scalable modular inversion hardware architecture

Author: Gutub AA-A
Kalganova T
Publication venue: KFUPM
Publication date: 01/01/2005
Field of study

The modular inversion is a fundamental process in several cryptographic systems. It can be computed in software or hardware, but hardware computation proven to be faster and more secure. This research focused on improving an old scalable inversion hardware architecture proposed in 2004 for finite field GF(p). The architecture has been made of two parts, a computing unit and a memory unit. The memory unit is to hold all the data bits of computation whereas the computing unit performs all the arithmetic operations in word (digit) by word bases known as scalable method. The main objective of this project was to investigate the cost and benefit of modifying the memory unit to include parallel shifting, which was one of the tasks of the scalable computing unit. The study included remodeling the entire hardware architecture removing the shifter from the scalable computing part embedding it in the memory unit instead. This modification resulted in a speedup to the complete inversion process with an area increase due to the new memory shifting unit. Quantitative measurements of the speed area trade-off have been investigated. The results showed that the extra hardware to be added for this modification compared to the speedup gained, giving the user the complete picture to choose from depending on the application need.the British council in Saudi Arabia, KFUPM, Dr. Tatiana Kalganova at the Electrical & Computer Engineering Department of Brunel University in Uxbridg

CiteSeerX

Brunel University Research Archive

An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics

Author: Benini Luca
Conti Francesco
Gautschi Michael
Gürkaynak Frank Kagan
Haugou Germain
Loi Igor
Mangard Stefan
Muehlberghuber Michael
Pullini Antonio
Rossi Davide
Schiavone Pasquale Davide
Schilling Robert
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/04/2017
Field of study

Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a System-on-Chip based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65nm technology, consumes less than 20mW on average at 0.8V achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to 25MIPS/mW in software. As a strong argument for real-life flexible application of our platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with secured remote recognition in 5.74pJ/op; and seizure detection with encrypted data collection from EEG within 12.7pJ/op.Comment: 15 pages, 12 figures, accepted for publication to the IEEE Transactions on Circuits and Systems - I: Regular Paper

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A common operator for FFT and FEC decoding

Author: Alaus Laurent
Louët Yves
Naoues Malek
Noguet Dominique
Publication venue: 'Elsevier BV'
Publication date: 01/11/2011
Field of study

International audienceIn the Software Radio context, the parametrization is becoming an important topic especially when it comes to multistandard designs. This paper capitalizes on the Common Operator technique to present new common structures for the FFT and FEC decoding algorithms. A key benefit of exhibiting common operators is the regular architecture it brings when implemented in a Common Operator Bank (COB). This regularity makes the architecture open to future function mapping and adapted to accommodated silicon technology variability through dependable design

Hardware Implementation of Bit-Parallel Finite Field Multipliers Based on Overlap-free Algorithm on FPGA

Author: Pan Meitong
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2019
Field of study

Cryptography can be divided into two fundamentally different classes: symmetric-key and public-key. Compared with symmetric-key cryptography, where the complexity of the security system relies on a single key between receiver and sender, public-key cryptographic system using two separate but mathematically related keys. Finite field multiplication is a key operation used in all cryptographic systems relied on finite field arithmetic as it not only is computationally complex but also one of the most frequently used finite field operations. Karatsuba algorithm and its generalization are most often used to construct multiplication architectures with significantly improved in these decades. However, one of its optimized architecture called Overlap-free Karatsuba algorithm has been mentioned by fewer people and even its implementation on FPGA has not been mentioned by anyone. After completion of a detailed study of this specific algorithm, this thesis has proposed implementation of modified Overlap-free Karatsuba algorithm on Xilinx Spartan-605. Applied this algorithm and its specific architecture, reduced gates or shorten critical path will be achieved for the given value of n.Optimized multiplication architecture, generated from proposed modified Overlap-free Karatsuba algorithm and applied on FPGA board,over NIST recommended fields (n = 128), are presented and analysed in detail. Compared with existing works with sub-quadratic space and time complexities, the proposed modified algorithm is highly recommended module and have improved on both space and time complexities. At last, generalization of proposed modified algorithm is suitable for much larger size of finite fields, and improvements of FPGA itself have been discussed

Scholarship at UWindsor

Variable-Rate VLSI Architecture for 400-Gb/s Hard-Decision Product Decoder

Author: Fougstedt Christoffer
Jain Vikram
Larsson-Edefors Per
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021
Field of study

Variable-rate transceivers, which adapt to the conditions, will be central to energy-efficient communication. However, fiber-optic communication systems with high bit-rate requirements make design of flexible transceivers challenging, since additional circuits needed to orchestrate the flexibility will increase area and degrade speed. We propose a variable-rate VLSI architecture of a forward error correction (FEC) decoder based on hard-decision product codes. Variable shortening of component codes provides a mechanism by which code rate can be varied, the number of iterations offers a knob to control the coding gain, while a key-equation solver module that can swap between error-locator polynomial coefficients provides a means to change error correction capability. Our evaluations based on 28-nm netlists show that a variable-rate decoder implementation can offer a net coding gain (NCG) range of 9.96-10.38 dB at a post-FEC bit-error rate of 10^-15. The decoder achieves throughputs in excess of 400 Gb/s, latencies below 53 ns, and energy efficiencies of 1.14 pJ/bit or less. While the area of the variable-rate decoder is 31% larger than a decoder with a fixed rate, the power dissipation is a mere 5% higher. The variable error correction capability feature increases the NCG range further, to above 10.5 dB, but at a significant area cost

Chalmers Research