
8 research outputs found

FPGA Implementation of Double Precision Floating Point Multiplier

Author: Abdullah Mohd.
Chourasia Bharti
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/12/2022

High-speed computation is a requirement of today’s generation of processors. To accomplish this, many functions are implemented in the processor's hardware rather than computed in software. The majority of the operations a processor executes are arithmetic operations, which are widely used in applications that require heavy mathematical computation, such as scientific calculations and image and signal processing. In signal processing especially, multiplication and division operations are widely used. The major issue with these operations in hardware is that many iterations are required, which results in slow operation, while fast algorithms require complex computations within each cycle. A division operation yields either a quotient and remainder or a floating-point number, which is the major reason it is more complex than multiplication.

International Journal on Recent and Innovation Trends in Computing and Communication

Implementation of 8 Point FFT with IEEE 754 Floating Format for OFDM System

Author: Yogesh Sharma, Preeti Mankar, Yogesh Gaidhane, Atul Borkar
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/07/2014

No abstract

International Journal on Recent and Innovation Trends in Computing and Communication

Fault Tolerance in Reversible Logic Circuits and Quantum Cost Optimization

Author: Arunachalam Kamaraj
Perumalsamy Marichamy
Ponnusamy Kaviyashri K.
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 25/03/2021

Energy dissipation is a prominent factor in very large scale integrated (VLSI) circuits. Reversible logic-based circuits are capable of computing logic without energy dissipation. Accordingly, reversible circuits are an emerging domain of research because of their low energy dissipation. At the nano-level of design, the critical factor in the logic computing paradigm is the fault. The proposed fault coverage methodology is powerful for testability. In this article, we target three factors, namely fault tolerance, fault coverage and fault detection, in the reversible KMD Gates. Our analysis provides good evidence that the minimum test vector achieves 100% fault coverage and 50% fault tolerance in the KMD Gate. Further, we show a comparison between the quantum equivalent and the controlled V and V+ gates in all types of KMD Gates. With the proposed methodology, the controlled V and V+ gate based ALU, divider and Vedic multiplier show a significant reduction in quantum cost. The comparative results for designs such as the Vedic multiplier, division unit and ALU are obtained and analyzed, showing significant improvement in quantum cost.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Design of efficient reversible floating-point arithmetic unit on field programmable gate array platform and its performance analysis

Author: Bhandari Gajanan Sangeetha
Sanjeevaiah Girija
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/02/2023

Reversible logic gates are used to reduce power dissipation in modern computer applications. Floating-point numbers with reversible features are an added advantage for performing complex algorithms with high-performance computations. This manuscript implements an efficient reversible floating-point arithmetic (RFPA) unit, and its performance metrics are examined in detail. The RFP adder/subtractor (A/S), RFP multiplier, and RFP divider units are designed as part of the RFPA unit. The RFPA unit is designed using basic reversible gates. The mantissa part of the RFP multiplier is created using a 24×24 Wallace tree multiplier, while the reciprocal unit of the RFP divider is designed using the Newton-Raphson method. The RFPA unit and its submodules are executed in parallel, each utilizing one clock cycle. The RFPA unit and its submodules are synthesized separately in the Vivado IDE environment, and implementation results are obtained on an Artix-7 field programmable gate array (FPGA). The RFPA unit utilizes only 18.44% of slice look-up tables (LUTs) and consumes 0.891 W total power on the Artix-7 FPGA. The RFPA unit submodules are compared with existing approaches and show better performance metrics and chip resource utilization.

ZENODO

Institute of Advanced Engineering and Science
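
The abstract above mentions a reciprocal unit based on the Newton-Raphson method. The following is a minimal software sketch of that iteration, not the paper's reversible hardware design. The function names, the linear initial estimate and the iteration count are assumptions chosen for illustration.

```python
import math


def reciprocal_newton_raphson(d: float, iterations: int = 4) -> float:
    """Approximate 1/d with the Newton-Raphson step x <- x * (2 - m * x)."""
    if d <= 0.0:
        raise ValueError("this sketch only handles positive divisors")
    # Split d = m * 2**e with 0.5 <= m < 1, similar to normalizing a mantissa.
    m, e = math.frexp(d)
    # Linear initial estimate of 1/m on [0.5, 1) (an assumption of this sketch).
    x = 48.0 / 17.0 - (32.0 / 17.0) * m
    for _ in range(iterations):
        # Each step roughly doubles the number of correct bits.
        x = x * (2.0 - m * x)
    # Undo the scaling: 1/d = (1/m) * 2**(-e).
    return math.ldexp(x, -e)


def divide(n: float, d: float) -> float:
    """Divide by multiplying the numerator with the approximated reciprocal."""
    return n * reciprocal_newton_raphson(d)
```

A divider built this way replaces the division with a few multiplications and subtractions, which is why the reciprocal unit is often paired with an existing multiplier.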

FPGA Implementation of Fast Binary Multiplication Based on Customized Basic Cells

Author: Abd Al-Rahman Al-Nounou
Fadi Obeidat
Mohammad Al-Khaleel
Osama Al-Khaleel
Publication venue: 'Pensoft Publishers'
Publication date: 01/01/2022

Multiplication is considered one of the most time-consuming operations and a key operation in a wide variety of embedded applications. Speeding up this operation has a significant impact on the overall performance of these applications. A vast number of multiplication approaches are found in the literature, where the goal is always to achieve higher performance. One of these approaches relies on using smaller multiplier blocks, built from direct Boolean algebra equations, to build large multipliers. In this work, we present a methodology for designing binary multipliers in which customized partial products generation (CPPG) cells of different sizes are designed and used as smaller building blocks. The sizes of the designed CPPG cells are 2×2, 3×3, 4×4, 5×5, and 6×6. We use these cells to build 8×8, 16×16, 32×32, 64×64, and 128×128 binary multipliers. All of the CPPG cells and the binary multipliers are described in VHDL, tested, and implemented using XILINX ISE 14.6 tools targeting different FPGA families. The implementation results show that the best performance is achieved when the 3×3 cell is used and the Virtex-7 FPGA is targeted. The binary multipliers designed using the proposed CPPG cells achieve better performance than the binary multipliers presented in the literature. As an application that utilizes the proposed multiplier, a Multiply-Accumulate (MAC) unit is designed and implemented in Spartan-3E. The implementation results of the MAC unit demonstrate the effectiveness of the proposed multiplier.

ZENODO

Directory of Open Access Journals

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

ARPHA OAI-PMH Endpoint

ARPHA Preprints
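
The idea of composing a large multiplier from small cells, as described in the abstract above, can be sketched in software. This is only an illustration of the decomposition: the abstract does not give the cells' Boolean equations or the adder arrangement, so `small_cell_multiply` and `block_multiply` are hypothetical names and the cell is modeled with an ordinary integer product.

```python
def small_cell_multiply(a: int, b: int, cell_bits: int) -> int:
    """Stand-in for one cell_bits x cell_bits cell (hypothetical model)."""
    limit = 1 << cell_bits
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError("cell operands must fit in cell_bits bits")
    return a * b


def block_multiply(x: int, y: int, width: int, cell_bits: int = 3) -> int:
    """Multiply two unsigned width-bit integers using only small cells."""
    if not (0 <= x < (1 << width) and 0 <= y < (1 << width)):
        raise ValueError("operands must fit in width bits")
    mask = (1 << cell_bits) - 1
    chunks = -(-width // cell_bits)  # ceiling division
    result = 0
    for i in range(chunks):
        xi = (x >> (i * cell_bits)) & mask
        for j in range(chunks):
            yj = (y >> (j * cell_bits)) & mask
            # Each cell product is weighted by the position of its chunks.
            partial = small_cell_multiply(xi, yj, cell_bits)
            result += partial << ((i + j) * cell_bits)
    return result
```

For example, `block_multiply(200, 123, width=8, cell_bits=3)` splits each operand into three 3-bit chunks and sums nine shifted cell products. In hardware the summation would be done by an adder network, which this sketch does not model.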

Null convention logic circuits for asynchronous computer architecture

Author: Kim M
Publication venue: RMIT University

For most of its history, computer architecture has been able to benefit from a rapid scaling in semiconductor technology, resulting in continuous improvements to CPU design. During that period, synchronous logic has dominated because of its inherent ease of design and abundant tools. However, with the scaling of semiconductor processes into deep sub-micron and then to nano-scale dimensions, computer architecture is hitting a number of roadblocks such as high power and increased process variability. Asynchronous techniques can potentially offer many advantages compared to conventional synchronous design, including average case vs. worst case performance, robustness in the face of process and operating point variability and the ready availability of high performance, fine grained pipeline architectures. Of the many alternative approaches to asynchronous design, Null Convention Logic (NCL) has the advantage that its quasi delay-insensitive behavior makes it relatively easy to set up complex circuits without the need for exhaustive timing analysis. This thesis examines the characteristics of an NCL based asynchronous RISC-V CPU and analyses the problems with applying NCL to CPU design. While a number of university and industry groups have previously developed small 8-bit microprocessor architectures using NCL techniques, it is still unclear whether these offer any real advantages over conventional synchronous design. A key objective of this work has been to analyse the impact of larger word widths and more complex architectures on NCL CPU implementations. The research commenced by re-evaluating existing techniques for implementing NCL on programmable devices such as FPGAs. The little work that has been undertaken previously on FPGA implementations of asynchronous logic has been inconclusive and seems to indicate that asynchronous systems cannot be easily implemented in these devices.
However, most of this work related to an alternative technique called bundled data, which is not well suited to FPGA implementation because of the difficulty in controlling and matching delays in a 'bundle' of signals. On the other hand, this thesis clearly shows that such applications are not only possible with NCL, but there are some distinct advantages in being able to prototype complex asynchronous systems in a field-programmable technology such as the FPGA. A large part of the value of NCL derives from its architectural level behavior, inherent pipelining, and optimization opportunities such as the merging of register and combinational logic functions. In this work, a number of NCL multiplier architectures have been analyzed to reveal the performance trade-offs between various non-pipelined, 1D and 2D organizations. Two-dimensional pipelining can easily be applied to regular architectures such as array multipliers in a way that is both high performance and area-efficient. It was found that the performance of 2D pipelining for small networks such as multipliers is around 260% faster than the equivalent non-pipelined design. However, the design uses 265% more transistors so the methodology is mainly of benefit where performance is strongly favored over area. A pipelined 32-bit × 32-bit signed Baugh-Wooley multiplier with Wallace-Tree Carry Save Adders (CSA), which is representative of a real design used for CPUs and DSPs, was used to further explore this concept as it is faster and has fewer pipeline stages compared to the normal array multiplier using Ripple-Carry adders (RCA). It was found that 1D pipelining with ripple-carry chains is an efficient implementation option but becomes less so for larger multipliers, due to the completion logic for which the delay time depends largely on the number of bits involved in the completion network.
The average-case performance of ripple-carry adders was explored using random input vectors and it was observed that it offers little advantage on the smaller multiplier blocks, but this particular timing characteristic of asynchronous design styles becomes increasingly more important as word size grows. Finally, this research has resulted in the development of the first 32-Bit asynchronous RISC-V CPU core. Called the Redback RISC, the architecture is a structure of pipeline rings composed of computational oscillations linked with flow completeness relationships. It has been written using NELL, a commercial description/synthesis tool that outputs standard Verilog. The Redback has been analysed and compared to two approximately equivalent industry standard 32-Bit synchronous RISC-V cores (PicoRV32 and Rocket) that are already fabricated and used in industry. While the NCL implementation is larger than both commercial cores it has similar performance and lower power compared to the PicoRV32. The implementation results were also compared against an existing NCL design tool flow (UNCLE), which showed how much the results of these implementation strategies differ. The Redback RISC has achieved a similar level of throughput and 43% better power and 34% better energy compared to one of the synchronous cores with the same benchmark test and test conditions such as input supply voltage. However, it was shown that area is the biggest drawback for NCL CPU design. The core is roughly 2.5× larger than synchronous designs. On the other hand its area is still 2.9× smaller than previous designs using UNCLE tools. The area penalty is largely due to the unavoidable translation into a dual-rail topology when using the standard NCL cell library.

RMIT Research Repository
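
The dual-rail topology and completion logic mentioned in the abstract above can be illustrated with a small software model. This is a sketch under assumptions: the rail naming (`rail0`, `rail1`) follows the common dual-rail convention and is not taken from the thesis, and the threshold gates with hysteresis used in real NCL cell libraries are not modeled.

```python
NULL = (0, 0)   # neither rail asserted: no data present
DATA0 = (1, 0)  # rail0 asserted: logic 0
DATA1 = (0, 1)  # rail1 asserted: logic 1


def encode_dual_rail(value: int, width: int) -> list:
    """Encode an unsigned integer as (rail0, rail1) pairs, LSB first."""
    return [DATA1 if (value >> i) & 1 else DATA0 for i in range(width)]


def is_null(word: list) -> bool:
    """True when every bit is NULL, i.e. the spacer between data words."""
    return all(pair == NULL for pair in word)


def is_complete_data(word: list) -> bool:
    """Completion detection: every bit has exactly one rail asserted."""
    return all(pair in (DATA0, DATA1) for pair in word)


def decode_dual_rail(word: list) -> int:
    """Recover the integer once the whole word has arrived."""
    if not is_complete_data(word):
        raise ValueError("word is not a complete DATA wavefront")
    return sum(rail1 << i for i, (_, rail1) in enumerate(word))
```

Every logical bit occupies two wires, which is consistent with the area penalty the abstract attributes to the dual-rail translation. The completion check must inspect every bit, which matches the abstract's remark that completion delay grows with the number of bits involved.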

Low power single precision BCD floating–point Vedic multiplier

Author: Anjana
Anjana
Ashwath
Bansal
Bhavesh
Bisoyi
Gonzalez-Navarro
Gowreesrinivas
Gupta
Havaldar
Jaberipur
Jais
Liu
Mahakalkar
Mehta
Mittal
Patil
Ramalatha
Saokar
Seo
Tripathy
Vazquez
Veeramachaneni
Wang
Publication venue: 'Elsevier BV'

Crossref

Water productivity indices of the soybean grown on silty clay soil under sprinkler irrigation

Author: Gajić Boško
Kresović Branka
Tapanarova Angelina
Životić Ljubomir
Publication venue: East Sarajevo: Faculty of Agriculture
Publication date: 01/01/2017

The objective of this research was to compare the effects of different irrigation treatments on soybean [Glycine max (L.) Merr.] productivity and water use efficiency on experimental fields of the Maize Research Institute of Zemun Polje (Serbia) in 2007 and 2008. Four irrigation levels were investigated: full irrigation (I100), 65% and 40% of I100 (I65 and I40) and a rain-fed (I0) system. The crop water use efficiency (CWUE, also known as crop water productivity, CWP), irrigation water use efficiency (IWUE) and evapotranspiration water use efficiency (ETWUE) were used to assess the water productivity of each studied treatment. The efficiency of the same treatment differed between the years, as it depended on seasonal water availability, weather conditions and their impact on seed yields. Maximum and minimum yields were obtained in the I65 and I0 treatments, averaging 3.41 t ha⁻¹ and 2.26 t ha⁻¹, respectively. Water use efficiency values were influenced by the irrigation levels. In general, CWUE values increased with the level of irrigation. In both growing seasons, IWUE and ETWUE decreased with increasing seasonal water consumption and irrigation depth. On average, treatments I40 and I65 resulted in similar or higher CWUE and ETWUE than I100 in both growing seasons. I65 resulted in the highest IWUE, averaged over the two seasons, while I100 had the lowest IWUE. I65 could be suitable for soybean irrigated in Vojvodina when there is no water shortage, and I40 could be used as a good basis for developing a reduced sprinkler irrigation strategy under water shortage.

AgroSpace