1,754 research outputs found

    Hardware Implementation of the GPS authentication

    Get PDF
    In this paper, we explore new area/throughput trade- offs for the Girault, Poupard and Stern authentication protocol (GPS). This authentication protocol was selected in the NESSIE competition and is even part of the standard ISO/IEC 9798. The originality of our work comes from the fact that we exploit a fixed key to increase the throughput. It leads us to implement GPS using the Chapman constant multiplier. This parallel implementation is 40 times faster but 10 times bigger than the reference serial one. We propose to serialize this multiplier to reduce its area at the cost of lower throughput. Our hybrid Chapman's multiplier is 8 times faster but only twice bigger than the reference. Results presented here allow designers to adapt the performance of GPS authentication to their hardware resources. The complete GPS prover side is also integrated in the network stack of the PowWow sensor which contains an Actel IGLOO AGL250 FPGA as a proof of concept.Comment: ReConFig - International Conference on ReConFigurable Computing and FPGAs (2012

    Hardware emulation of stochastic p-bits for invertible logic

    Full text link
    The common feature of nearly all logic and memory devices is that they make use of stable units to represent 0's and 1's. A completely different paradigm is based on three-terminal stochastic units which could be called "p-bits", where the output is a random telegraphic signal continuously fluctuating between 0 and 1 with a tunable mean. p-bits can be interconnected to receive weighted contributions from others in a network, and these weighted contributions can be chosen to not only solve problems of optimization and inference but also to implement precise Boolean functions in an inverted mode. This inverted operation of Boolean gates is particularly striking: They provide inputs consistent to a given output along with unique outputs to a given set of inputs. The existing demonstrations of accurate invertible logic are intriguing, but will these striking properties observed in computer simulations carry over to hardware implementations? This paper uses individual micro controllers to emulate p-bits, and we present results for a 4-bit ripple carry adder with 48 p-bits and a 4-bit multiplier with 46 p-bits working in inverted mode as a factorizer. Our results constitute a first step towards implementing p-bits with nano devices, like stochastic Magnetic Tunnel Junctions

    High accuracy computation with linear analog optical systems: a critical study

    Get PDF
    High accuracy optical processors based on the algorithm of digital multiplication by analog convolution (DMAC) are studied for ultimate performance limitations. Variations of optical processors that perform high accuracy vector-vector inner products are studied in abstract and with specific examples. It is concluded that the use of linear analog optical processors in performing digital computations with DMAC leads to impractical requirements for the accuracy of analog optical systems and the complexity of postprocessing electronics

    A general framework for efficient FPGA implementation of matrix product

    Get PDF
    Original article can be found at: http://www.medjcn.com/ Copyright Softmotor LimitedHigh performance systems are required by the developers for fast processing of computationally intensive applications. Reconfigurable hardware devices in the form of Filed-Programmable Gate Arrays (FPGAs) have been proposed as viable system building blocks in the construction of high performance systems at an economical price. Given the importance and the use of matrix algorithms in scientific computing applications, they seem ideal candidates to harness and exploit the advantages offered by FPGAs. In this paper, a system for matrix algorithm cores generation is described. The system provides a catalog of efficient user-customizable cores, designed for FPGA implementation, ranging in three different matrix algorithm categories: (i) matrix operations, (ii) matrix transforms and (iii) matrix decomposition. The generated core can be either a general purpose or a specific application core. The methodology used in the design and implementation of two specific image processing application cores is presented. The first core is a fully pipelined matrix multiplier for colour space conversion based on distributed arithmetic principles while the second one is a parallel floating-point matrix multiplier designed for 3D affine transformations.Peer reviewe

    Implementation of arithmetic primitives using truly deep submicron technology (TDST)

    Get PDF
    The invention of the transistor in 1947 at Bell Laboratories revolutionised the electronics industry and created a powerful platform for emergence of new industries. The quest to increase the number of devices per chip over the last four decades has resulted in rapid transition from Small-Scale-Integration (SSI) and Large-Scale-lntegration (LSI), through to the Very-Large-Scale-Integration (VLSI) technologies, incorporating approximately 10 to 100 million devices per chip. The next phase in this evolution is the Ultra-Large-Scale-Integration (ULSI) aiming to realise new application domains currently not accessible to CMOS technology. Although technology is continuously evolving to produce smaller systems with minimised power dissipation, the IC industry is facing major challenges due to constraints on power density (W/cm2) and high dynamic (operating) and static (standby) power dissipation. Mobile multimedia communication and optical based technologies have rapidly become a significant area of research and development challenging a variety of technological fronts. The future emergence or 4G (4th Generation) wireless communications networks is further driving this development, requiring increasing levels of media rich content. The processing requirements for capture, conversion, compression, decompression, enhancement and display of higher quality multimedia, place heavy demands on current ULSI systems. This is also apparent for mobile applications and intelligent optical networks where silicon chip area and power dissipation become primary considerations. In addition to the requirements for very low power, compact size and real-time processing, the rapidly evolving nature of telecommunication networks means that flexible soft programmable systems capable of adaptation to support a number of different standards and/or roles become highly desirable. In order to fully realise the capabilities promised by the 4G and supporting intelligent networks, new enabling technologies arc needed to facilitate the next generation of personal communications devices. Most of the current solutions to meet these challenges are based on various implementations of conventional architectures. For decades, silicon has been the main platform of computing, however it is slow, bulky, runs too hot, and is too expensive. Thus, new approaches to architectures, driving multimedia and future telecommunications systems, are needed in order to extend the life cycle of silicon technology. The emergence of Truly Deep Submicron Technology (TDST) and related 3-D interconnection technologies have provided potential alternatives from conventional architectures to 3-D system solutions, through integration of IDST, Vertical Software Mapping and Intelligent Interconnect Technology (IIT). The concept of Soft-Chip Technology (SCT) entails integration of Soft• Processing Circuits with Soft-Configurable Circuits . This concept can effectively manipulate hardware primitives through vertical integration of control and data. Thus the notion of 3-D Soft-Chip emerges as a new design algorithm for content-rich multimedia, telecommunication and intelligent networking system applications. 3•D architectures (design algorithms used suitable for 3-D soft-chip technology), are driven by three factors. The first is development of new device technology (TDST) that can support new architectures with complexities of 100M to 1000M devices. The second is development of advanced wafer bonding techniques such as Indium bump and the more futuristic optical interconnects for 3-D soft-chip mapping. The third is related to improving the performance of silicon CMOS systems as devices continue to scale down in dimensions. One of the fundamental building blocks of any computer system is the arithmetic component. Optimum performance of the system is determined by the efficiency of each individual component, as well as the network as a whole entity. Development of configurable arithmetic primitives is the fundamental focus in 3-D architecture design where functionality can be implemented through soft configurable hardware elements. Therefore the ability to improve the performance capability of a system is of crucial importance for a successful design. Important factors that predict the efficiency of such arithmetic components are: • The propagation delay of the circuit, caused by the gate, diffusion and wire capacitances within !he circuit, minimised through transistor sizing. and • Power dissipation, which is generally based on node transition activity. [2] Although optimum performance of 3-D soft-chip systems is primarily established by the choice of basic primitives such as adders and multipliers, the interconnecting network also has significant degree of influence on !he efficiency of the system. 3-D superposition of devices can decrease interconnect delays by up to 60% compared to a similar planar architecture. This research is based on development and implementation of configurable arithmetic primitives, suitable to the 3-D architecture, and has these foci: • To develop a variety of arithmetic components such as adders and multipliers with particular emphasis on minimum area and compatible with 3-D soft-chip design paradigm. • To explore implementation of configurable distributed primitives for arithmetic processing. This entails optimisation of basic primitives, and using them as part of array processing. In this research the detailed designs of configurable arithmetic primitives are implemented using TDST O.l3µm (130nm) technology, utilising CAD software such as Mentor Graphics and Cadence in Custom design mode, carrying through design, simulation and verification steps

    NEW APPROACH OF SIGNED BINARY NUMBERS MULTIPLICATION AND ITS IMPLEMENTATION ON FPGA

    Get PDF
    This paper proposes a new model of signed binary multiplication. This model is formulated mathematically and can handle four types of binary multipliers: signed positive numbers multiplied by signed positive numbers (SPN-by-SPN); signed positive numbers multiplied by signed negative numbers (SPN-by-SNN); signed negative numbers multiplied by signed positive numbers (SNN-by-SPN); and signed negative numbers multiplied by signed negative numbers (SNN-by-SNN). The proposed model has a low complexity algorithm, is easy to implement in software coding and integrated in a hardware FPGA (Field-Programmable Gate Array), and is more powerful compared to the modified Baugh-Wooley's model

    Performance Analysis of Modified Lifting Based DWT Architecture and FPGA Implementation for Speed and Power

    Get PDF
    Demand for high speed and low power architecture for DWT computation have led to design of novel algorithms and architecture In this paper we design model and implement a hardware efficient high speed and power efficient DWT architecture based on modified lifting scheme algorithm The design is interfaced with SIPO and PISO to reduce the number of I O lines on the FPGA The design is implemented on Spartan III device and is compared with lifting scheme logic The proposed design operates at frequency of 280 MHz and consumes power less than 42 mW The presynthesis and post-synthesis results are verified and suitable test vectors are used in verifying the functionality of the design The design is suitable for real time data processin

    Evaluation of High Speed Hardware Multipliers - Fixed Point and Floating point

    Get PDF
    There is a huge demand in high speed arithmetic blocks, due to increased performance of processing units. For higher frequency clocks of the system, the arithmetic blocks must keep pace with greater requirement of more computational power. Area and speed are usually conflicting constraints so that improving speed results mostly in larger areas. In our research we will try to determine the best solution to this problem by comparing the results of different multipliers. Different sized of two algorithms for high speed hardware multipliers were studied and implemented ie. Parallel multiplier, Bit serial multiplier. The workings of these two multipliers were compared by implementing each of them separately in VHDL. A number of high speed adder designs are developed and algorithm and design of these adders are discussed. The result of this research will help us to choose the better option between serial and parallel multipliers for both fixed point and floating point multipliers to fabricate in different systems. As multipliers form one of the most important components of many systems, analysing different multipliers will help us to frame a better system with area and better speed.DOI:http://dx.doi.org/10.11591/ijece.v3i6.418

    HIDE+: a logic based hardware development environment

    Get PDF

    RADIX-10 PARALLEL DECIMAL MULTIPLIER

    Get PDF
    This paper introduces novel architecture for Radix-10 decimal multiplier. The new generation of highperformance decimal floating-point units (DFUs) is demanding efficient implementations of parallel decimal multiplier. The parallel generation of partial products is performed using signed-digit radix-10 recoding of the multiplier and a simplified set of multiplicand multiples. The reduction of partial products is implemented in a tree structure based on a new algorithm decimal multioperand carry-save addition that uses a unconventional decimal-coded number systems. We further detail these techniques and it significantly improves the area and latency of the previous design, which include: optimized digit recoders, decimal carry-save adders (CSA’s) combining different decimal-coded operands, and carry free adders implemented by special designed bit counters
    • …
    corecore