3,226 research outputs found
Efficient hardware implementations of high throughput SHA-3 candidates keccak, luffa and blue midnight wish for single- and multi-message hashing
In November 2007 NIST announced that it would organize the SHA-3 competition to select a new cryptographic hash function family by 2012. In the selection process, hardware performances of the candidates will play an important role. Our analysis of previously proposed hardware implementations shows that three SHA-3 candidate algorithms can provide superior performance in hardware: Keccak, Luffa and Blue Midnight Wish (BMW). In this paper, we provide efficient and fast hardware implementations of these three algorithms. Considering both single- and multi-message hashing applications with an emphasis on both speed and efficiency, our work presents more comprehensive analysis of their hardware performances by providing different performance figures for different target devices. To our best knowledge, this is the first work that provides a comparative analysis of SHA-3 candidates in multi-message applications. We discover that BMW algorithm can provide much higher throughput than previously reported if used in multi-message hashing. We also show that better utilization of resources can increase speed via different configurations. We implement our designs using Verilog HDL, and map to both ASIC and FPGA devices (Spartan3, Virtex2, and Virtex 4) to give a better comparison with those in the literature. We report total area, maximum frequency, maximum throughput and throughput/area of the designs for all target devices. Given that the selection process for SHA3 is still open; our results will be instrumental to evaluate the hardware performance of the candidates
Transistor-Level Synthesis of Pipeline Analog-to-Digital Converters Using a Design-Space Reduction Algorithm
A novel transistor-level synthesis procedure for pipeline ADCs is presented. This procedure is able to directly map high-level converter specifications onto transistor sizes and biasing conditions. It is based on the combination of behavioral models for performance evaluation, optimization routines to minimize the power and area consumption of the circuit solution, and an algorithm to efficiently constraint the converter design space. This algorithm precludes the cost of lengthy bottom-up verifications and speeds up the synthesis task. The approach is herein demonstrated via the design of a 0.13 ÎŒm CMOS 10 bits@60 MS/s pipeline ADC with energy consumption per conversion of only 0.54 pJ@1 MHz, making it one of the most energy-efficient 10-bit video-rate pipeline ADCs reported to date. The computational cost of this design is of only 25 min of CPU time, and includes the evaluation of 13 different pipeline architectures potentially feasible for the targeted specifications. The optimum design derived from the synthesis procedure has been fine tuned to support PVT variations, laid out together with other auxiliary blocks, and fabricated. The experimental results show a power consumption of 23 [email protected] V and an effective resolution of 9.47-bit@1 MHz. Bearing in mind that no specific power reduction strategy has been applied; the mentioned results confirm the reliability of the proposed approach.Ministerio de Ciencia e InnovaciĂłn TEC2009-08447Junta de AndalucĂa TIC-0281
Recommended from our members
Chippe : a system for constraint driven behavioral synthesis
This report describes the Chippe system, gives some background previous work and describes several sample design runs of the system. Also presented are the sources of the design tradeoffs used by Chippe, and overview of the internal design model, and experiences using the system
Recommended from our members
CHASSIS : a combined hardware selection and scheduling technique for performance driven synthesis
This report describes a new technique that combines the Hardware Scheduling and Component Selection phases for High Level Synthesis. Our technique simultaneously selects components from a given library while it schedules the operations into different control steps. The algorĂthm improves previous work in scheduling because component costs and performance are considered during the scheduling process, enlarging the design search space and resulting in better optimized desĂgns
Adaptive Latency Insensitive Protocols andElastic Circuits with Early Evaluation: A Comparative Analysis
AbstractLatency Insensitive Protocols (LIP) and Elastic Circuits (EC) solve the same problem of rendering a design tolerant to additional latencies caused by wires or computational elements. They are performance-limited by a firing semantics that enforces coherency through a lazy evaluation rule: Computation is enabled if all inputs to a block are simultaneously available. Adaptive LIP's (ALIP) and EC with early evaluation (ECEE) increase the performance by relaxing the evaluation rule: Computation is enabled as soon as the subset of inputs needed at a given time is available. Their difference in terms of implementation and behavior in selected cases justifies the need for the comparative analysis reported in this paper. Results have been obtained through simple examples, a single representative case-study already used in the context of both LIP's and EC and through extensive simulations over a suite of benchmarks
An empirical evaluation of High-Level Synthesis languages and tools for database acceleration
High Level Synthesis (HLS) languages and tools are emerging as the most promising technique to make FPGAs more accessible to software developers. Nevertheless, picking the most suitable HLS for a certain class of algorithms depends on requirements such as area and throughput, as well as on programmer experience. In this paper, we explore the different trade-offs present when using a representative set of HLS tools in the context of Database Management Systems (DBMS) acceleration. More specifically, we conduct an empirical analysis of four representative frameworks (Bluespec SystemVerilog, Altera OpenCL, LegUp and Chisel) that we utilize to accelerate commonly-used database algorithms such as sorting, the median operator, and hash joins. Through our implementation experience and empirical results for database acceleration, we conclude that the selection of the most suitable HLS depends on a set of orthogonal characteristics, which we highlight for each HLS framework.Peer ReviewedPostprint (authorâs final draft
On-Line Instruction-checking in Pipelined Microprocessors
Microprocessors performances have increased by more than five orders of magnitude in the last three decades. As technology scales down, these components become inherently unreliable posing major design and test challenges. This paper proposes an instruction-checking architecture to detect erroneous instruction executions caused by both permanent and transient errors in the internal logic of a microprocessor. Monitoring the correct activation sequence of a set of predefined microprocessor control/status signals allow distinguishing between correctly and not correctly executed instruction
- âŠ