553 research outputs found

    An Enhanced Hardware Description Language Implementation for Improved Design-Space Exploration in High-Energy Physics Hardware Design

    Get PDF
    Detectors in High-Energy Physics (HEP) have increased tremendously in accuracy, speed and integration. Consequently HEP experiments are confronted with an immense amount of data to be read out, processed and stored. Originally low-level processing has been accomplished in hardware, while more elaborate algorithms have been executed on large computing farms. Field-Programmable Gate Arrays (FPGAs) meet HEP's need for ever higher real-time processing performance by providing programmable yet fast digital logic resources. With the fast move from HEP Digital Signal Processing (DSPing) applications into the domain of FPGAs, related design tools are crucial to realise the potential performance gains. This work reviews Hardware Description Languages (HDLs) in respect to the special needs present in the HEP digital hardware design process. It is especially concerned with the question, how features outside the scope of mainstream digital hardware design can be implemented efficiently into HDLs. It will argue that functional languages are especially suitable for implementation of domain-specific languages, including HDLs. Casestudies examining the implementation complexity of HEP-specific language extensions to the functional HDCaml HDL will prove the viability of the suggested approach

    A novel deep submicron bulk planar sizing strategy for low energy subthreshold standard cell libraries

    Get PDF
    Engineering andPhysical Science ResearchCouncil (EPSRC) and Arm Ltd for providing funding in the form of grants and studentshipsThis work investigates bulk planar deep submicron semiconductor physics in an attempt to improve standard cell libraries aimed at operation in the subthreshold regime and in Ultra Wide Dynamic Voltage Scaling schemes. The current state of research in the field is examined, with particular emphasis on how subthreshold physical effects degrade robustness, variability and performance. How prevalent these physical effects are in a commercial 65nm library is then investigated by extensive modeling of a BSIM4.5 compact model. Three distinct sizing strategies emerge, cells of each strategy are laid out and post-layout parasitically extracted models simulated to determine the advantages/disadvantages of each. Full custom ring oscillators are designed and manufactured. Measured results reveal a close correlation with the simulated results, with frequency improvements of up to 2.75X/2.43X obs erved for RVT/LVT devices respectively. The experiment provides the first silicon evidence of the improvement capability of the Inverse Narrow Width Effect over a wide supply voltage range, as well as a mechanism of additional temperature stability in the subthreshold regime. A novel sizing strategy is proposed and pursued to determine whether it is able to produce a superior complex circuit design using a commercial digital synthesis flow. Two 128 bit AES cores are synthesized from the novel sizing strategy and compared against a third AES core synthesized from a state-of-the-art subthreshold standard cell library used by ARM. Results show improvements in energy-per-cycle of up to 27.3% and frequency improvements of up to 10.25X. The novel subthreshold sizing strategy proves superior over a temperature range of 0 °C to 85 °C with a nominal (20 °C) improvement in energy-per-cycle of 24% and frequency improvement of 8.65X. A comparison to prior art is then performed. Valid cases are presented where the proposed sizing strategy would be a candidate to produce superior subthreshold circuits

    Transparent In-Circuit Assertions for FPGAs

    No full text
    Commonly used in software design, assertions are statements placed into a design to ensure that its behaviour matches that expected by a designer. Although assertions apply equally to hardware design, they are typically supported only for logic simulation, and discarded prior to physical implementation. We propose a new HDL-agnostic language for describing latency-insensitive assertions and novel methods to add such assertions transparently to an already placed-and-routed circuit without affecting the existing design. We also describe how this language and associated methods can be used to implement semi-transparent exception handling. The key to our work is that by treating hardware assertions and exceptions as being oblivious or less sensitive to latency, assertion logic need only use spare FPGA resources. We use network-flow techniques to route necessary signals to assertions via spare flip-flops, eliminating any performance degradation, even on large designs (92% of slices in one test). Experimental evaluation shows zero impact on critical-path delay, even on large benchmarks operating above 200MHz, at the cost of a small power penalty

    [Research Pertaining to Physics, Space Sciences, Computer Systems, Information Processing, and Control Systems]

    Get PDF
    Research project reports pertaining to physics, space sciences, computer systems, information processing, and control system

    Ultra low-power fault-tolerant SRAM design in 90nm CMOS technology

    Get PDF
    With the increment of mobile, biomedical and space applications, digital systems with low-power consumption are required. As a main part in digital systems, low-power memories are especially desired. Reducing the power supply voltages to sub-threshold region is one of the effective approaches for ultra low-power applications. However, the reduced Static Noise Margin (SNM) of Static Random Access Memory (SRAM) imposes great challenges to the subthreshold SRAM design. The conventional 6-transistor SRAM cell does not function properly at sub-threshold supply voltage range because it has no enough noise margin for reliable operation. In order to achieve ultra low-power at sub-threshold operation, previous research work has demonstrated that the read and write decoupled scheme is a good solution to the reduced SNM problem. A Dual Interlocked Storage Cell (DICE) based SRAM cell was proposed to eliminate the drawback of conventional DICE cell during read operation. This cell can mitigate the singleevent effects, improve the stability and also maintain the low-power characteristic of subthreshold SRAM, In order to make the proposed SRAM cell work under different power supply voltages from 0.3 V to 0.6 V, an improved replica sense scheme was applied to produce a reference control signal, with which the optimal read time could be achieved. In this thesis, a 2K ~8 bits SRAM test chip was designed, simulated and fabricated in 90nm CMOS technology provided by ST Microelectronics. Simulation results suggest that the operating frequency at VDD = 0.3 V is up to 4.7 MHz with power dissipation 6.0 ƒÊW, while it is 45.5 MHz at VDD = 0.6 V dissipating 140 ƒÊW. However, the area occupied by a single cell is larger than that by conventional SRAM due to additional transistors used. The main contribution of this thesis project is that we proposed a new design that could simultaneously solve the ultra low-power and radiation-tolerance problem in large capacity memory design

    Superconducting Logic Circuits Operating With Reciprocal Magnetic Flux Quanta

    Get PDF
    Complimentary Medal-Oxide Semiconductor (CMOS) technology is expected to soon reach its fundamental limits of operation. The fundamental speed limit of about 4 GHz has already effectively been sidestepped by parallelization. This increases raw processing power but does nothing to improve power dissipation or latency. One approach for increasing computing performance involves using superconducting digital logic circuits. In this thesis I describe a new kind of superconducting logic, invented by Quentin Herr at Northrop Grumman, which uses reciprocal pairs of quantized single magnetic flux pulses to encode classical bits. In Reciprocal Quantum Logic (RQL) the data is encoded in integer units of the magnetic flux quantum. RQL gates operate without the bias resistors of previous superconducting logic families and dissipate several orders of magnitude less power. I demonstrate the basic operation of key RQL gates (AndOr, AnotB, Set/Reset) and show their self-resetting properties. Together, these gates form a universal logic set and provide memory capabilities. Experiments measuring the bit error rate of the AndOr gate extrapolated a minimum BER of 10-480 and a BER of 10-44 with 30% margins on flux biasing. I describe an analytic timing model for RQL gates which demonstrates the self-correcting timing features. From this model I derive equations for the timing behavior and operating limits. Using this timing model I ran simulations to determine correction factions for more accurate predictions at higher frequencies. Using these results, I also develop Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) models to describe the combinational logic of RQL gates. To test the timing predictions of the timing model, I performed three experiments on Nb/AlOx/Nb circuits at 4.2 K. The first measured the time of output. The second measured the operating margins of the circuit. The third measured the maximum frequency of operation for RQL circuits. Together, these three experiments showed quantitative agreement with the model for the timing output, qualitative agreement with the limits of operation, and a projected speed limit of 50 GHz for the Hypres 4.5 kA/cm2 process. To power RQL circuits I describe a new design for power splitters and combiners which minimize standing waves. I describe a new kind of Wilkinson power splitter which required numerical optimization but proved to be adequate. I experimentally tested two new designs of the power splitter. Both showed less than 10% variation in standing waves between power splitter and combiner, making it adequate for RQL circuits. I also compared these results with the S-parameters of the power network, which also indicated that the design was adequate for RQL circuits. Finally, I tested an 8-bit Kogge-Stone architecture carry-look ahead adder designed using VHDL models. The adder contained 815 Josephson junctions and was fully functional at 6.21 GHz with a latency of 1.25 clock cycles. The adder produced the correct logical output, had a measured optimal operating point within 8% of the optimal simulated operating point, and measured power margins of 1 dB. It operated best at the designed clock amplitude of 0.88Ic and dissipated 0.570 mW of power
    • …
    corecore