86 research outputs found

    Doctor of Philosophy

    Elasticity is a design paradigm in which circuits can tolerate arbitrary latency/delay variations in their computation units as well as communication channels. Creating elastic (both synchronous and asynchronous) designs from clocked designs has potential benefits of increased modularity and robustness to variations. Several transformations have been suggested in the literature, and each of these requires a handshake control network (examples include synchronous elasticization and desynchronization). Elastic control network area and power overheads may become prohibitive. This dissertation investigates different optimization avenues to reduce these overheads without sacrificing the control network performance. First, an algorithm and a tool, CNG, are introduced that generate a control network with a minimal total number of join and fork control steering units. Synchronous Elastic FLow (SELF) is a handshake protocol used over synchronous elastic designs. Compared to its standard eager implementation (which uses eager forks, EForks), lazy SELF can consume less power and area. However, it typically suffers from combinational cycles and can have inferior performance in some systems. Hence, lazy SELF has rarely been studied in the literature. This work formally and exhaustively investigates the specifications, different implementations, and verification of the lazy SELF protocol. Furthermore, several new and existing lazy designs are mapped to hybrid eager/lazy implementations that retain the performance advantage of the eager design but have the power and area advantages of lazy implementations, and are free of combinational cycles. This work also introduces a novel ultra simple fork (USFork) design. The USFork has two advantages over lazy forks: it is composed of simpler logic (just wires) and does not form combinational cycles. The conditions under which an EFork can be replaced by a USFork without any performance loss are formally derived.
The last optimization avenue discussed in this dissertation is Elastic Buffer Controller (EBC) merging. In a typical synchronous elastic control network, some EBCs may activate their corresponding latches at similar schedules. This work provides a framework for finding and merging such controllers in any control network, including open networks (i.e., when the environment abstract is not available or is required to be flexible) as well as networks incorporating variable latency units. Replacing EForks with USForks under some equivalence conditions, as well as EBC merging, has been fully automated in a tool, HGEN. This work will help achieve elasticity at a reduced cost. It will broaden the class of circuits that can be elasticized with acceptable overhead (circuits that designers would otherwise find too expensive to elasticize). In a MiniMIPS processor case study, compared to a basic control network implementation, the optimization techniques of this dissertation cumulatively achieve reductions in the control network area, dynamic power, and leakage power of 73.2%, 68.6%, and 69.1%, respectively.

    On the Semantics of Communicating Hardware Processes and their Translation into LOTOS for the Verification of Asynchronous Circuits with CADP

    Hardware process calculi, such as CHP (Communicating Hardware Processes), Balsa, or Haste (formerly Tangram), are a natural approach for the description of asynchronous hardware architectures. These calculi are extensions of standard process calculi with particular synchronisation features implemented using handshake protocols. In this article, we first give a structural operational semantics for value-passing CHP. Compared to the existing semantics of CHP defined by translation into Petri nets, our semantics is general enough to handle value-passing CHP with communication channels open to the environment, and is also independent of any particular (2- or 4-phase) handshake protocol used for circuit implementation. We then describe the translation of CHP into the process calculus LOTOS (ISO standard 8807), in order to allow asynchronous hardware architectures expressed in CHP to be verified using the CADP verification toolbox for LOTOS. A translator from CHP to LOTOS has been implemented and successfully used for the compositional verification of two industrial case studies, namely an asynchronous implementation of the DES (Data Encryption Standard) and an asynchronous interconnect of a NoC (Network on Chip).

    Test Quality Analysis and Improvement for an Embedded Asynchronous FIFO

    Embedded First-In First-Out (FIFO) memories are increasingly used in many IC designs. We have created a new full-custom embedded FIFO module with asynchronous read and write clocks, which is at least a factor of two smaller and also faster than SRAM-based and standard-cell-based counterparts. The detection qualities of the FIFO test for both hard and weak resistive shorts and opens have been analyzed by an IFA-like method based on analog simulation. The defect coverage of the initial FIFO test for shorts in the bit-cell matrix has been improved by inclusion of an additional data background and low-voltage testing; for low-resistance shorts, 100% defect coverage is obtained. The defect coverage for opens has been improved by a new test procedure which includes waiting periods.

    Low power predictable memory and processing architectures

    Strong demand for power-optimized devices shows promising economic potential and draws considerable attention in industry and research. As CMOS processes continue to shrink, not only dynamic power but also static power has emerged as a major concern in power reduction. Besides power optimization, average-case power estimation is significant for power budget allocation but challenging in terms of time and effort. In this thesis, we introduce a methodology to support modular quantitative analysis in order to estimate the average power of circuits, on the basis of two concepts named Random Bag Preserving and Linear Compositionality. It can shorten simulation time while sustaining high accuracy, which increases the feasibility of power estimation for large systems. For power saving, firstly, we take advantage of the low-power characteristics of adiabatic logic and asynchronous logic to achieve ultra-low dynamic and static power. We propose two memory cells, which can run in adiabatic and non-adiabatic mode. About 90% of dynamic power can be saved in adiabatic mode compared to other up-to-date designs, and about 90% of leakage power is saved. Secondly, a novel logic, named Asynchronous Charge Sharing Logic (ACSL), is introduced. The realization of completion detection is simplified considerably. Beyond the power reduction, ACSL brings another promising feature for average power estimation called data independency, which makes power estimation effortless and is meaningful for modular quantitative average-case analysis. Finally, a new asynchronous Arithmetic Logic Unit (ALU) is presented, with a ripple carry adder implemented using the logically reversible/bidirectional characteristic, exhibiting ultra-low power dissipation with an operating point in the sub-threshold region. The proposed adder is able to operate multi-functionally.

    DeSyRe: on-Demand System Reliability

    The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect-free and fault-free system would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints.

    Microspacecraft and Earth observation: Electrical field (ELF) measurement project

    The Utah State University space system design project for 1989 to 1990 focuses on the design of a global electrical field sensing system to be deployed in a constellation of microspacecraft. The design includes the selection of the sensor and the design of the spacecraft, the sensor support subsystems, the launch vehicle interface structure, on-board data storage and communications subsystems, and associated ground receiving stations. Optimization of satellite orbits and spacecraft attitude is critical to the overall mapping of the electrical field and, thus, is also included in the project. The spacecraft design incorporates a deployable sensor array (5 m booms) into a spinning oblate platform. Data is taken every 0.1 seconds by the electrical field sensors and stored on board. An omni-directional antenna communicates with a ground station twice per day to downlink the stored data. Wrap-around solar cells cover the exterior of the spacecraft to generate power. Nine Pegasus launches may be used to deploy fifty such satellites to orbits with inclinations greater than 45 deg. Piggyback deployment from other launch vehicles such as the DELTA 2 is also examined.

    Minimizing and exploiting leakage in VLSI

    Power consumption of VLSI (Very Large Scale Integrated) circuits has been growing at an alarmingly rapid rate. This increase in power consumption, coupled with the increasing demand for portable/hand-held electronics, has made power consumption a dominant concern in the design of VLSI circuits today. Traditionally dynamic (switching) power has dominated the total power consumption of VLSI circuits. However, due to process scaling trends, leakage power has now become a major component of the total power consumption in VLSI circuits. This dissertation explores techniques to reduce leakage, as well as techniques to exploit leakage currents through the use of sub-threshold circuits. This dissertation consists of two studies. In the first study, techniques to reduce leakage are presented. These include a low leakage ASIC design methodology that uses high VT sleep transistors selectively, a methodology that combines input vector control and circuit modification, and a scheme to find the optimum reverse body bias voltage to minimize leakage. As the minimum feature size of VLSI fabrication processes continues to shrink with each successive process generation (along with the value of supply voltage and therefore the threshold voltage of the devices), leakage currents increase exponentially. Leakage currents are hence seen as a necessary evil in traditional VLSI design methodologies. We present an approach to turn this problem into an opportunity. In the second study in this dissertation, we attempt to exploit leakage currents to perform computation. We use sub-threshold digital circuits and come up with ways to get around some of the pitfalls associated with sub-threshold circuit design. 
These include a technique that uses body biasing adaptively to compensate for Process, Voltage and Temperature (PVT) variations, a design approach that uses asynchronous micro-pipelined Network of Programmable Logic Arrays (NPLAs) to help improve the throughput of sub-threshold designs, and a method to find the optimum supply voltage that minimizes energy consumption in a circuit.

    Design and Implementation of a High-Speed Readout and Control System for a Digital Tracking Calorimeter for proton CT

    Particle therapy, a non-invasive technique for treating cancer using protons and light ions, has become more and more common. For example, a particle treatment facility is currently being built in Bergen, Norway. Proton beams deposit a large fraction of their energy at the end of their paths, i.e., the delivered dose can be focused on the tumor, sparing nearby tissue with a low entry dose and almost no exit dose. A novel imaging modality using protons promises to overcome some limitations of particle therapy and to allow its potential to be fully exploited. Being able to position the so-called Bragg peak accurately inside the tumor is a major advantage of charged particles, but incomplete knowledge about a crucial tissue property, the stopping power, limits its precision. A proton CT scanner provides direct information about the stopping power. It has the potential to reduce range uncertainties significantly, but no proton CT system has yet been shown to be suitable for clinical use. The aim of the Bergen proton CT project is to design and build a proton CT scanner that overcomes most of the critical limitations of the currently existing prototypes and that can be operated in clinical settings. A proton CT prototype, the Digital Tracking Calorimeter, is being developed as a range telescope consisting of high-granularity pixel sensors. The prototype is a combined position-sensitive detector and residual energy-range detector which will allow a substantial rate of protons, speeding up the imaging process. The detector is single-sided, meaning that it employs information from the beam delivery system to omit tracker layers in front of the phantom. The detector operates by tracking the charged particles traversing the detector material behind the phantom. The proton CT prototype will be used to determine the feasibility of using proton CT to increase the dose planning accuracy for particle treatment of cancer cells.
The detector is designed as a telescope of 43 layers of sensors, where the two front layers act as the position-sensitive detector providing an accurate vector of each incoming particle. The remaining layers are used to measure the residual energy of each particle by observing in which layer it stops and by using the cluster size in each layer. The Digital Tracking Calorimeter employs ALPIDE sensors, monolithic active pixel sensors that each use a 1.2 Gb/s data link. Each layer measures 18 × 27 cm, roughly corresponding to the width and height of an adult's head, and consists of 108 ALPIDE sensors. The sensors are connected to intermediary transition boards that route the data and control links to dedicated readout electronics and supply the sensors with power. The readout unit is the main component of both the data acquisition and the detector control system. The power control unit controls the power supply and monitors the current usage of the sensors. Both of these devices are mainly implemented in FPGAs. The main purpose of this work has been to explore and implement possible design solutions for the proton CT electronics, including the front-end, as well as the readout electronics architecture. The resulting architecture is modular, allowing the system to be scaled up further in the future. A major obstacle to the design is the large number of sensors and the corresponding high-speed data links. Thus, a large emphasis has been placed on the signal integrity of the front-end electronics and on a dynamic phase alignment sampling method in the readout electronics firmware. The readout FPGA employs regular I/O pins for the high-speed data interface, instead of high-speed transceiver pins, which significantly reduces the size of the data acquisition system. A consistent design approach with detailed and systematic verification of the FPGA firmware modules, along with a continuous integration build system, has resulted in a stable and highly adaptive system.
Significant effort has been put into the testing of the various system components. This also includes the design and implementation of a set of production test tools for use during the manufacturing of the detector front-end.