71 research outputs found

    Techniques of Energy-Efficient VLSI Chip Design for High-Performance Computing

    Get PDF
    How to implement quality computing with the limited power budget is the key factor to move very large scale integration (VLSI) chip design forward. This work introduces various techniques of low power VLSI design used for state of art computing. From the viewpoint of power supply, conventional in-chip voltage regulators based on analog blocks bring the large overhead of both power and area to computational chips. Motivated by this, a digital based switchable pin method to dynamically regulate power at low circuit cost has been proposed to make computing to be executed with a stable voltage supply. For one of the widely used and time consuming arithmetic units, multiplier, its operation in logarithmic domain shows an advantageous performance compared to that in binary domain considering computation latency, power and area. However, the introduced conversion error reduces the reliability of the following computation (e.g. multiplication and division.). In this work, a fast calibration method suppressing the conversion error and its VLSI implementation are proposed. The proposed logarithmic converter can be supplied by dc power to achieve fast conversion and clocked power to reduce the power dissipated during conversion. Going out of traditional computation methods and widely used static logic, neuron-like cell is also studied in this work. Using multiple input floating gate (MIFG) metal-oxide semiconductor field-effect transistor (MOSFET) based logic, a 32-bit, 16-operation arithmetic logic unit (ALU) with zipped decoding and a feedback loop is designed. The proposed ALU can reduce the switching power and has a strong driven-in capability due to coupling capacitors compared to static logic based ALU. Besides, recent neural computations bring serious challenges to digital VLSI implementation due to overload matrix multiplications and non-linear functions. An analog VLSI design which is compatible to external digital environment is proposed for the network of long short-term memory (LSTM). The entire analog based network computes much faster and has higher energy efficiency than the digital one

    Energy efficient implementation of multi-phase quasi-adiabatic Cyclic Redundancy Check in near field communication

    Get PDF
    Ultra-low power operation in power-limited portable devices (e.g. cell phone and smartcard) is paramount. Existing conventional CMOS consume high energy. The adiabatic logic technique has the potential of rendering energy efficient operation. In this paper, a multi-phase quasi-adiabatic implementation of 16-bit Cyclic Redundancy Check (CRC) is proposed, compliant with the ISO/IEC-14443 standard for contactless smart cards. In terms of a number of CRC bits, the design is scalable and all generator polynomials and initial load values can be accommodated. The CRC design is used as a vehicle to evaluate a range of adiabatic logic styles and power-clock strategies. The effects of voltage scaling and variations in Process-Voltage-Temperature (PVT) are also investigated providing an insight into the robustness of adiabatic logic styles. PFAL and IECRL designs using a 4-phase power-clock are shown to be both the most energy-efficient and robust designs

    Digital logic circuit design using adiabatic approach

    Get PDF
    A major challenge for the circuit designers nowadays is to meet the demand for low power, especially those used in portable and wearable devices which have limited energy power supply. The reasons of designing low power consumption circuit are to reduce energy usage and minimize dissipation of heat. Adiabatic technique is an attractive approach to obtain power optimization where some of the charge in capacitance can be recycled instead of being dissipated as heat. In this thesis, a methodology for designing sequential adiabatic circuits employing a single-phase power clock was investigated. Initially, methods to simulate dynamic power were analysed by identifying a better and reliable method to simulate adiabatic dynamic power. In addition, a method to validate the output voltage swing was presented. The relationship between voltage swing and power dissipation was analysed. Then, several adiabatic sequential D flip flops (DFF) designs which make use of combinational adiabatic circuit design based on quasi-adiabatic were proposed and suitable types of alternating current power supply which influence dynamic power were analysed and selected. The functionality and performance of the proposed circuits were compared against other adiabatic and traditional Complimentary Metal-Oxide Semiconductor (CMOS) circuits and verified to function up to 1 GHz operating region. Besides the circuits, the layout of the proposed sequential adiabatic design was also produced. All simulations were carried out using 0.25 ^m CMOS technology parameters using Tanner Electronic Design Aided and HSPICE tools. The findings showed that the proposed combinational circuit had less transistor count, lower power dissipation with lower voltage swing as compared to reference adiabatic circuits. Furthermore, the proposed sequential DFF circuit showed 25% less power dissipation compared to traditional CMOS

    IDPAL – A Partially-Adiabatic Energy-Efficient Logic Family: Theory and Applications to Secure Computing

    Get PDF
    Low-power circuits and issues associated with them have gained a significant amount of attention in recent years due to the boom in portable electronic devices. Historically, low-power operation relied heavily on technology scaling and reduced operating voltage, however this trend has been slowing down recently due to the increased power density on chips. This dissertation introduces a new very-low power partially-adiabatic logic family called Input-Decoupled Partially-Adiabatic Logic (IDPAL) with applications in low-power circuits. Experimental results show that IDPAL reduces energy usage by 79% compared to equivalent CMOS implementations and by 25% when compared to the best adiabatic implementation. Experiments ranging from a simple buffer/inverter up to a 32-bit multiplier are explored and result in consistent energy savings, showing that IDPAL could be a viable candidate for a low-power circuit implementation. This work also shows an application of IDPAL to secure low-power circuits against power analysis attacks. It is often assumed that encryption algorithms are perfectly secure against attacks, however, most times attacks using side channels on the hardware implementation of an encryption operation are not investigated. Power analysis attacks are a subset of side channel attacks and can be implemented by measuring the power used by a circuit during an encryption operation in order to obtain secret information from the circuit under attack. Most of the previously proposed solutions for power analysis attacks use a large amount of power and are unsuitable for a low-power application. The almost-equal energy consumption for any given input in an IDPAL circuit suggests that this logic family is a good candidate for securing low-power circuits again power analysis attacks. Experimental results ranging from small circuits to large multipliers are performed and the power-analysis attack resistance of IDPAL is investigated. Results show that IDPAL circuits are not only low-power but also the most secure against power analysis attacks when compared to other adiabatic low-power circuits. Finally, a hybrid adiabatic-CMOS microprocessor design is presented. The proposed microprocessor uses IDPAL for the implementation of circuits with high switching activity (e.g. ALU) and CMOS logic for other circuits (e.g. memory, controller). An adiabatic-CMOS interface for transforming adiabatic signals to square-wave signals is presented and issues associated with a hybrid implementation and their solutions are also discussed

    Adiabatic Approach for Low-Power Passive Near Field Communication Systems

    Get PDF
    This thesis tackles the need of ultra-low power electronics in the power limited passive Near Field Communication (NFC) systems. One of the techniques that has proven the potential of delivering low power operation is the Adiabatic Logic Technique. However, the low power benefits of the adiabatic circuits come with the challenges due to the absence of single opinion on the most energy efficient adiabatic logic family which constitute appropriate trade-offs between computation time, area and complexity based on the circuit and the power-clocking schemes. Therefore, five energy efficient adiabatic logic families working in single-phase, 2-phase and 4-phase power-clocking schemes were chosen. Since flip-flops are the basic building blocks of any sequential circuit and the existing flip-flops are MUX-based (having more transistors) design, therefore a novel single-phase, 2-phase and 4-phase reset based flip-flops were proposed. The performance of the multi-phase adiabatic families was evaluated and compared based on the design examples such as 2-bit ring counter, 3-bit Up-Down counter and 16-bit Cyclic Redundancy Check (CRC) circuit (benchmark circuit) based on ISO 14443-3A standard. Several trade-offs, design rules, and an appropriate range for the supply voltage scaling for multi-phase adiabatic logic are proposed. Furthermore, based on the NFC standard (ISO 14443-3A), data is frequently encoded using Manchester coding technique before transmitting it to the reader. Therefore, if Manchester encoding can be implemented using adiabatic logic technique, energy benefits are expected. However, adiabatic implementation of Manchester encoding presents a challenge. Therefore, a novel method for implementing Manchester encoding using adiabatic logic is proposed overcoming the challenges arising due to the AC power-clock. Other challenges that come with the dynamic nature of the adiabatic gates and the complexity of the 4-phase power-clocking scheme is in synchronizing the power-clock v phases and the time spent in designing, validation and debugging of errors. This requires a specific modelling approach to describe the adiabatic logic behaviour at the higher level of abstraction. However, describing adiabatic logic behaviour using Hardware Description Languages (HDLs) is a challenging problem due to the requirement of modelling the AC power-clock and the dual-rail inputs and outputs. Therefore, a VHDL-based modelling approach for the 4-phase adiabatic logic technique is developed for functional simulation, precise timing analysis and as an improvement over the previously described approaches

    組合せ最適化問題のための測定フィードバック型コヒーレント・イジングマシンの実現と評価

    Get PDF
    学位の種別: 課程博士審査委員会委員 : (主査)東京大学教授 合原 一幸, 東京大学教授 岩田 覚, 東京大学准教授 平田 祥人, 東京大学准教授 大西 立顕, 東京大学准教授 鈴木 大慈University of Tokyo(東京大学

    Architectural Solutions for NanoMagnet Logic

    Get PDF
    The successful era of CMOS technology is coming to an end. The limit on minimum fabrication dimensions of transistors and the increasing leakage power hinder the technological scaling that has characterized the last decades. In several different ways, this problem has been addressed changing the architectures implemented in CMOS, adopting parallel processors and thus increasing the throughput at the same operating frequency. However, architectural alternatives cannot be the definitive answer to a continuous increase in performance dictated by Moore’s law. This problem must be addressed from a technological point of view. Several alternative technologies that could substitute CMOS in next years are currently under study. Among them, magnetic technologies such as NanoMagnet Logic (NML) are interesting because they do not dissipate any leakage power. More- over, magnets have memory capability, so it is possible to merge logic and memory in the same device. However, magnetic circuits, and NML in this specific research, have also some important drawbacks that need to be addressed: first, the circuit clock frequency is limited to 100 MHz, to avoid errors in data propagation; second, there is a connection between circuit layout and timing, and in particular, longer wires will have longer latency. These drawbacks are intrinsic to the technology and for this reason they cannot be avoided. The only chance is to limit their impact from an architectural point of view. The first step followed in the research path of this thesis is indeed the choice and optimization of architectures able to deal with the problems of NML. Systolic Ar- rays are identified as an ideal solution for this technology, because they are regular structures with local interconnections that limit the long latency of wires; more- over they are composed of several Processing Elements that work in parallel, thus exploit parallelization to increase throughput (limiting the impact of the low clock frequency). Through the analysis of Systolic Arrays for NML, several possible im- provements have been identified and addressed: 1) it has been defined a rigorous way to increase throughput with interleaving, providing equations that allow to esti- mate the number of operations to be interleaved and the rules to provide inputs; 2) a latency insensitive circuit has been designed, that exploits a data communication protocol between processing elements to avoid data synchronization problems. This feature has been exploited to design a latency insensitive Systolic Array that is able to execute the Floyd-Steinberg dithering algorithm. All the improvements presented in this framework apply to Systolic Arrays implemented in any technology. So, they can also be exploited to increase performance of today’s CMOS parallel circuits. This research path is presented in Chapter 3. While Systolic Arrays are an interesting solution for NML, their usage could be quite limited because they are normally application-specific. The second re- search path addresses this problem. A Reconfigurable Systolic Array is presented, that can be programmed to execute several algorithms. This architecture has been tested implementing many algorithms, including FIR and IIR filters, Discrete Cosine Transform and Matrix Multiplication. This research path is presented in Chapter 4. In common Von Neumann architectures, the logic part of the circuit and the memory one are separated. Today bus communication between logic and memory represents the bottleneck of the system. This problem is addressed presenting Logic- In-Memory (LIM), an architecture where memory elements are merged in logic ones. This research path aims at defining a real LIM architectures. This has been done in two steps. The first step is represented by an architecture composed of three layers: memory, routing and logic. In the second step instead the routing plane is no more present, and its features are inherited by the memory plane. In this solution, a pyramidal memory model is used, where memories near logic elements contain the most probably used data, and other memory layers contain the remaining data and instruction set. This circuit has been tested with odd-even sort algorithms and it has been benchmarked against GPUs and ASIC. This research path is presented in Chapter 5. MagnetoElastic NML (ME-NML) is a technological improvement of the NML principle, proposed by researchers of Politecnico di Torino, where the clock system is based on the induced stretch of a piezoelectric substrate when a voltage is ap- plied to its boundaries. The main advantage of this solution is that it consumes much less power than the classic clock implementation. This technology has not yet been investigated from an architectural point of view and considering complex circuits. In this research field, a standard methodology for the design of ME-NML circuits has been proposed. It is based on a Standard Cell Library and an enhanced VHDL model. The effectiveness of this methodology has been proved designing a Galois Field Multiplier. Moreover the serial-parallel trade-off in ME-NML has been investigated, designing three different solutions for the Multiply and Accumulate structure. This research path is presented in Chapter 6. While ME-NML is an extremely interesting technology, it needs to be combined with other faster technologies to have a real competitive system. Signal interfaces between NML and other technologies (mainly CMOS) have been rarely presented in literature. A mixed-technology multiplexer is designed and presented as the basis for a CMOS to NML interface. The reverse interface (from ME-NML to CMOS) is instead based on a sensing circuit for the Faraday effect: a change in the polarization of a magnet induces an electric field that can be used to generate an input signal for a CMOS circuit. This research path is presented in Chapter 7. The research work presented in this thesis represents a fundamental milestone in the path towards nanotechnologies. The most important achievement is the de- sign and simulation of complex circuits with NML, benchmarking this technology with real application examples. The characterization of a technology considering complex functions is a major step to be performed and that has not yet been ad- dressed in literature for NML. Indeed, only in this way it is possible to intercept in advance any weakness of NanoMagnet Logic that cannot be discovered consid- ering only small circuits. Moreover, the architectural improvements introduced in this thesis, although technology-driven, can be actually applied to any technology. We have demonstrated the advantages that can derive applying them to CMOS cir- cuits. This thesis represents therefore a major step in two directions: the first is the enhancement of NML technology; the second is a general improvement of parallel architectures and the development of the new Logic-In-Memory paradigm

    16-Bit Clocked Adiabatic Logic (CAL) logarithmic signal processor

    No full text
    This paper describes a 16-bit Logarithmic Signal Processor and its implementation using Clocked Adiabatic Logic (CAL). The proposed architectures for Lin2Log and Log2Lin converters are based on a linear interpolation algorithm. The CAL-Logarithmic Signal Processor has been designed using an AMS 0.35 μm CMOS process and consumes an area of 7.3 μm 2 . Spice simulations have shown that the circuit can operate at frequencies up to 250MHz and energy calculation have indicated 211.34 pJ power consumption at the maximum operating frequency

    Predicting power scalability in a reconfigurable platform

    Get PDF
    This thesis focuses on the evolution of digital hardware systems. A reconfigurable platform is proposed and analysed based on thin-body, fully-depleted silicon-on-insulator Schottky-barrier transistors with metal gates and silicide source/drain (TBFDSBSOI). These offer the potential for simplified processing that will allow them to reach ultimate nanoscale gate dimensions. Technology CAD was used to show that the threshold voltage in TBFDSBSOI devices will be controllable by gate potentials that scale down with the channel dimensions while remaining within appropriate gate reliability limits. SPICE simulations determined that the magnitude of the threshold shift predicted by TCAD software would be sufficient to control the logic configuration of a simple, regular array of these TBFDSBSOI transistors as well as to constrain its overall subthreshold power growth. Using these devices, a reconfigurable platform is proposed based on a regular 6-input, 6-output NOR LUT block in which the logic and configuration functions of the array are mapped onto separate gates of the double-gate device. A new analytic model of the relationship between power (P), area (A) and performance (T) has been developed based on a simple VLSI complexity metric of the form ATσ = constant. As σ defines the performance “return” gained as a result of an increase in area, it also represents a bound on the architectural options available in power-scalable digital systems. This analytic model was used to determine that simple computing functions mapped to the reconfigurable platform will exhibit continuous power-area-performance scaling behavior. A number of simple arithmetic circuits were mapped to the array and their delay and subthreshold leakage analysed over a representative range of supply and threshold voltages, thus determining a worse-case range for the device/circuit-level parameters of the model. Finally, an architectural simulation was built in VHDL-AMS. The frequency scaling described by σ, combined with the device/circuit-level parameters predicts the overall power and performance scaling of parallel architectures mapped to the array

    NASA Tech Briefs, January 1993

    Get PDF
    Topics include: Electronic Components and Circuits; Electronic Systems; Physical Sciences; Materials; Computer Programs; Mechanics; Machinery; Fabrication Technology; Mathematics and Information Sciences; Life Sciences
    corecore