Search CORE

1,132 research outputs found

Limits on Fundamental Limits to Computation

Author: Markov Igor L.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014
Field of study

An indispensable part of our lives, computing has also become essential to industries and governments. Steady improvements in computer hardware have been supported by periodic doubling of transistor densities in integrated circuits over the last fifty years. Such Moore scaling now requires increasingly heroic efforts, stimulating research in alternative hardware and stirring controversy. To help evaluate emerging technologies and enrich our understanding of integrated-circuit scaling, we review fundamental limits to computation: in manufacturing, energy, physical space, design and verification effort, and algorithms. To outline what is achievable in principle and in practice, we recall how some limits were circumvented, compare loose and tight limits. We also point out that engineering difficulties encountered by emerging technologies may indicate yet-unknown limits.Comment: 15 pages, 4 figures, 1 tabl

arXiv.org e-Print Archive

CiteSeerX

CAD Tool Design for NCL and MTNCL Asynchronous Circuits

Author: Pillai Vijay Mani
Publication venue: ScholarWorks@UARK
Publication date: 01/08/2013
Field of study

This thesis presents an implementation of a method developed to readily convert Boolean designs into an ultra-low power asynchronous design methodology called MTNCL, which combines multi-threshold CMOS (MTCMOS) with NULL Convention Logic (NCL) systems. MTNCL provides the leakage power advantages of an all high-Vt implementation with a reasonable speed penalty compared to the all low-Vt implementation, and has negligible area overhead. The proposed tool utilizes industry-standard CAD tools. This research also presents an Automated Gate-Level Pipelining with Bit-Wise Completion (AGLPBW) method to maximize throughput of delay-insensitive full-word pipelined NCL circuits. These methods have been integrated into the Mentor Graphics and Synopsis CAD tools, using a C-program, which performs the majority of the computations, such that the method can be easily ported to other CAD tool suites. Both methods have been successfully tested on circuits, including a 4-bit × 4-bit multiplier, an unsigned Booth2 multiplier, and a 4-bit/8-operation arithmetic logic unit (ALU

ScholarWorks@UARK

UARK (University of Arkansas )

Energy Aware Design and Analysis for Synchronous and Asynchronous Circuits

Author: Di Jia
Publication venue: University of Central Florida
Publication date: 01/01/2004
Field of study

Power dissipation has become a major concern for IC designers. Various low power design techniques have been developed for synchronous circuits. Asynchronous circuits, however. have gained more interests recently due to their benefits in lower noise, easy timing control, etc. But few publications on energy reduction techniques for asynchronous logic are available. Power awareness indicates the ability of the system power to scale with changing conditions and quality requirements. Scalability is an important figure-of-merit since it allows the end user to implement operational policy. just like the user of mobile multimedia equipment needs to select between better quality and longer battery operation time. This dissertation discusses power/energy optimization and performs analysis on both synchronous and asynchronous logic. The major contributions of this dissertation include: 1 ) A 2-Dimensional Pipeline Gating technique for synchronous pipelined circuits to improve their power awareness has been proposed. This technique gates the corresponding clock lines connected to registers in both vertical direction (the data flow direction) and horizontal direction (registers within each pipeline stage) based on current input precision. 2) Two energy reduction techniques, Signal Bypassing & Insertion and Zero Insertion. have been developed for NCL circuits. Both techniques use Nulls to replace redundant Data 0\u27s based on current input precision in order to reduce the switching activity while Signal Bypassing & Insertion is for non-pipelined NCI, circuits and Zero Insertion is for pipelined counterparts. A dynamic active-bit detection scheme is also developed as an expansion. 3) Two energy estimation techniques, Equivalent Inverter Modeling based on Input Mapping in transistor-level and Switching Activity Modeling in gate-level, have been proposed. The former one is for CMOS gates with feedbacks and the latter one is for NCL circuits

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Designing energy-efficient computing systems using equalization and machine learning

Author: Takhirov Zafar
Publication venue
Publication date: 20/02/2018
Field of study

As technology scaling slows down in the nanometer CMOS regime and mobile computing becomes more ubiquitous, designing energy-efficient hardware for mobile systems is becoming increasingly critical and challenging. Although various approaches like near-threshold computing (NTC), aggressive voltage scaling with shadow latches, etc. have been proposed to get the most out of limited battery life, there is still no “silver bullet” to increasing power-performance demands of the mobile systems. Moreover, given that a mobile system could operate in a variety of environmental conditions, like different temperatures, have varying performance requirements, etc., there is a growing need for designing tunable/reconfigurable systems in order to achieve energy-efficient operation. In this work we propose to address the energy- efficiency problem of mobile systems using two different approaches: circuit tunability and distributed adaptive algorithms. Inspired by the communication systems, we developed feedback equalization based digital logic that changes the threshold of its gates based on the input pattern. We showed that feedback equalization in static complementary CMOS logic enabled up to 20% reduction in energy dissipation while maintaining the performance metrics. We also achieved 30% reduction in energy dissipation for pass-transistor digital logic (PTL) with equalization while maintaining performance. In addition, we proposed a mechanism that leverages feedback equalization techniques to achieve near optimal operation of static complementary CMOS logic blocks over the entire voltage range from near threshold supply voltage to nominal supply voltage. Using energy-delay product (EDP) as a metric we analyzed the use of the feedback equalizer as part of various sequential computational blocks. Our analysis shows that for near-threshold voltage operation, when equalization was used, we can improve the operating frequency by up to 30%, while the energy increase was less than 15%, with an overall EDP reduction of ≈10%. We also observe an EDP reduction of close to 5% across entire above-threshold voltage range. On the distributed adaptive algorithm front, we explored energy-efficient hardware implementation of machine learning algorithms. We proposed an adaptive classifier that leverages the wide variability in data complexity to enable energy-efficient data classification operations for mobile systems. Our approach takes advantage of varying classification hardness across data to dynamically allocate resources and improve energy efficiency. On average, our adaptive classifier is ≈100× more energy efficient but has ≈1% higher error rate than a complex radial basis function classifier and is ≈10× less energy efficient but has ≈40% lower error rate than a simple linear classifier across a wide range of classification data sets. We also developed a field of groves (FoG) implementation of random forests (RF) that achieves an accuracy comparable to Convolutional Neural Networks (CNN) and Support Vector Machines (SVM) under tight energy budgets. The FoG architecture takes advantage of the fact that in random forests a small portion of the weak classifiers (decision trees) might be sufficient to achieve high statistical performance. By dividing the random forest into smaller forests (Groves), and conditionally executing the rest of the forest, FoG is able to achieve much higher energy efficiency levels for comparable error rates. We also take advantage of the distributed nature of the FoG to achieve high level of parallelism. Our evaluation shows that at maximum achievable accuracies FoG consumes ≈1.48×, ≈24×, ≈2.5×, and ≈34.7× lower energy per classification compared to conventional RF, SVM-RBF , Multi-Layer Perceptron Network (MLP), and CNN, respectively. FoG is 6.5× less energy efficient than SVM-LR, but achieves 18% higher accuracy on average across all considered datasets

Boston University Institutional Repository (OpenBU)

Estimation of power generation potential of nonwoody biomass species

Author: Mishra Shankar
Publication venue
Publication date: 01/01/2007
Field of study

In view of high energy potentials in non-woody biomass species and an increasing interest in their utilization for power generation, an attempt has been made in this study to assess the proximate analysis and energy content of different components of Sida rhombifolia, Xanthium strumarium, Anisomeles lamiaceae and Eupatorium coelestinum biomass species (both non-woody), and their impact on power generation and land requirement for energy plantations. The net energy content in Sida is the highest. Xanthium biomass species appears to have slightly higher calorific values in its components than those of Anisomeles and Eupatorium. The pattern of variation of calorific value in the components like stump, branch, leaf and bark is not identical for all the presently studied biomass species. In all these studied biomass species, the calorific values of leaves, in general, are after stump and branch. The data for proximate and ultimate analysis of the components of these species are very close to each other and hence it is very difficult to draw a concrete conclusion. However, it appears from the present work that Eupatorium and Xanthium biomass species have the highest fixed carbon and lowest volatile matter contents in their stumps than the stumps of the others. As for ash fusion temperature The Sida biomass species has the highest values of IDT, ST, HT and FT (7860- 1490˚C) for its ash, followed by Anisomeles (740-1441˚C). Xanthium and Eupatorium have lower values for IDT, ST, HT and FT (670- 1244˚C) for their ashes. The results have shown that approximately 4, 7, 6 and 2 hectares of land are required to generate 20,000 kWh/day electricity from Sida rhombifolia, Eupatorium coelestinum, Xanthium strumarium and Anisomeles lamiaceae biomass species. Coal samples, obtained from six different local mines, were also examined for their qualities and the results were compared with those of studied biomass materials. This comparison reveals much higher power output with negligible emission of suspended particulate matters (SPM) from biomass materials

ethesis@nitr

A low power selective median filter design

Author: Dalai Radhamadhab
Publication venue
Publication date: 01/01/2008
Field of study

A selective median filter which consumes less power has been designed and different logics for majority bit evaluation has been applied and simulated in VHDL .It is rightly called as selective because an edge pixel detector [2] has been used to select those pixels which are to be processed through median filter. As for median value calculation; sorting of 3 x 3 window’s pixel values has been done using majority bit circuit [4].Different majority bit calculation method has been implemented and the result sorting circuit has been analyzed for power analysis. In this work a general median filter which uses binary sorting method known as Majority Voting Circuit (MVC) has been designed using VHDL and optimized using SYNOPSIS which has used 0.13μm CMOS technology .The digital design of sorting circuit saves approximately 60% of power comprising of cell leakage and dynamic power comparing to a mixed signal design of Floating gate based Majority bit median filter [4]. Before operating median filter on each pixel double derivative filter [2] has been applied to check whether it is an edge pixel or not. Overall this is a digital design of a mixed filter which preserves edges and removes noises as well.Low power techniques at logic level and algorithmic level have been embedded into this work. In our work we have also designed a small microprocessor using VHDL code. Later a memory (for the purpose of image storing) based Control Unit for single median value evaluation has been designed and simulated in XILINX. Here for sorting circuit a common logic based circuit (component) has been put forward. The power, latency or delay, area of whole design has been compared and tested with other designs

ethesis@nitr

Radiation Hardened by Design Methodologies for Soft-Error Mitigated Digital Architectures

Author
Publication venue
Publication date: 01/01/2017
Field of study

abstract: Digital architectures for data encryption, processing, clock synthesis, data transfer, etc. are susceptible to radiation induced soft errors due to charge collection in complementary metal oxide semiconductor (CMOS) integrated circuits (ICs). Radiation hardening by design (RHBD) techniques such as double modular redundancy (DMR) and triple modular redundancy (TMR) are used for error detection and correction respectively in such architectures. Multiple node charge collection (MNCC) causes domain crossing errors (DCE) which can render the redundancy ineffectual. This dissertation describes techniques to ensure DCE mitigation with statistical confidence for various designs. Both sequential and combinatorial logic are separated using these custom and computer aided design (CAD) methodologies. Radiation vulnerability and design overhead are studied on VLSI sub-systems including an advanced encryption standard (AES) which is DCE mitigated using module level coarse separation on a 90-nm process with 99.999% DCE mitigation. A radiation hardened microprocessor (HERMES2) is implemented in both 90-nm and 55-nm technologies with an interleaved separation methodology with 99.99% DCE mitigation while achieving 4.9% increased cell density, 28.5 % reduced routing and 5.6% reduced power dissipation over the module fences implementation. A DMR register-file (RF) is implemented in 55 nm process and used in the HERMES2 microprocessor. The RF array custom design and the decoders APR designed are explored with a focus on design cycle time. Quality of results (QOR) is studied from power, performance, area and reliability (PPAR) perspective to ascertain the improvement over other design techniques. A radiation hardened all-digital multiplying pulsed digital delay line (DDL) is designed for double data rate (DDR2/3) applications for data eye centering during high speed off-chip data transfer. The effect of noise, radiation particle strikes and statistical variation on the designed DDL are studied in detail. The design achieves the best in class 22.4 ps peak-to-peak jitter, 100-850 MHz range at 14 pJ/cycle energy consumption. Vulnerability of the non-hardened design is characterized and portions of the redundant DDL are separated in custom and auto-place and route (APR). Thus, a range of designs for mission critical applications are implemented using methodologies proposed in this work and their potential PPAR benefits explored in detail.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

ASU Digital Repository

Energy-Efficient Digital Circuit Design using Threshold Logic Gates

Author
Publication venue
Publication date: 01/01/2015
Field of study

abstract: Improving energy efficiency has always been the prime objective of the custom and automated digital circuit design techniques. As a result, a multitude of methods to reduce power without sacrificing performance have been proposed. However, as the field of design automation has matured over the last few decades, there have been no new automated design techniques, that can provide considerable improvements in circuit power, leakage and area. Although emerging nano-devices are expected to replace the existing MOSFET devices, they are far from being as mature as semiconductor devices and their full potential and promises are many years away from being practical. The research described in this dissertation consists of four main parts. First is a new circuit architecture of a differential threshold logic flipflop called PNAND. The PNAND gate is an edge-triggered multi-input sequential cell whose next state function is a threshold function of its inputs. Second a new approach, called hybridization, that replaces flipflops and parts of their logic cones with PNAND cells is described. The resulting \hybrid circuit, which consists of conventional logic cells and PNANDs, is shown to have significantly less power consumption, smaller area, less standby power and less power variation. Third, a new architecture of a field programmable array, called field programmable threshold logic array (FPTLA), in which the standard lookup table (LUT) is replaced by a PNAND is described. The FPTLA is shown to have as much as 50% lower energy-delay product compared to conventional FPGA using well known FPGA modeling tool called VPR. Fourth, a novel clock skewing technique that makes use of the completion detection feature of the differential mode flipflops is described. This clock skewing method improves the area and power of the ASIC circuits by increasing slack on timing paths. An additional advantage of this method is the elimination of hold time violation on given short paths. Several circuit design methodologies such as retiming and asynchronous circuit design can use the proposed threshold logic gate effectively. Therefore, the use of threshold logic flipflops in conventional design methodologies opens new avenues of research towards more energy-efficient circuits.Dissertation/ThesisDoctoral Dissertation Computer Science 201

ASU Digital Repository

CMOS circuit speed and power optimization using simplified RC delay model

Author: Lakkakula Sunil Kumar
Publication venue
Publication date: 01/05/2015
Field of study

A simplified RC delay model which is expressed explicitly in terms of transistor widths is presented to easily perform circuit analysis and quickly optimize the transistor widths for delay and power without having to do tedious layout or schematic simulation. A novel heuristic gradient descent method is used to effectively solve the optimization problem. Popular parallel prefix adders such as Brent-Kung, Skylansky and Kogge-Stone adders are modeled using the proposed simplified RC delay model and a power-delay performance comparison is done after optimization to show the ability of the model. The optimization results suggest that the Brent-Kung adder is more efficient in terms of both delay and power by having the lowest power-delay product followed by the Skylansky adder and then the Kogge-Stone adder

SHAREOK repository