Search CORE

753 research outputs found

Designing energy-efficient computing systems using equalization and machine learning

Author: Takhirov Zafar
Publication venue
Publication date: 20/02/2018
Field of study

As technology scaling slows down in the nanometer CMOS regime and mobile computing becomes more ubiquitous, designing energy-efficient hardware for mobile systems is becoming increasingly critical and challenging. Although various approaches like near-threshold computing (NTC), aggressive voltage scaling with shadow latches, etc. have been proposed to get the most out of limited battery life, there is still no “silver bullet” to increasing power-performance demands of the mobile systems. Moreover, given that a mobile system could operate in a variety of environmental conditions, like different temperatures, have varying performance requirements, etc., there is a growing need for designing tunable/reconfigurable systems in order to achieve energy-efficient operation. In this work we propose to address the energy- efficiency problem of mobile systems using two different approaches: circuit tunability and distributed adaptive algorithms. Inspired by the communication systems, we developed feedback equalization based digital logic that changes the threshold of its gates based on the input pattern. We showed that feedback equalization in static complementary CMOS logic enabled up to 20% reduction in energy dissipation while maintaining the performance metrics. We also achieved 30% reduction in energy dissipation for pass-transistor digital logic (PTL) with equalization while maintaining performance. In addition, we proposed a mechanism that leverages feedback equalization techniques to achieve near optimal operation of static complementary CMOS logic blocks over the entire voltage range from near threshold supply voltage to nominal supply voltage. Using energy-delay product (EDP) as a metric we analyzed the use of the feedback equalizer as part of various sequential computational blocks. Our analysis shows that for near-threshold voltage operation, when equalization was used, we can improve the operating frequency by up to 30%, while the energy increase was less than 15%, with an overall EDP reduction of ≈10%. We also observe an EDP reduction of close to 5% across entire above-threshold voltage range. On the distributed adaptive algorithm front, we explored energy-efficient hardware implementation of machine learning algorithms. We proposed an adaptive classifier that leverages the wide variability in data complexity to enable energy-efficient data classification operations for mobile systems. Our approach takes advantage of varying classification hardness across data to dynamically allocate resources and improve energy efficiency. On average, our adaptive classifier is ≈100× more energy efficient but has ≈1% higher error rate than a complex radial basis function classifier and is ≈10× less energy efficient but has ≈40% lower error rate than a simple linear classifier across a wide range of classification data sets. We also developed a field of groves (FoG) implementation of random forests (RF) that achieves an accuracy comparable to Convolutional Neural Networks (CNN) and Support Vector Machines (SVM) under tight energy budgets. The FoG architecture takes advantage of the fact that in random forests a small portion of the weak classifiers (decision trees) might be sufficient to achieve high statistical performance. By dividing the random forest into smaller forests (Groves), and conditionally executing the rest of the forest, FoG is able to achieve much higher energy efficiency levels for comparable error rates. We also take advantage of the distributed nature of the FoG to achieve high level of parallelism. Our evaluation shows that at maximum achievable accuracies FoG consumes ≈1.48×, ≈24×, ≈2.5×, and ≈34.7× lower energy per classification compared to conventional RF, SVM-RBF , Multi-Layer Perceptron Network (MLP), and CNN, respectively. FoG is 6.5× less energy efficient than SVM-LR, but achieves 18% higher accuracy on average across all considered datasets

Boston University Institutional Repository (OpenBU)

Study of Radiation Tolerant Storage Cells for Digital Systems

Author: Li Zongru
Publication venue: 'University of Saskatchewan Library'
Publication date: 02/10/2023
Field of study

Single event upsets (SEUs) are a significant reliability issue in semiconductor devices. Fully Depleted Silicon-on-Insulator (FDSOI) technologies have been shown to exhibit better SEU performance compared to bulk technologies. This is attributed to the thin Silicon (Si) layer on top of a Buried Oxide (BOX) layer, which allows each transistor to function as an insulated Si island, thus reducing the threat of charge-sharing. Moreover, the small volume of the Si in FDSOI devices results in a reduction of the amount of charge induced by an ion strike. The effects of Total Ionizing Dose (TID) on integrated circuits (ICs) can lead to changes in gate propagation delays, leakage currents, and device functionality. When IC circuits are exposed to ionizing radiation, positive charges accumulate in the gate oxide and field oxide layers, which results in reduced gate control and increased leakage current. TID effects in bulk technologies are usually simpler due to the presence of only one gate oxide layer, but FDSOI technologies have a more complex response to TID effects because of the additional BOX layer. In this research, we aim to address the challenges of developing cost-effective electronics for space applications by bridging the gap between expensive space-qualified components and high-performance commercial technologies. Key research questions involve exploring various radiation-hardening-by-design (RHBD) techniques and their trade-offs, as well as investigating the feasibility of radiation-hardened microcontrollers. The effectiveness of RHBD techniques in mitigating soft errors is well-established. In our study, a test chip was designed using the 22-nm FDSOI process, incorporating multiple RHBD Flip-Flop (FF) chains alongside a conventional FF chain. Three distinct types of ring oscillators (ROs) and a 256 kbit SRAM was also fabricated in the test chip. To evaluate the SEU and TID performance of these designs, we conducted multiple irradiation experiments with alpha particles, heavy ions, and gamma-rays. Alpha particle irradiation tests were carried out at the University of Saskatchewan using an Americium-241 alpha source. Heavy ion experiments were performed at the Texas A&M University Cyclotron Institute, utilizing Ne, Ar, Cu, and Ag in a 15 MeV/amu cocktail. Lastly, TID experiments were conducted using a Gammacell 220 Co-60 chamber at the University of Saskatchewan. By evaluating the performance of these designs under various irradiation conditions, we strive to advance the development of cost-effective, high-performance electronics suitable for space applications, ultimately demonstrating the significance of this project. When exposed to heavy ions, radiation-hardened FFs demonstrated varying levels of improvement in SEU performance, albeit with added power and timing penalties compared to conventional designs. Stacked-transistor DFF designs showed significant enhancement, while charge-cancelling and interleaving techniques further reduced upsets. Guard-gate (GG) based FF designs provided additional SEU protection, with the DFR-FF and GG-DICE FF designs showing zero upsets under all test conditions. Schmitt-trigger-based DFF designs exhibited improved SEU performance, making them attractive choices for hardening applications. The 22-nm FDSOI process proved more resilient to TID effects than the 28-nm process; however, TID effects remained prominent, with increased leakage current and SRAM block degradation at high doses. These findings offer valuable insights for designers aiming to meet performance and SER specifications for circuits in radiation environments, emphasizing the need for additional attention during the design phase for complex radiation-hardened circuits

University of Saskatchewan Research Archive

Applications of memristors in conventional analogue electronics

Author: Berdan Radu
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/12/2016
Field of study

This dissertation presents the steps employed to activate and utilise analogue memristive devices in conventional analogue circuits and beyond. TiO2 memristors are mainly utilised in this study, and their large variability in operation in between similar devices is identified. A specialised memristor characterisation instrument is designed and built to mitigate this issue and to allow access to large numbers of devices at a time. Its performance is quantified against linear resistors, crossbars of linear resistors, stand-alone memristive elements and crossbars of memristors. This platform allows for a wide range of different pulsing algorithms to be applied on individual devices, or on crossbars of memristive elements, and is used throughout this dissertation. Different ways of achieving analogue resistive switching from any device state are presented. Results of these are used to devise a state-of-art biasing parameter finder which automatically extracts pulsing parameters that induce repeatable analogue resistive switching. IV measurements taken during analogue resistive switching are then utilised to model the internal atomic structure of two devices, via fittings by the Simmons tunnelling barrier model. These reveal that voltage pulses modulate a nano-tunnelling gap along a conical shape. Further retention measurements are performed which reveal that under certain conditions, TiO2 memristors become volatile at short time scales. This volatile behaviour is then implemented into a novel SPICE volatile memristor model. These characterisation methods of solid-state devices allowed for inclusion of TiO2 memristors in practical electronic circuits. Firstly, in the context of large analogue resistive crossbars, a crosspoint reading method is analysed and improved via a 3-step technique. Its scaling performance is then quantified via SPICE simulations. Next, the observed volatile dynamics of memristors are exploited in two separate sequence detectors, with applications in neuromorphic engineering. Finally, the memristor as a programmable resistive weight is exploited to synthesise a memristive programmable gain amplifier and a practical memristive automatic gain control circuit.Open Acces

Spiral - Imperial College Digital Repository

Designing energy-efficient sub-threshold logic circuits using equalization and non-volatile memory circuits using memristors

Author: Zangeneh Mahmoud
Publication venue
Publication date: 08/04/2016
Field of study

The very large scale integration (VLSI) community has utilized aggressive complementary metal-oxide semiconductor (CMOS) technology scaling to meet the ever-increasing performance requirements of computing systems. However, as we enter the nanoscale regime, the prevalent process variation effects degrade the CMOS device reliability. Hence, it is increasingly essential to explore emerging technologies which are compatible with the conventional CMOS process for designing highly-dense memory/logic circuits. Memristor technology is being explored as a potential candidate in designing non-volatile memory arrays and logic circuits with high density, low latency and small energy consumption. In this thesis, we present the detailed functionality of multi-bit 1-Transistor 1-memRistor (1T1R) cell-based memory arrays. We present the performance and energy models for an individual 1T1R memory cell and the memory array as a whole. We have considered TiO2- and HfOx-based memristors, and for these technologies there is a sub-10% difference between energy and performance computed using our models and HSPICE simulations. Using a performance-driven design approach, the energy-optimized TiO2-based RRAM array consumes the least write energy (4.06 pJ/bit) and read energy (188 fJ/bit) when storing 3 bits/cell for 100 nsec write and 1 nsec read access times. Similarly, HfOx-based RRAM array consumes the least write energy (365 fJ/bit) and read energy (173 fJ/bit) when storing 3 bits/cell for 1 nsec write and 200 nsec read access times. On the logic side, we investigate the use of equalization techniques to improve the energy efficiency of digital sequential logic circuits in sub-threshold regime. We first propose the use of a variable threshold feedback equalizer circuit with combinational logic blocks to mitigate the timing errors in digital logic designed in sub-threshold regime. This mitigation of timing errors can be leveraged to reduce the dominant leakage energy by scaling supply voltage or decreasing the propagation delay. At the fixed supply voltage, we can decrease the propagation delay of the critical path in a combinational logic block using equalizer circuits and, correspondingly decrease the leakage energy consumption. For a 8-bit carry lookahead adder designed in UMC 130 nm process, the operating frequency can be increased by 22.87% (on average), while reducing the leakage energy by 22.6% (on average) in the sub-threshold regime. Overall, the feedback equalization technique provides up to 35.4% lower energy-delay product compared to the conventional non-equalized logic. We also propose a tunable adaptive feedback equalizer circuit that can be used with sequential digital logic to mitigate the process variation effects and reduce the dominant leakage energy component in sub-threshold digital logic circuits. For a 64-bit adder designed in 130 nm our proposed approach can reduce the normalized delay variation of the critical path delay from 16.1% to 11.4% while reducing the energy-delay product by 25.83% at minimum energy supply voltage. In addition, we present detailed energy-performance models of the adaptive feedback equalizer circuit. This work serves as a foundation for the design of robust, energy-efficient digital logic circuits in sub-threshold regime

Boston University Institutional Repository (OpenBU)

Impacto da variabilidade PVT em somadores construídos com XORs

Author: Silva Fabio Gustavo Rossato Gomes da
Publication venue
Publication date: 01/01/2020
Field of study

A operação de soma é a mais usada em Unidades Lógicas e Aritméticas (ULA). A ULA é a unidade mais importante no processamento de dados. Em sistemas digitais, é desejado um somador completo com baixo consumo de energia e um alto desempenho. O somador completo faz parte do caminho crítico em sistemas computacionais, ele pode ser implementado de diversas maneiras, a maioria delas tendo como seu principal sub-circuito a porta lógica OU-exclusivo (XOR). Consequentemente, o estudo de somadores completos compostos por combinações de portas lógicas XOR é de grande valia para pesquisas na literatura. Melhorias nos módulos aritméticos pode reduzir significativamente o consumo de potência dos sistemas, mas em tecnologias nanométricas é necessário considerar o impacto da variabilidade. Esse trabalho tem como objetivo analisar projetos de somadores completos que quando submetidos aos efeitos de variabilidade devem ser robustos, ter um bom desempenho e mostrar bons resultados em consumo de energia, quando estão operando em tensão nominal e em tensão de quase limiar. Além disso, foi utilizada uma técnica chamada de célula de desacoplamento (Dcell) visando uma alternativa para a redução da variabilidade de processo. Esse trabalho analisa e compara 4 somadores tradicionais e 9 somadores completos construídos através de 3 blocos lógicos, dos quais 2 deles são substituídos por portas lógicas XOR, em uma tecnologia FinFET de 7nm. Foi observado que circuitos somadores que foram construídos usando a XOR da família lógica CMOS, especialmente no segundo bloco, obtiveram piores resultados de desempenho e consumo energético. Somadores operando em tensão nominal são cerca de 80% mais robustos quanto ao impacto da variabilidade de processo no consumo máximo. A operação em quase limiar implica em uma alta sensibilidade no desempenho e consumo, alcançando mais de 300% nos piores casos. Em relação à variabilidade de processo, foi verificado um aumento de sensibilidade de cerca de 40% no desempenho quando foram utilizadas a XOR V5 e a XOR V8 no segundo bloco dos somadores quando operando em tensão nominal. Para a operação em tensão de quase limiar o uso da metodologia proposta nesse trabalho mostrou ser uma boa opção para alcançar uma maior robustez quanto ao consumo dos circuitos. Considerando o uso da Dcell, na operação em tensão nominal, foi verificado uma redução no desempenho juntamente com uma redução na variabilidade. O melhor caso foi o somador FAV5V8 que para um aumento de 20% no atraso, obteve uma redução de 20% na variabilidade. Em relação ao consumo, houve uma redução de 16% na potência dinâmica, juntamente com uma redução de quase 30% na variabilidade, como o que ocorreu com o somador FAV8V1. Foi possível observar casos de redução da variabilidade em mais de 40% com um pequeno aumento no consumo dinâmico. O uso dessa técnica teve um alto impacto nos resultados de circuitos que operavam em tensão de quase limiar, chegando em alguns casos a mais de 40% de redução do desempenho para uma pequena redução na variabilidade. Quanto ao consumo, nesse caso, os somadores tradicionais foram os menos afetados, e novamente o uso da XOR V8 no segundo bloco para construção dos somadores mostrou ser uma boa opção para aumento da robustez dos circuitos.The sum operation is the most used in the Arithmetic and Logic Units (ALU). In digital systems, a complete adder with low energy consumption and high performance is desired. The full adder is part of the critical path in computer systems. It can be implemented in several ways, most of them having the OR-exclusive logic gate (XOR) as its main sub-circuit. Consequently, the study of full adders composed of combinations of XOR logic gates has a great value in the literature. Improvements in arithmetic modules can significantly reduce the power consumption of systems, however, in nanometric technologies it is necessary to consider the impact of variability. This work aims to analyse designs of full adders considering variability effects, comparing performance and energy consumption when operating at nominal voltage and also at near threshold voltage. In addition, a technique called decoupling cell (Dcell) was used to provide an alternative for reducing process variability. This work analyses and compares four traditional adders and nine adders built using three logic blocks, where two of them are replaced by XOR logic gates, in a 7nm FinFET technology. It was observed that full adders that were built using the XOR of the CMOS logic family, especially in the second block, had worse results in performance and energy consumption. Full adders operating at nominal voltage regime are about 80% more robust in terms of the impact of process variability on maximum consumption. The near threshold operation implies a high sensitivity in performance and consumption, reaching more than 300% in the worst cases. Regarding the process variability, there was an increase in sensitivity of about 40% in performance when the XOR V5 and XOR V8 were used in the second block of the adder when operating at nominal voltage. For the voltage operation of near threshold, the use of the methodology proposed in this work demonstrate to be a good option to achieve greater robustness regarding the consumption of the circuits. Considering the use of Dcell, in the operation at nominal voltage, a reduction in performance was verified together with a reduction in variability. The best case was the adder FAV5V8 which for a 20% increase in delay, obtained a reduction of 20% in variability. In relation to dynamic consumption, there was a 16% reduction in power, together with a reduction of almost 30% in variability, as occurred with the FAV8V1 adder. It was possible to observe cases of reduced variability by more than 40% with a small increase in dynamic consumption. The use of this technique had a high impact on the results of circuits operating at near threshold voltage, in some cases reaching more than 40% reduction in performance for a small reduction in variability. For consumption, in this case, the traditional full adders were the least affected, and again the use of the XOR V8 in the second block for the construction of the adder proved to be a good option for increasing the robustness of the circuits

Lume 5.8

Unreliable Silicon: Circuit through System-Level Techniques for Mitigating the Adverse Effects of Process Variation, Device Degradation and Environmental Conditions.

Author: Karl Eric Alexander
Publication venue
Publication date
Field of study

Designing and manufacturing integrated circuits in advanced, highly-scaled processing technologies that meet stringent specification sets is an increasingly unreliable proposition. Dimensional processing variations, time and stress dependent device degradation and potentially varying environmental conditions exacerbate deviations in performance, power and even functionality of integrated circuits. This work explores a system-level adaptive design philosophy intended to mitigate the power and performance impact of unreliable silicon devices and presents enabling circuits for SRAM variation mitigation and in-situ measurement of device degradation in 130nm and 45nm processing technologies. An adaptation of RAZOR-based DVS designed for on-chip memory power reduction and reliability lifetime improvement enables the elimination of 250 mV of voltage margin in a 1.8V design, with up to 500 mV of reduction when allowing 5% of memory operations to use multiple cycles. A novel PID-controlled dynamic reliability management (DRM) system is presented, allowing user-specified circuit lifetime to be dynamically managed via dynamic voltage and frequency scaling. Peak performance improvement of 20-35% is achievable in typical processing systems by allowing brief periods of elevated voltage operation through the real-time DRM system, while minimizing voltage during non-critical periods of operation to maximize circuit lifetime. A probabilistic analysis of oxide breakdown using the percolation model indicates the need for 1000-2000 integrated in-situ sensors to achieve oxide lifetime prediction error at or under 10%. The conclusions from the oxide analysis are used to guide the design of a series of novel on-chip reliability monitoring circuits for use in a real-time DRM system. A 130nm in-situ oxide breakdown measurement sensor presented is the first published design of an oxide-breakdown oriented circuit and is compatible with standard-cell style automatic “place and route” design styles used in the majority of application specific integrated circuit designs. Measured results show increases in gate oxide leakage of 14-35% after accelerated stress testing. A second generation design of the on-chip oxide degradation sensor is presented that reduces stress mode power consumption by 111,785X over the initial design while providing an ideal 1:1 mapping of gate leakage to output frequency in extracted simulations.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/60701/1/ekarl_1.pd

Deep Blue Documents at the University of Michigan

A fault-tolerant last level cache for CMPs operating at ultra-low voltage

Author: Alastruey Jesús
Ferrerón Alexandra
Ibáñez Marín Pablo Enrique
Monreal Arnal Teresa
Suárez Gracía Dario
Viñals Yúfera Víctor
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019
Field of study

Voltage scaling to values near the threshold voltage is a promising technique to hold off the many-core power wall. However, as voltage decreases, some SRAM cells are unable to operate reliably and show a behavior consistent with a hard fault. Block disabling is a micro-architectural technique that allows low-voltage operation by deactivating faulty cache entries, at the expense of reducing the effective cache capacity. In the case of the last-level cache, this capacity reduction leads to an increase in off-chip memory accesses, diminishing the overall energy benefit of reducing the voltage supply. In this work, we exploit the reuse locality and the intrinsic redundancy of multi-level inclusive hierarchies to enhance the performance of block disabling with negligible cost. The proposed fault-aware last-level cache management policy maps critical blocks, those not present in private caches and with a higher probability of being reused, to active cache entries. Our evaluation shows that this fault-aware management results in up to 37.3% and 54.2% fewer misses per kilo instruction (MPKI) than block disabling for multiprogrammed and parallel workloads, respectively. This translates to performance enhancements of up to 13% and 34.6% for multiprogrammed and parallel workloads, respectively.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Repositorio Universidad de Zaragoza

Special Topics in Information Technology

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/11/2022
Field of study

This open access book presents outstanding doctoral dissertations in Information Technology from the Department of Electronics, Information and Bioengineering, Politecnico di Milano, Italy. Information Technology has always been highly interdisciplinary, as many aspects have to be considered in IT systems. The doctoral studies program in IT at Politecnico di Milano emphasizes this interdisciplinary nature, which is becoming more and more important in recent technological advances, in collaborative projects, and in the education of young researchers. Accordingly, the focus of advanced research is on pursuing a rigorous approach to specific research topics starting from a broad background in various areas of Information Technology, especially Computer Science and Engineering, Electronics, Systems and Control, and Telecommunications. Each year, more than 50 PhDs graduate from the program. This book gathers the outcomes of the best theses defended in 2021-22 and selected for the IT PhD Award. Each of the authors provides a chapter summarizing his/her findings, including an introduction, description of methods, main achievements and future work on the topic. Hence, the book provides a cutting-edge overview of the latest research trends in Information Technology at Politecnico di Milano, presented in an easy-to-read format that will also appeal to non-specialists

Directory of Open Access Books (DOAB)

Climate Change, Security Risks, and Violent Conflicts

Author
Publication venue: 'Staats- und Universitatsbibliothek Hamburg Carl von Ossietzky'
Publication date: 02/08/2022
Field of study

Research on security-related aspects of climate change is an important element of climate change impact assessments. Hamburg has become a globally recognized center of pertinent analysis of the climate-conflict-nexus. The essays in this collection present a sample of the research conducted from 2009 to 2018 within an interdisciplinary cooperation of experts from Universität Hamburg and other institutions in Hamburg related to the research group “Climate Change and Security” (CLISEC). This collection of critical assessments covers a broad understanding of security, ranging from the question of climate change as a cause of violent conflict to conditions of human security in the Anthropocene. The in-depth analyses utilize a wide array of methodological approaches, from agent-based modeling to discourse analysis

Directory of Open Access Books (DOAB)

ULTRA ENERGY-EFFICIENT SUB-/NEAR-THRESHOLD COMPUTING: PLATFORM AND METHODOLOGY

Author: ZHAO WENFENG
Publication venue
Publication date: 03/01/2014
Field of study

Ph.DDOCTOR OF PHILOSOPH

ScholarBank@NUS