128 research outputs found

    Circuits and Systems Advances in Near Threshold Computing

    Get PDF
    Modern society is witnessing a sea change in ubiquitous computing: people have embraced computing systems as an indispensable part of day-to-day life. The computation, storage, and communication abilities of smartphones, for example, have undergone monumental changes over the past decade. At the same time, the global emphasis on creating and sustaining green environments is driving a rapid and ongoing proliferation of edge computing systems and applications. As a broad spectrum of healthcare, home, and transport applications shifts to the edge of the network, near-threshold computing (NTC) is emerging as one of the most promising low-power computing platforms. An NTC device sets its supply voltage close to its threshold voltage, dramatically reducing energy consumption. Despite showing substantial promise in terms of energy efficiency, NTC has yet to see wide-scale commercial adoption because circuits and systems operating in the near-threshold regime suffer from several problems, including increased sensitivity to process variation, reliability issues, performance degradation, and security vulnerabilities. To realize its potential, we need designs, techniques, and solutions that overcome these challenges. Readers of this book will be able to familiarize themselves with recent advances in electronic systems, with a focus on near-threshold computing.
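    The energy argument behind NTC follows from the standard CMOS switching-energy relation (a textbook model, stated here for context rather than taken from the book): dynamic energy per operation scales quadratically with the supply voltage,

        E_{\text{dyn}} = \alpha \, C_{\text{eff}} \, V_{dd}^{2}

    where \alpha is the activity factor and C_{\text{eff}} the effective switched capacitance. Lowering V_{dd} from a nominal 1.0V to a near-threshold 0.5V therefore cuts dynamic energy per operation by roughly 4x, at the cost of the delay, variation, and reliability penalties the book addresses.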

    Low Voltage Circuit Design Techniques for Cubic-Millimeter Computing.

    Full text link
    Cubic-millimeter computers complete with microprocessors, memories, sensors, radios and power sources are becoming increasingly viable. Power consumption is one of the last remaining barriers to cubic-millimeter computing and is the subject of this work. In particular, this work focuses on minimizing power consumption in digital circuits using low voltage operation. Chapter 2 includes a general discussion of low voltage circuit behavior, specifically that at subthreshold voltages. In Chapter 3, the implications of transistor scaling on subthreshold circuits are considered. It is shown that the slow scaling of gate oxide relative to the device channel length leads to a 60% reduction in Ion/Ioff between the 90nm and 32nm nodes, which results in sub-optimal static noise margins, delay, and power consumption. It is also shown that simple modifications to gate length and doping can alleviate some of these problems. Three low voltage test-chips are discussed for the remainder of this work. The first test-chip implements the Subliminal Processor (Chapter 4), a sub-200mV 8-bit microprocessor fabricated in a 0.13µm technology. Measurements first show that the Subliminal Processor consumes only 3.5pJ/instruction at Vdd=350mV. Measurements of 20 dies then reveal that proper body biasing can eliminate performance variations and reduce mean energy substantially at low voltage. Finally, measurements are used to explore the effectiveness of body biasing, voltage scaling, and various gate sizing techniques for improving speed. The second test-chip implements the Phoenix Processor (Chapter 5), a low voltage 8-bit microprocessor optimized for minimum power operation in standby mode. The Phoenix Processor was fabricated in a 0.18µm technology in an area of only 915×915µm². The aggressive standby mode strategy used in the Phoenix Processor is discussed thoroughly. Measurements at Vdd=0.5V show that the test-chip consumes 226nW in active mode and only 35.4pW in standby mode, making an on-chip battery a viable option. Finally, the third test-chip implements a low voltage image sensor (Chapter 6). A 128x128 image sensor array was fabricated in a 0.13µm technology. Test-chip measurements reveal that operation below 0.6V is possible with power consumption of only 1.9µW at 0.6V. Extensive characterization is presented with specific emphasis on noise characteristics and power consumption.
    Ph.D., Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/62233/1/hansons_1.pd
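    The claim that 35.4pW of standby power makes an on-chip battery viable can be sanity-checked with simple arithmetic. In the sketch below, the battery capacity and voltage are hypothetical thin-film figures chosen for illustration; only the standby power comes from the abstract:

        # Back-of-the-envelope check (not from the dissertation): how long could a
        # tiny on-chip battery sustain the Phoenix Processor's measured standby power?
        STANDBY_POWER_W = 35.4e-12    # measured standby power at Vdd=0.5V (from the abstract)
        BATTERY_CAPACITY_AH = 1e-6    # assumed 1 uAh thin-film micro-battery (illustrative)
        BATTERY_VOLTAGE_V = 3.6       # assumed nominal cell voltage (illustrative)

        energy_j = BATTERY_CAPACITY_AH * 3600 * BATTERY_VOLTAGE_V  # stored energy (J)
        lifetime_s = energy_j / STANDBY_POWER_W
        lifetime_years = lifetime_s / (3600 * 24 * 365)

        print(f"Stored energy: {energy_j * 1e3:.2f} mJ")           # ~12.96 mJ
        print(f"Standby lifetime: {lifetime_years:.1f} years")     # ~11.6 years with these assumptions

    Even a microscopic battery outlasts any plausible deployment at picowatt standby power, which is the point of the aggressive standby strategy.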

    Solid State Circuits Technologies

    Get PDF
    The evolution of solid-state circuit technology has a rich history within a relatively short period of time. This technology has led to the modern information society, a large market, and many types of products and applications that connect people and tools. Solid-state circuit technology continues to evolve through breakthroughs and improvements every year. This book is devoted to reviewing and presenting novel approaches to some of the main issues involved in this exciting and vigorous technology. The book is composed of 22 chapters written by authors from 30 different institutions located in 12 different countries throughout the Americas, Asia, and Europe, reflecting the wide international contribution to the book. The broad range of subjects presented in the book offers a general overview of the main issues in modern solid-state circuit technology, and it also offers in-depth analysis of specific subjects for specialists. We believe the book is of great scientific and educational value for many readers. I am profoundly indebted to all of those involved in this work. First and foremost, I would like to acknowledge and thank the authors, who worked hard and generously agreed to share their results and knowledge. Second, I would like to express my gratitude to the Intech team, who invited me to edit the book and gave me their full support; working together on this book was a fruitful experience.

    Understanding and Improving the Latency of DRAM-Based Memory Systems

    Full text link
    Over the past two decades, the storage capacity and access bandwidth of main memory have improved tremendously, by 128x and 20x, respectively. These improvements are mainly due to the continuous technology scaling of DRAM (dynamic random-access memory), which has been used as the physical substrate for main memory. In stark contrast with capacity and bandwidth, DRAM latency has remained almost constant, improving by only 1.3x in the same time frame. Therefore, long DRAM latency continues to be a critical performance bottleneck in modern systems. Increasing core counts and the emergence of ever more data-intensive and latency-critical applications further stress the importance of providing low-latency memory access. In this dissertation, we identify three main problems that contribute significantly to the long latency of DRAM accesses, and we present a series of new techniques to address them. Our new techniques significantly improve both system performance and energy efficiency. We also examine the critical relationship between supply voltage and latency in modern DRAM chips and develop new mechanisms that exploit this voltage-latency trade-off to improve energy efficiency. The key conclusion of this dissertation is that augmenting the DRAM architecture with simple and low-cost features, together with developing a better understanding of manufactured DRAM chips, leads to significant memory latency reduction as well as energy efficiency improvement. We hope and believe that the proposed architectural techniques and the detailed experimental data and observations on real commodity DRAM chips presented in this dissertation will enable the development of other new mechanisms to improve the performance, energy efficiency, or reliability of future memory systems.
    Comment: PhD Dissertation
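    The 128x/20x/1.3x figures can be put on a common footing by converting each to a compound annual improvement rate, as in this small arithmetic sketch (the 20-year window is an assumption based on "two decades"):

        # Convert each total improvement factor into a geometric-mean yearly gain.
        YEARS = 20
        improvements = {"capacity": 128.0, "bandwidth": 20.0, "latency": 1.3}

        for metric, factor in improvements.items():
            annual_rate = factor ** (1.0 / YEARS) - 1.0  # compound annual improvement
            print(f"{metric:>9}: {factor:6.1f}x total -> {annual_rate * 100:4.1f}% per year")

        # capacity:  128.0x total -> 27.5% per year
        # bandwidth:  20.0x total -> 16.2% per year
        # latency:     1.3x total ->  1.3% per year

    Seen this way, latency has improved at roughly one percent per year while capacity improved at nearly thirty, which is why the dissertation treats latency as the bottleneck.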

    Técnicas de compresión de imágenes hiperespectrales sobre hardware reconfigurable

    Get PDF
    Thesis of the Universidad Complutense de Madrid, Facultad de Informática, defended on 18-12-2020. Sensors are nowadays present in all aspects of human life. When possible, sensors are used remotely: this is less intrusive, avoids interferences in the measuring process, and is more convenient for the scientist. One of the most recurrent concerns of recent decades has been the sustainability of the planet and how the changes it is facing can be monitored. Remote sensing of the Earth has seen an explosion in activity, with satellites now being launched on a weekly basis to perform remote analysis of the Earth, and planes surveying vast areas for closer analysis...

    Energy efficient core designs for upcoming process technologies

    Get PDF
    Energy efficiency has been a first-order constraint in the design of microprocessors for the last decade. As Moore's law sunsets, new technologies are being actively explored to extend the march of increasing computational power and efficiency. It is essential for computer architects to understand the opportunities and challenges in utilizing upcoming process technology trends in order to design the most efficient processors. In this work, we consider three process technology trends, each expected to become viable over a different timeline, and propose core designs best suited to each technology. We first consider the most popular method currently available for improving energy efficiency: lowering the operating voltage. We make key observations regarding the factors that limit scaling down the operating voltage of general-purpose high-performance processors. We then propose a novel core design, ScalCore, that can work in a high-performance mode at nominal Vdd and in a very energy-efficient mode at low Vdd. The resulting core can operate at much lower voltages, providing higher parallel performance while consuming less energy. While lowering Vdd improves energy efficiency, CMOS devices are fundamentally limited in their low-voltage operation. Therefore, we next consider an upcoming device technology, Tunneling Field-Effect Transistors (TFETs), which is expected to supplement CMOS device technology in the near future. TFETs can attain much higher energy efficiency than CMOS at low voltages, but their performance saturates at high voltages, so they cannot entirely replace CMOS when high performance is needed. Ideally, we desire a core that is as energy-efficient as TFET and provides as much performance as CMOS. To reach this goal, we characterize TFET device behavior for core design and judiciously integrate TFET and CMOS units in a single core. The resulting core, called HetCore, can provide very high energy efficiency while limiting the slowdown compared to a CMOS core. Finally, we analyze Monolithic 3D (M3D) integration technology, which is widely considered to be the only way to integrate more transistors on a chip. We present the first analysis of the architectural implications of using M3D for core design and show how to partition the core across different layers. We also address one of the key challenges in realizing the technology, namely the performance degradation of the top layer. We propose critical-path-based partitioning for logic stages and asymmetric bit/port partitioning for storage stages. The result is a core that performs nearly as well as a core without any top-layer slowdown. Compared to a 2D baseline design, an M3D core not only provides much higher performance but also reduces energy consumption at the same time. In summary, this thesis addresses one of the fundamental challenges in computer architecture: overcoming the fact that CMOS is no longer scaling. As we increase the computing power on a single chip, our ability to power the entire chip keeps decreasing. This thesis proposes three solutions aimed at solving this problem over different timelines. Across all our solutions, we improve energy efficiency without compromising the performance of the core. As a result, we are able to operate twice as many cores within the same power budget as regular cores, significantly alleviating the problem of dark silicon.
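    The low-voltage limit alluded to above can be illustrated with a toy energy model (all constants are illustrative, not the thesis's data): dynamic energy falls quadratically with Vdd, but delay grows steeply near the threshold voltage, so leakage energy per operation rises and total energy bottoms out at a minimum-energy voltage well above zero.

        import numpy as np

        # Toy model of why voltage scaling stops paying off: dynamic energy per op
        # falls as Vdd^2, but gate delay grows steeply near the threshold voltage,
        # so leakage energy per op (leakage power x longer cycle time) rises.
        VTH = 0.3       # assumed threshold voltage (V)
        C_EFF = 1e-12   # assumed switched capacitance per operation (F)
        I_LEAK = 1e-4   # assumed leakage current scale (A)

        def energy_per_op(vdd):
            e_dyn = C_EFF * vdd ** 2                  # dynamic energy: quadratic in Vdd
            delay = 1e-9 * vdd / (vdd - VTH) ** 1.5   # alpha-power-law-style delay growth
            e_leak = I_LEAK * vdd * delay             # leakage power x cycle time
            return e_dyn + e_leak

        vdds = np.linspace(0.45, 1.1, 200)
        energies = [energy_per_op(v) for v in vdds]
        v_opt = vdds[int(np.argmin(energies))]
        print(f"Minimum-energy point in this toy model: Vdd ~ {v_opt:.2f} V")  # ~0.50 V here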

    Power Flow and Optimal Power Flow via Physics-Informed Typed Graph Neural Networks

    Get PDF
    This thesis explores physics-informed typed graph neural networks applied to the modeling of electric power transmission networks, specifically to the power flow and optimal power flow problems. Power transmission networks are complex interconnected systems that are crucial for guaranteeing a stable supply of electricity. To achieve the transition toward affordable, reliable, and sustainable energy systems, the electrification of various economic sectors and the integration of renewable energy sources into the transmission grid have increased significantly. Most renewable generation technologies add fluctuation and uncertainty to electricity generation, so, to guarantee an efficient and stable electricity supply at all times, transmission system operators must run frequent power flow and optimal power flow simulations to assess the state of the grid. For these reasons it is necessary to investigate new, flexible, and more efficient techniques for performing these analyses. Recent advances in machine learning, and in particular in artificial neural networks, indicate that these methods have the potential to solve power network analysis problems quickly and reliably. To date, few works have attempted to exploit the learning capabilities of artificial neural networks to address these problems, and most published works do not resolve two major challenges: first, the need for large amounts of training data and, second, the lack of generalization capability needed to analyze realistic transmission networks with variable topology. In this thesis, these drawbacks are overcome by introducing typed graph neural networks, which are specialized for processing graph-structured data with distinct types of elements. The transmission network can be represented directly as a graph, and assigning distinct node types to represent the different elements of the transmission network increases the accuracy and interpretability of the proposed model. The result is an adaptable transmission network model that can be applied to diverse problems, such as the power flow and optimal power flow applications presented in this thesis. The presented learning scheme is physics-informed: training is not supervised but instead incorporates information from the physical laws of the underlying system into the cost function. Moreover, the resulting model can be tested on power networks with different configurations and, in the case of power flow, on networks of different sizes. It is shown that the proposed method, for the applications considered, achieves results similar to those obtained with a conventional method but up to four orders of magnitude faster, without the need for training data and with the ability to generalize to different transmission networks. It can therefore be concluded that the work presented in this thesis offers a neural-network-based method to speed up the solution of the complex system of nonlinear equations present in the power flow problem, as well as the constrained optimization problem present in the optimal power flow problem.
    These results provide a valuable step toward the development of a general system to help transmission system operators optimize the integration of new technologies into the conventional grid and improve the reliability and sustainability of electric power systems.
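    To make the physics-informed idea concrete, here is a minimal sketch of what such an unsupervised cost function can look like: the network's predicted bus voltages are penalized by their violation of the AC power balance equations. The function names, tensor shapes, and framework choice (PyTorch) are assumptions for illustration, not the thesis's actual code.

        import torch

        # Physics-informed loss for power flow: instead of supervised targets, the
        # predicted bus voltages are scored by how much they violate the AC power
        # balance equations. G and B are the real/imaginary parts of the bus
        # admittance matrix; p_inj and q_inj are the specified power injections.
        def power_flow_loss(v_mag, theta, G, B, p_inj, q_inj):
            """v_mag, theta: predicted voltage magnitude/angle per bus, shape (n,)."""
            dtheta = theta.unsqueeze(1) - theta.unsqueeze(0)  # theta_i - theta_j, (n, n)
            vv = v_mag.unsqueeze(1) * v_mag.unsqueeze(0)      # V_i * V_j, (n, n)
            # Standard polar AC power flow equations: calculated injections per bus
            p_calc = (vv * (G * torch.cos(dtheta) + B * torch.sin(dtheta))).sum(dim=1)
            q_calc = (vv * (G * torch.sin(dtheta) - B * torch.cos(dtheta))).sum(dim=1)
            # Mean squared power mismatch; zero iff Kirchhoff's laws are satisfied
            return ((p_calc - p_inj) ** 2 + (q_calc - q_inj) ** 2).mean()

    Minimizing this mismatch drives the predictions toward a valid power flow solution without any labeled data, which is how the approach avoids the need for large training sets.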

    High performance continuous-time filters for information transfer systems

    Get PDF
    Much attention has been paid to active continuous-time filters over the years. As cheap, readily available integrated-circuit OpAmps replaced their discrete-circuit predecessors, it became feasible to consider active-RC filter circuits using large numbers of OpAmps. Similarly, the development of the integrated operational transconductance amplifier (OTA) led to new filter configurations. This gave rise to OTA-C filters, which use only active devices and capacitors, making them more suitable for integration. The demands on filter circuits have become ever more stringent as the world of electronics and communications has advanced. In addition, the continuing increase in the operating frequencies of modern circuits and systems increases the need for active filters that can perform at these higher frequencies, an area where the active LC filter emerges. The performance of an analog circuit is mainly limited by the non-idealities of its building blocks and by the circuit architecture. This research concentrates on the design issues of high-frequency continuous-time integrated filters. Several novel circuit building blocks are introduced. A novel pseudo-differential, fully balanced, fully symmetric CMOS OTA architecture with inherent common-mode detection is proposed. Through judicious arrangement, the common-mode feedback circuit can be economically implemented. At the system-architecture level, a novel low-voltage 4th-order RF bandpass filter structure based on the emulation of two magnetically coupled resonators is presented. A unique feature of the proposed architecture is the use of electric coupling to emulate the effect of the coupled inductors, thus providing bandwidth tuning with small passband ripple. As part of a direct-conversion dual-mode 802.11b/Bluetooth receiver, a BiCMOS 5th-order low-pass channel selection filter is designed. The filter operates from a single 2.5V supply and achieves 76dB of out-of-band SFDR. A digital automatic tuning system is also implemented to account for process and temperature variations. As part of a Bluetooth transmitter, a low-power quadrature direct digital frequency synthesizer (DDFS) is presented. Piecewise linear approximation is used to avoid the ROM look-up table that stores the sine values in a conventional DDFS. The significant saving in power consumption due to the elimination of the ROM makes the design more suitable for portable wireless communication applications.
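    The ROM-less DDFS idea lends itself to a short illustration. The sketch below (segment count and fitting method are assumptions; the dissertation's actual approximation may differ) replaces a full sine look-up table with a small table of per-segment slopes and intercepts over one quadrant, exploiting quarter-wave symmetry:

        import math

        SEGMENTS = 8  # assumed segment count; the real design's choice may differ

        # Precompute slope/intercept per linear segment over the first quadrant [0, pi/2)
        _step = (math.pi / 2) / SEGMENTS
        _table = []
        for k in range(SEGMENTS):
            x0, x1 = k * _step, (k + 1) * _step
            slope = (math.sin(x1) - math.sin(x0)) / _step
            _table.append((slope, math.sin(x0) - slope * x0))

        def pwl_sin(phase):
            """Piecewise-linear sine for phase in [0, 2*pi), using quadrant symmetry."""
            quadrant, x = divmod(phase, math.pi / 2)
            quadrant = int(quadrant) % 4
            if quadrant in (1, 3):          # mirror: sin(pi - x) = sin(x)
                x = math.pi / 2 - x
            slope, intercept = _table[min(int(x / _step), SEGMENTS - 1)]
            value = slope * x + intercept
            return value if quadrant < 2 else -value  # sign flip in lower quadrants

        max_err = max(abs(pwl_sin(2 * math.pi * i / 4096) - math.sin(2 * math.pi * i / 4096))
                      for i in range(4096))
        print(f"max |error| with {SEGMENTS} segments: {max_err:.4f}")  # ~0.005

    A hardware phase-to-amplitude converter built this way stores only SEGMENTS slope/intercept pairs instead of thousands of ROM samples, which is where the power saving comes from.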

    Design of Variation-Tolerant Circuits for Nanometer CMOS Technology: Circuits and Architecture Co-Design

    Get PDF
    Aggressive scaling of CMOS technology in sub-90nm nodes has created huge challenges. Variations due to fundamental physical limits, such as random dopant fluctuation (RDF) and line edge roughness (LER), are increasing significantly with technology scaling. In addition, manufacturing tolerances in process technology are not scaling at the same pace as the transistor channel length, due to process control limitations (e.g., sub-wavelength lithography). Therefore, within-die process variations worsen with successive technology generations. These variations have a strong impact on the maximum clock frequency and leakage power of any digital circuit, and can also result in functional yield losses in variation-sensitive digital circuits (such as SRAM). Moreover, in nanometer technologies, digital circuits show an increased sensitivity to process variations due to low-voltage operation requirements, which are aggravated by the strong demand for lower power consumption and cost alongside higher performance and density. It is therefore not surprising that the International Technology Roadmap for Semiconductors (ITRS) lists variability as one of the most challenging obstacles for IC design in the nanometer regime. To facilitate variation-tolerant design, we study the impact of random variations on the delay variability of a logic gate and derive simple and scalable statistical models to evaluate delay variations in the presence of within-die variations. This work provides new design insight and highlights the importance of accounting for the effect of input slew on delay variations, especially at lower supply voltages. The derived models are simple, scalable, bias dependent, and require only the knowledge of easily measurable parameters. This makes them useful in early design exploration, circuit/architecture optimization, and technology prediction (especially for low-power and low-voltage operation). The derived models are verified against Monte Carlo SPICE simulations in an industrial 90nm technology. Random variations are considered one of the largest design challenges in nanometer technologies. This is especially true for SRAM, due to the large variations in bitcell characteristics. SRAM bitcells typically have the smallest device sizes on a chip and therefore show the largest sensitivity to different sources of variations. With the drastic increase in memory densities, lower supply voltages, and higher variations, statistical simulation methodologies become imperative for estimating memory yield and optimizing performance and power. In this research, we present a methodology for the statistical simulation of SRAM read access yield, which is tightly related to SRAM performance and power consumption. The proposed flow accounts for the impact of bitcell read current variation, sense amplifier offset distribution, timing window variation, and leakage variation on functional yield. The methodology overcomes the pessimism of the conventional worst-case design techniques used in SRAM design. The proposed statistical yield estimation methodology allows early yield prediction in the design cycle, which can be used to trade off performance and power requirements for SRAM. The methodology is verified using measured silicon yield data from a 1Mb memory fabricated in an industrial 45nm technology. Embedded SRAM dominates modern SoCs, and there is a strong demand for SRAM with lower power consumption while achieving high performance and high density.
    However, in the presence of large process variations, SRAMs are expected to consume more power to ensure correct read operation and meet yield targets. We propose a new architecture that significantly reduces the array switching power of SRAM. The proposed architecture combines built-in self-test (BIST) and digitally controlled delay elements to reduce the wordline pulse width while ensuring correct read operation, hence reducing switching power. A new statistical simulation flow was developed to evaluate the power savings of the proposed architecture. Monte Carlo simulations using a 1Mb SRAM macro from an industrial 45nm technology were used to examine the power reduction achieved by the system. The proposed architecture reduces array switching power significantly and shows large power savings, especially as the chip-level memory density increases. For a 48Mb memory density, a 27% reduction in array switching power can be achieved for a read access yield target of 95%. In addition, the proposed system provides larger power savings as process variations increase, which makes it a very attractive solution for 45nm and below technologies. In addition to its impact on bitcell read current, the increase of local variations in nanometer technologies strongly affects SRAM cell stability. In this research, we propose a novel single-supply-voltage read assist technique to improve the SRAM static noise margin (SNM). The proposed technique precharges different parts of the bitlines to VDD and GND and uses charge sharing to precisely control the bitline voltage, which improves bitcell stability. In addition to improving SNM, the proposed technique also reduces memory access time. Moreover, it requires only one supply voltage and hence eliminates the need for large-area voltage shifters. The proposed technique has been implemented in the design of a 512kb memory fabricated in 45nm technology. Results show improvements in SNM and the read operation window, which confirms the effectiveness and robustness of this technique.
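    A toy version of such a read-access-yield flow is sketched below; all distributions and constants are illustrative assumptions, not the thesis's calibrated models. A read fails when the bitline differential developed by the cell's read current within the sensing window does not overcome the sense-amplifier offset:

        import numpy as np

        # Monte Carlo in the spirit of the read-access-yield methodology above.
        rng = np.random.default_rng(0)
        N_CELLS = 1_000_000

        i_read = rng.normal(20e-6, 4e-6, N_CELLS)       # bitcell read current (A), assumed
        sa_offset = rng.normal(0.0, 15e-3, N_CELLS)     # sense-amp input offset (V), assumed
        t_sense = rng.normal(200e-12, 10e-12, N_CELLS)  # sensing window (s), assumed
        C_BL = 100e-15                                  # assumed bitline capacitance (F)

        v_bl = i_read * t_sense / C_BL                  # bitline differential developed
        fails = v_bl < np.abs(sa_offset)                # offset wins -> read failure
        read_yield = 1.0 - fails.mean()
        print(f"Simulated read access yield: {read_yield:.4%}")

    In a flow like this, shortening the sensing window (i.e., the wordline pulse) cuts switching power but erodes yield, which is exactly the trade-off the proposed BIST-based architecture navigates.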

    Characterization and mitigation of process variation in digital circuits and systems

    Get PDF
    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 155-166).
    Process variation threatens to negate a whole generation of scaling in advanced process technologies due to performance and power spreads of greater than 30-50%. Mitigating this impact requires a thorough understanding of the variation sources, magnitudes, and spatial components at the device, circuit, and architectural levels. This thesis explores the impact of variation at each of these levels and evaluates techniques to alleviate it in the context of digital circuits and systems. At the device level, we propose isolating and measuring variation in the intrinsic threshold voltage of a MOSFET using subthreshold leakage currents. Analysis of the measured data, from a test-chip implemented in a 0.18µm CMOS process, indicates that variation in MOSFET threshold voltage is a truly random process dependent only on device dimensions. Further decomposition of the observed variation reveals neither systematic within-die variation components nor any spatial correlation. A second test-chip, capable of characterizing spatial variation in digital circuits, was developed and implemented in a 90nm triple-well CMOS process. Measured variation results show that the within-die component of variation is small at high voltages but becomes an increasing fraction of the total variation as the power-supply voltage decreases. Once again, the data shows no evidence of within-die spatial correlation and only weak systematic components. An evaluation of adaptive body-biasing and voltage scaling as variation mitigation techniques shows that voltage scaling is more effective at modifying performance, with less impact on idle power, than body-biasing. Finally, the addition of power-supply voltages in a massively parallel multicore processor is explored as a way to reduce the energy required to cope with process variation. An analytic optimization framework is developed and analyzed; using a custom simulation methodology, the total energy of a hypothetical 1K-core processor based on the RAW core is reduced by 6-16% with the addition of only a single voltage. Analysis of yield versus required energy demonstrates that a combination of disabling poor-performing cores and additional power-supply voltages results in an optimal trade-off between performance and energy.
    by Nigel Anthony Drego. Ph.D.
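    The finding that threshold-voltage variation depends only on device dimensions is consistent with the classic Pelgrom mismatch model (a standard result, cited here for context rather than taken from the thesis), in which the standard deviation of the threshold-voltage mismatch between matched devices shrinks with the square root of gate area:

        \sigma_{\Delta V_T} = \frac{A_{V_T}}{\sqrt{W L}}

    Here A_{V_T} is a technology-dependent matching coefficient and W and L are the device width and length. This relation is also why the SRAM bitcells discussed in the preceding abstract, having the smallest W·L on chip, show the largest variation.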