10 research outputs found

    An embedded energy monitoring circuit for a 128kbit SRAM with body-biased sense-amplifiers

    Get PDF
    Embedded energy monitoring of critical system components can be used to enable better power management by capturing run time system conditions such as temperature and application load. In this work, an energy sensing circuit that provides digitally represented absolute energy per operation of a 128kbit SRAM is presented. Designed in a 65nm low-power CMOS process, SRAMs can operate down to 370 mV. Energy sensing circuit consumes 16.7μW during sensing at 1.2V (only 0.28% of SRAM active power at the same voltage). For improved performance, SRAMs utilize body-biased PMOS input strong-arm type sense amplifiers that can achieve 45% tighter input offset distribution for only ~3.5% of total SRAM area overhead.United States. Defense Advanced Research Projects Agency. The Ubiquitous High Performance Computing Progra

    Self-aware Computing in the Angstrom Processor

    Get PDF
    Addressing the challenges of extreme scale computing requires holistic design of new programming models and systems that support those models. This paper discusses the Angstrom processor, which is designed to support a new Self-aware Computing (SEEC) model. In SEEC, applications explicitly state goals, while other systems components provide actions that the SEEC runtime system can use to meet those goals. Angstrom supports this model by exposing sensors and adaptations that traditionally would be managed independently by hardware. This exposure allows SEEC to coordinate hardware actions with actions specified by other parts of the system, and allows the SEEC runtime system to meet application goals while reducing costs (e.g., power consumption).United States. Defense Advanced Research Projects Agency. The Ubiquitous High Performance Computing Progra

    Mr.Wolf: An Energy-Precision Scalable Parallel Ultra Low Power SoC for IoT Edge Processing

    Get PDF
    This paper presents Mr. Wolf, a parallel ultra-low power (PULP) system on chip (SoC) featuring a hierarchical architecture with a small (12 kgates) microcontroller (MCU) class RISC-V core augmented with an autonomous IO subsystem for efficient data transfer from a wide set of peripherals. The small core can offload compute-intensive kernels to an eight-core floating-point capable of processing engine available on demand. The proposed SoC, implemented in a 40-nm LP CMOS technology, features a 108-mu W fully retentive memory (512 kB). The IO subsystem is capable of transferring up to 1.6 Gbit/s from external devices to the memory in less than 2.5 mW. The eight-core compute cluster achieves a peak performance of 850 million of 32-bit integer multiply and accumulate per second (MMAC/s) and 500 million of 32-bit floating-point multiply and accumulate per second (MFMAC/s) -1 GFlop/s-with an energy efficiency up to 15 MMAC/s/mW and 9 MFMAC/s/mW. These building blocks are supported by aggressive on-chip power conversion and management, enabling energy-proportional heterogeneous computing for always-on IoT end nodes improving performance by several orders of magnitude with respect to traditional single-core MCUs within a power envelope of 153 mW. We demonstrated the capabilities of the proposed SoC on a wide set of near-sensor processing kernels showing that Mr. Wolf can deliver performance up to 16.4 GOp/s with energy efficiency up to 274 MOp/s/mW on real-life applications, paving the way for always-on data analytics on high-bandwidth sensors at the edge of the Internet of Things

    ULTRALOW-POWER, LOW-VOLTAGE DIGITAL CIRCUITS FOR BIOMEDICAL SENSOR NODES

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Reconfigurable Antenna Systems: Platform implementation and low-power matters

    Get PDF
    Antennas are a necessary and often critical component of all wireless systems, of which they share the ever-increasing complexity and the challenges of present and emerging trends. 5G, massive low-orbit satellite architectures (e.g. OneWeb), industry 4.0, Internet of Things (IoT), satcom on-the-move, Advanced Driver Assistance Systems (ADAS) and Autonomous Vehicles, all call for highly flexible systems, and antenna reconfigurability is an enabling part of these advances. The terminal segment is particularly crucial in this sense, encompassing both very compact antennas or low-profile antennas, all with various adaptability/reconfigurability requirements. This thesis work has dealt with hardware implementation issues of Radio Frequency (RF) antenna reconfigurability, and in particular with low-power General Purpose Platforms (GPP); the work has encompassed Software Defined Radio (SDR) implementation, as well as embedded low-power platforms (in particular on STM32 Nucleo family of micro-controller). The hardware-software platform work has been complemented with design and fabrication of reconfigurable antennas in standard technology, and the resulting systems tested. The selected antenna technology was antenna array with continuously steerable beam, controlled by voltage-driven phase shifting circuits. Applications included notably Wireless Sensor Network (WSN) deployed in the Italian scientific mission in Antarctica, in a traffic-monitoring case study (EU H2020 project), and into an innovative Global Navigation Satellite Systems (GNSS) antenna concept (patent application submitted). The SDR implementation focused on a low-cost and low-power Software-defined radio open-source platform with IEEE 802.11 a/g/p wireless communication capability. In a second embodiment, the flexibility of the SDR paradigm has been traded off to avoid the power consumption associated to the relevant operating system. Application field of reconfigurable antenna is, however, not limited to a better management of the energy consumption. The analysis has also been extended to satellites positioning application. A novel beamforming method has presented demonstrating improvements in the quality of signals received from satellites. Regarding those who deal with positioning algorithms, this advancement help improving precision on the estimated position

    Framework for self-aware management of goals and constraints in computing systems

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013.Cataloged from PDF version of thesis.Includes bibliographical references (p. 165-172).Modern computing systems require applications to balance competing goals, e.g.,high performance and low power or high performance and high precision. Achieving the right balance for a particular application and system places an unrealistic burden on application programmers who must understand the power, performance, and precision implications of a variety of application and system configurations (e.g.,changing algorithms or allocating cores). To address this problem, we propose the Self-aware Computing framework, or SEEC. SEEC automatically and dynamically configures systems and applications to meet goals accurately and efficiently. While other self-aware implementations have been proposed, SEEC is uniquely distinguished by its decoupled approach, which allows application and systems programmers to separately specify goals and configurations, each according to their expertise. SEEC's runtime decision engine observes and configures the system automatically, reducing programmer burden. This general and extensible decision engine employs both control theory and machine learning to reason about previously unseen applications and system configurations while automatically adapting to changes in both application and system behavior. This thesis describes the SEEC framework and evaluates it in several case studies. SEEC is evaluated by implementing its interfaces and runtime system on multiple, modern Linux x86 servers. Applications are then instrumented to emit goals and progress, while system services are instrumented to describe available adaptations. The SEEC runtime decision engine is then evaluated for its ability to meet goals accurately and efficiently. For example, SEEC is shown to meet performance goals with less than 3% average error while bringing average power consumption within 92% of optimal. SEEC is also shown to meet power goals with less than 2% average error while achieving over 96% of optimal performance on average. Additional studies show SEEC reacting to maintain performance in response to unexpected events including fluctuations in application workload and reduction in available resources. These studies demonstrate that SEEC can have a positive impact on real systems by understanding high level goals and adapting to meet those goals online.by Henry Hoffmann.Ph.D

    Low-power and application-specific SRAM design for energy-efficient motion estimation

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.Cataloged from PDF version of thesis.Includes bibliographical references (p. 181-189).Video content is expected to account for 70% of total mobile data traffic in 2015. High efficiency video coding, in this context, is crucial for lowering the transmission and storage costs for portable electronics. However, modern video coding standards impose a large hardware complexity. Hence, energy-efficiency of these hardware blocks is becoming more critical than ever before for mobile devices. SRAMs are critical components in almost all SoCs affecting the overall energy-efficiency. This thesis focuses on algorithm and architecture development as well as low-power and application-specific SRAM design targeting motion estimation. First, a motion estimation design is considered for the next generation video standard, HEVC. Hardware cost and coding efficiency trade-offs are quantified and an optimum design choice between hardware complexity and coding efficiency is proposed. Hardware-efficient search algorithm, shared search range across CU engines and pixel pre-fetching algorithms provide 4.3x area, 56x on-chip bandwidth and 151 x off-chip bandwidth reduction. Second, a highly-parallel motion estimation design targeting ultra-low voltage operation and supporting AVC/H.264 and VC-1 standards are considered. Hardware reconfigurability along with frame and macro-block parallel processing are implemented for this engine to maximize hardware sharing between multiple standards and to meet throughput constraints. Third, in the context of low-power SRAMs, a 6T and an 8T SRAM are designed in 28nm and 45nm CMOS technologies targeting low voltage operation. The 6T design achieves operation down to 0.6V and the 8T design achieves operation down to 0.5V providing ~ 2.8x and ~ 4.8x reduction in energy/access respectively. Finally, an application-specific SRAM design targeted for motion estimation is developed. Utilizing the correlation of pixel data to reduce bit-line switching activity, this SRAM achieves up to 1.9x energy savings compared to a similar conventional 8T design. These savings demonstrate that application-specific SRAM design can introduce a new dimension and can be combined with voltage scaling to maximize energy-efficiency.by Mahmut Ersin Sinangil.Ph.D

    Ultra-low-power circuits and systems for wearable and implantable medical devices

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013.Cataloged from PDF version of thesis.Includes bibliographical references (pages 219-231).Advances in circuits, sensors, and energy storage elements have opened up many new possibilities in the health industry. In the area of wearable devices, the miniaturization of electronics has spurred the rapid development of wearable vital signs, activity, and fitness monitors. Maximizing the time between battery recharge places stringent requirements on power consumption by the device. For implantable devices, the situation is exacerbated by the fact that energy storage capacity is limited by volume constraints, and frequent battery replacement via surgery is undesirable. In this case, the design of energy-efficient circuits and systems becomes even more crucial. This thesis explores the design of energy-efficient circuits and systems for two medical applications. The first half of the thesis focuses on the design and implementation of an ultra-low-power, mixed-signal front-end for a wearable ECG monitor in a 0.18pm CMOS process. A mixed-signal architecture together with analog circuit optimizations enable ultra-low-voltage operation at 0.6V which provides power savings through voltage scaling, and ensures compatibility with state-of-the-art DSPs. The fully-integrated front-end consumes just 2.9[mu]W, which is two orders of magnitude lower than commercially available parts. The second half of this thesis focuses on ultra-low-power system design and energy-efficient neural stimulation for a proof-of-concept fully-implantable cochlear implant. First, implantable acoustic sensing is demonstrated by sensing the motion of a human cadaveric middle ear with a piezoelectric sensor. Second, alternate energy-efficient electrical stimulation waveforms are investigated to reduce neural stimulation power when compared to the conventional rectangular waveform. The energy-optimal waveform is analyzed using a computational nerve fiber model, and validated with in-vivo ECAP recordings in the auditory nerve of two cats and with psychophysical tests in two human cochlear implant users. Preliminary human subject testing shows that charge and energy savings of 20-30% and 15-35% respectively are possible with alternative waveforms. A system-on-chip comprising the sensor interface, reconfigurable sound processor, and arbitrary-waveform neural stimulator is implemented in a 0.18[mu]m high-voltage CMOS process to demonstrate the feasibility of this system. The sensor interface and sound processor consume just 12[mu]W of power, representing just 2% of the overall system power which is dominated by stimulation. As a result, the energy savings from using alternative stimulation waveforms transfer directly to the system.by Marcus Yip.Ph.D

    Microarchitectural Low-Power Design Techniques for Embedded Microprocessors

    Get PDF
    With the omnipresence of embedded processing in all forms of electronics today, there is a strong trend towards wireless, battery-powered, portable embedded systems which have to operate under stringent energy constraints. Consequently, low power consumption and high energy efficiency have emerged as the two key criteria for embedded microprocessor design. In this thesis we present a range of microarchitectural low-power design techniques which enable the increase of performance for embedded microprocessors and/or the reduction of energy consumption, e.g., through voltage scaling. In the context of cryptographic applications, we explore the effectiveness of instruction set extensions (ISEs) for a range of different cryptographic hash functions (SHA-3 candidates) on a 16-bit microcontroller architecture (PIC24). Specifically, we demonstrate the effectiveness of light-weight ISEs based on lookup table integration and microcoded instructions using finite state machines for operand and address generation. On-node processing in autonomous wireless sensor node devices requires deeply embedded cores with extremely low power consumption. To address this need, we present TamaRISC, a custom-designed ISA with a corresponding ultra-low-power microarchitecture implementation. The TamaRISC architecture is employed in conjunction with an ISE and standard cell memories to design a sub-threshold capable processor system targeted at compressed sensing applications. We furthermore employ TamaRISC in a hybrid SIMD/MIMD multi-core architecture targeted at moderate to high processing requirements (> 1 MOPS). A range of different microarchitectural techniques for efficient memory organization are presented. Specifically, we introduce a configurable data memory mapping technique for private and shared access, as well as instruction broadcast together with synchronized code execution based on checkpointing. We then study an inherent suboptimality due to the worst-case design principle in synchronous circuits, and introduce the concept of dynamic timing margins. We show that dynamic timing margins exist in microprocessor circuits, and that these margins are to a large extent state-dependent and that they are correlated to the sequences of instruction types which are executed within the processor pipeline. To perform this analysis we propose a circuit/processor characterization flow and tool called dynamic timing analysis. Moreover, this flow is employed in order to devise a high-level instruction set simulation environment for impact-evaluation of timing errors on application performance. The presented approach improves the state of the art significantly in terms of simulation accuracy through the use of statistical fault injection. The dynamic timing margins in microprocessors are then systematically exploited for throughput improvements or energy reductions via our proposed instruction-based dynamic clock adjustment (DCA) technique. To this end, we introduce a 6-stage 32-bit microprocessor with cycle-by-cycle DCA. Besides a comprehensive design flow and simulation environment for evaluation of the DCA approach, we additionally present a silicon prototype of a DCA-enabled OpenRISC microarchitecture fabricated in 28 nm FD-SOI CMOS. The test chip includes a suitable clock generation unit which allows for cycle-by-cycle DCA over a wide range with fine granularity at frequencies exceeding 1 GHz. Measurement results of speedups and power reductions are provided
    corecore