35 research outputs found
Emerging embedded nonvolatile memory solution for ultra low power microcontroller systems
13301 Kō No. 4810, Doctor of Engineering, Kanazawa University. Doctoral dissertation, full text. Published or to be published in: 1. IEEE Journal of Solid-State Circuits 27(4), pp. 569-573, 1992. IEEE. Co-authors: M. Hayashikoshi, H. Hidaka, K. Arimoto, K. Fujishima. 2. IEEE Transactions on Multi-Scale Computing Systems. IEEE. Co-authors: M. Hayashikoshi, H. Noda, H. Kawai, Y. Murai, S. Otani, K. Nii, Y. Matsuda, H. Kond
Low power processor architecture and multicore approach for embedded systems
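13301 Kō No. 4319, Doctor of Engineering, Kanazawa University. Doctoral dissertation, full text. Published in: 1. IEICE Transactions Vol. E98-C(7), pp. 544-549, 2015. IEICE. Co-authors: S. Otani, H. Kondo. / 2. Reuse permission evidence sent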
Miniature high dynamic range time-resolved CMOS SPAD image sensors
Since their integration in complementary metal-oxide-semiconductor (CMOS) technology in 2003,
single photon avalanche diodes (SPADs) have inspired a new era of low-cost, highly integrated
quantum-level image sensors. Their unique feature of discerning single photon detections, their ability
to retain temporal information on every collected photon and their amenability to high speed image
sensor architectures make them prime candidates for low light and time-resolved applications.
From the biomedical field of fluorescence lifetime imaging microscopy (FLIM) to extreme physical
phenomena such as quantum entanglement, all the way to time of flight (ToF) consumer applications
such as gesture recognition and more recently automotive light detection and ranging (LIDAR), huge
steps in detector and sensor architectures have been made to address the design challenges of pixel
sensitivity and functionality trade-off, scalability and handling of large data rates.
The goal of this research is to explore the hypothesis that, given state-of-the-art CMOS nodes and
fabrication technologies, it is possible to design miniature SPAD image sensors for time-resolved
applications with a small pixel pitch while maintaining both sensitivity and built-in functionality.
Three key approaches are pursued to that end: leveraging the innate area reduction of logic gates
and finer design rules of advanced CMOS nodes to balance the pixel’s fill factor and processing
capability; smarter pixel designs with configurable functionality; and novel system architectures that
lift the processing burden off the pixel array and mediate data flow.
Two pathfinder SPAD image sensors were designed and fabricated: a 96 × 40 planar front side
illuminated (FSI) sensor with 66% fill factor at 8.25μm pixel pitch in an industrialised 40nm process
and a 128 × 120 3D-stacked backside illuminated (BSI) sensor with 45% fill factor at 7.83μm pixel
pitch. Both designs rely on a digital, configurable, 12-bit ripple counter pixel allowing for time-gated
shot noise limited photon counting. The FSI sensor was operated as a quanta image sensor (QIS)
achieving an extended dynamic range in excess of 100dB, utilising triple exposure windows and in-pixel
data compression which reduces data rates by a factor of 3.75. The stacked sensor is the first
demonstration of a wafer scale SPAD imaging array with a 1-to-1 hybrid bond connection.
Characterisation results of the detector and sensor performance are presented.
Two other time-resolved 3D-stacked BSI SPAD image sensor architectures are proposed. The first is a
fully integrated 5-wire interface system on chip (SoC), with built-in power management and off-focal
plane data processing and storage for high dynamic range as well as autonomous video rate operation.
Preliminary images and bring-up results of the fabricated 2mm² sensor are shown. The second is a
highly configurable design capable of simultaneous multi-bit oversampled imaging and programmable
region of interest (ROI) time correlated single photon counting (TCSPC) with on-chip histogram
generation. The 6.48μm pitch array has been submitted for fabrication. In-depth design details of both
architectures are discussed.
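The relationship between the 12-bit counter depth and the quoted dynamic range can be illustrated with a short calculation. This is only a rough sketch: it assumes the common 20·log10 convention for image sensor dynamic range and a noise floor of one count, and the exposure ratio of 64 is a hypothetical value chosen for illustration, not a figure from the thesis.

```python
import math

def dynamic_range_db(max_signal: float, min_signal: float) -> float:
    """Dynamic range in dB, using the 20*log10 convention (an assumption here)."""
    return 20.0 * math.log10(max_signal / min_signal)

COUNTER_BITS = 12                       # pixel counter depth stated in the abstract
FULL_SCALE = 2 ** COUNTER_BITS - 1      # 4095 counts

# Single exposure, assuming (hypothetically) a floor of one count.
single_exposure_db = dynamic_range_db(FULL_SCALE, 1.0)

# Hypothetical ratio between the longest and shortest exposure windows.
# The thesis uses three exposure windows; this ratio is NOT taken from it.
EXPOSURE_RATIO = 64.0
multi_exposure_db = dynamic_range_db(FULL_SCALE * EXPOSURE_RATIO, 1.0)

print(f"single exposure: {single_exposure_db:.1f} dB")
print(f"multi exposure:  {multi_exposure_db:.1f} dB")
```

Under these assumptions a single 12-bit exposure covers roughly 72 dB, which suggests why multiple exposure windows are needed to exceed 100 dB.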
Energy-Efficient Wireless Circuits and Systems for Internet of Things
As demand for ultra-low power (ULP) systems for Internet of Things (IoT) applications has grown, considerable effort has gone into developing a new computing class. The evolution of this new computing class, however, has faced challenges due to hard constraints on RF systems. Significant effort has gone into reducing the power of power-hungry wireless radios. Most ULP radios, however, are not standard compliant, which poses a challenge to widespread adoption. Compliance with the WiFi network protocol can maximize a ULP radio’s potential for utilization; however, this standard demands excessive power consumption of over 10mW, which is hardly compatible with ULP systems even with heavy duty-cycling. Much effort has also gone into minimizing off-chip components in ULP IoT devices; however, this is still not enough for practical usage without a clean external reference, which limits the scaling of cost and form factor for the new computing class of IoT applications.
This research is motivated by those challenges in RF systems, and each work focuses on a different aspect of radio design for IoT applications. First, the research covers several efforts to relieve the energy constraints on RF systems by utilizing existing network protocols, ultimately achieving both low active power and widespread adoption. This includes novel approaches to 802.11 communication, with successive iterations on low-power RF systems. The research presents three prototypes of power-efficient WiFi wake-up receivers, which bridge the gap between industry-standard radios and ULP IoT radios. The proposed WiFi wake-up receivers operate with low power consumption and remain compatible with the WiFi protocol by using back-channel communication. Back-channel communication embeds a signal into a WiFi-compliant transmission by changing only the firmware in the access point or, more specifically, just the data in the payload of the WiFi packet. With a specific sequence of data in the packet, the transmitter can output a signal that mimics a modulation that is more conducive to ULP receivers, such as OOK or FSK. In this work, low-power mixer-first receivers and the first fully integrated ultra-low voltage receiver are presented, all compatible with WiFi through back-channel communication. Another main contribution of this work is relieving the integration challenge of IoT devices by removing the need for external, off-chip crystals and antennas. This enables a small form factor on the order of mm³-scale, useful for medical research and ubiquitous sensing applications. A crystal-less, small form factor, fully integrated 60GHz transceiver with an on-chip 12-channel frequency reference and a dual-mode on-chip antenna with good peak gain is presented.
PhD, Electrical and Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/162975/1/jaeim_1.pd
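To make the idea of an OOK-like back-channel more concrete, the following minimal sketch models only the receiver side: an energy detector that recovers bits from a stream whose amplitude alternates between high and low levels. The function names, amplitude levels and threshold rule are all hypothetical; the sketch does not model how a real 802.11 payload is crafted, nor the receiver circuits described in the thesis.

```python
from typing import List

def payload_to_envelope(bits: List[int], samples_per_bit: int,
                        high: float = 1.0, low: float = 0.1) -> List[float]:
    """Toy transmitter model: each bit becomes a run of high- or low-amplitude samples.

    Hypothetical stand-in for a packet payload chosen so that the transmitted
    signal looks like on-off keying; real payload crafting is not modeled here.
    """
    envelope: List[float] = []
    for bit in bits:
        envelope.extend([high if bit else low] * samples_per_bit)
    return envelope

def ook_energy_decode(samples: List[float], samples_per_bit: int) -> List[int]:
    """Decode OOK by comparing the mean energy of each bit window to a threshold.

    The threshold is midway between the largest and smallest window energy,
    so the input must contain at least one 0 and one 1.
    """
    energies = []
    for start in range(0, len(samples) - samples_per_bit + 1, samples_per_bit):
        window = samples[start:start + samples_per_bit]
        energies.append(sum(s * s for s in window) / samples_per_bit)
    threshold = (max(energies) + min(energies)) / 2.0
    return [1 if energy > threshold else 0 for energy in energies]

wake_up_pattern = [1, 0, 1, 1, 0]
received = payload_to_envelope(wake_up_pattern, samples_per_bit=8)
decoded = ook_energy_decode(received, samples_per_bit=8)
print(decoded)
```

An energy detector of this kind needs no carrier recovery, which is one reason OOK-style signaling is attractive for very low power receivers.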
Embedded electronic systems driven by run-time reconfigurable hardware
Abstract
This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology –available through SRAM-based FPGA/SoC devices– aimed at improving people’s quality of life. The work investigates the system architecture and the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration in order to synthesize, by means of hardware/software co-design, a given application partitioned into processing tasks that are multiplexed in time and space, thus optimizing its physical implementation –silicon area, processing time, complexity, flexibility, functional density, cost and power consumption– in comparison with other alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated through the prototyping of several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a level of maturity high enough for its exploitation in industry.
Resumen
This doctoral thesis covers the design of embedded electronic systems based on dynamically reconfigurable hardware technology –available through SRAM FPGA/SoC programmable logic devices– that contribute to improving society’s quality of life. It investigates the architecture of the system and of the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration of its programmable resources, with the aim of synthesizing, through hardware/software co-design, a given application partitioned into tasks multiplexed in time and space, thus optimizing its physical implementation –silicon area, processing time, complexity, flexibility, functional density, cost and dissipated power– compared with other alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated through the prototyping of several engineering applications (control systems, arithmetic coprocessors, image processors, etc.), demonstrating a level of maturity already viable for its exploitation in industry.
Resum
This doctoral thesis is oriented towards the design of embedded electronic systems based on dynamically reconfigurable hardware technology –available through SRAM FPGA/SoC programmable logic devices– that contribute to improving society’s quality of life. It investigates the architecture of the system and of the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration of its programmable resources, with the objective of synthesizing, through hardware/software co-design, a given application partitioned into tasks multiplexed in time and space, thus optimizing its physical implementation –silicon area, processing time, complexity, flexibility, functional density, cost and dissipated power– compared with other alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated through the prototyping of several engineering applications (control systems, arithmetic coprocessors, image processors, etc.), demonstrating a level of maturity already viable for its exploitation in industry.
Innovative Techniques for Testing and Diagnosing SoCs
We rely on the continued functioning of many electronic devices for our everyday welfare.
These devices usually embed integrated circuits that are becoming ever cheaper and smaller
with improved features. Nowadays, microelectronics can integrate a working computer
with CPU, memories, and even GPUs on a single die, namely a System-on-Chip (SoC).
SoCs are also employed in automotive safety-critical applications, but need to be tested
thoroughly to comply with reliability standards, in particular the ISO 26262 functional
safety standard for road vehicles.
The goal of this PhD thesis is to improve SoC reliability by proposing innovative
techniques for testing and diagnosing its internal modules: CPUs, memories, peripherals,
and GPUs. The proposed approaches are described below in the order in which they appear
in the thesis:
1. Embedded Memory Diagnosis: Memories are dense and complex circuits which
are susceptible to design and manufacturing errors. Hence, it is important to understand
how faults occur in the memory array. In practice, the logical and physical
array representations differ because of design optimizations that add enhancements to
the device, namely scrambling. This part proposes an accurate memory diagnosis
by presenting a software tool able to analyze test results, unscramble
the memory array, map failing syndromes to cell locations, perform cumulative
analysis, and formulate a final fault model hypothesis. Several SRAM memory failing
syndromes, gathered on an industrial automotive 32-bit SoC developed by
STMicroelectronics, were analyzed as case studies. The tool displayed defects virtually,
and results were confirmed by photographs taken with a microscope.
2. Functional Test Pattern Generation: The key to a successful test is the pattern applied
to the device. Patterns can be structural or functional; the former usually benefit
from embedded test modules targeting manufacturing errors and are only effective
before shipping the component to the client. The latter, on the other hand, can be
applied during mission with minimal impact on performance, but are penalized by
high generation time. However, functional test patterns may benefit from having
different goals in functional mission mode. Part III of this PhD thesis proposes
three different functional test pattern generation methods for CPU cores embedded
in SoCs, targeting different test purposes, described as follows:
a. Functional Stress Patterns: suitable for optimizing functional stress during
Operational-life Tests and Burn-in Screening for an optimal device reliability
characterization.
b. Functional Power Hungry Patterns: suitable for determining functional
peak power in order to strictly limit the power of structural patterns during manufacturing
tests, thus reducing premature device over-kill while delivering high test
coverage.
c. Software-Based Self-Test Patterns: combine the potential of structural patterns
with that of functional ones, allowing periodic execution during mission.
In addition, an external hardware module communicating with a devised SBST was proposed.
It helps increase fault coverage by 3% by testing critical Hardly
Functionally Testable Faults not covered by conventional SBST patterns.
Automatic functional test pattern generation, exploiting an evolutionary algorithm
that maximizes metrics related to stress, power, and fault coverage, was employed
in the above-mentioned approaches to quickly generate the desired patterns. The
approaches were evaluated on two industrial cases developed by STMicroelectronics:
an 8051-based SoC and a 32-bit Power Architecture SoC. Results show that generation
time was reduced by up to 75% in comparison to older methodologies while
significantly increasing the desired metrics.
3. Fault Injection in GPGPU: Fault injection mechanisms in semiconductor devices
are suitable for generating structural patterns, testing and activating mitigation techniques,
and validating robust hardware and software applications. GPGPUs are
known for fast parallel computation and are used in high performance computing and advanced
driver assistance, where reliability is the key point. Moreover, GPGPU manufacturers
do not provide design description code due to content secrecy. Therefore,
using commercial fault injectors that rely on the GPGPU model is unfeasible, leaving radiation
tests as the only resource available, and these are costly. In the last part of this thesis, we
propose a software-implemented fault injector able to inject bit-flips in memory elements
of a real GPGPU. It exploits a software debugger tool and combines it with the
C-CUDA grammar to wisely determine fault spots and apply bit-flip operations to
program variables. The goal is to validate robust parallel algorithms by studying
fault propagation or activating redundancy mechanisms they possibly embed. The
effectiveness of the tool was evaluated on two robust applications: redundant parallel
matrix multiplication and floating point Fast Fourier Transform.
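The bit-flip injection described in item 3 can be illustrated on the host side with a small sketch. It flips one bit of a 32-bit floating point value and shows how a duplicated matrix multiplication exposes the corruption. This is an illustration only: the helper names are hypothetical, the code runs on the CPU in plain Python, and it does not reproduce the debugger-based GPGPU tool from the thesis.

```python
import struct
from typing import List, Optional, Tuple

Matrix = List[List[float]]

def flip_bit_float32(value: float, bit: int) -> float:
    """Return `value` (as IEEE 754 single precision) with one bit inverted."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range 0..31")
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    raw ^= 1 << bit
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw))
    return flipped

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain triple-loop matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def redundant_matmul(a: Matrix, b: Matrix,
                     inject: Optional[Tuple[int, int, int]] = None):
    """Compute the product twice and report positions where the copies differ.

    `inject` = (row, col, bit) corrupts one element of the first copy,
    emulating a single bit-flip in a program variable.
    """
    first, second = matmul(a, b), matmul(a, b)
    if inject is not None:
        row, col, bit = inject
        first[row][col] = flip_bit_float32(first[row][col], bit)
    mismatches = [(i, j) for i in range(len(first))
                  for j in range(len(first[0])) if first[i][j] != second[i][j]]
    return first, mismatches

a = [[1.0, 2.0], [3.0, 4.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
_, clean = redundant_matmul(a, identity)
_, faulty = redundant_matmul(a, identity, inject=(0, 1, 31))  # flip the sign bit
print(clean, faulty)
```

Comparing two copies only detects the fault; correcting it would need a third copy or a recomputation, which is a design choice outside this sketch.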
An Input Power-Aware Maximum Efficiency Tracking Technique for Energy Harvesting in IoT Applications
The Internet of Things (IoT) enables intelligent monitoring and management in many applications such as industrial and biomedical systems as well as environmental and infrastructure monitoring. As a result, IoT requires billions of wireless sensor network (WSN) nodes equipped with a microcontroller and transceiver. As many of these WSN nodes are off-grid and small-sized, their limited-capacity batteries need periodic replacement. To mitigate the high costs and challenges of these battery replacements, energy harvesting from ambient sources is vital to achieve energy-autonomous operation. Energy harvesting for WSNs is challenging because the available energy varies significantly with ambient conditions and in many applications, energy must be harvested from ultra-low power levels.
To tackle these stringent power constraints, this dissertation proposes a discontinuous charging technique for switched-capacitor converters that improves the power conversion efficiency (PCE) at low input power levels and extends the input power harvesting range at which high PCE is achievable. Discontinuous charging delivers current to energy storage only during clock non-overlap time. This enables tuning of the output current to minimize converter losses based on the available input power. Based on this fundamental result, an input power-aware, two-dimensional efficiency tracking technique for WSNs is presented. In addition to conventional switching frequency control, clock non-overlap time control is introduced to adaptively optimize the power conversion efficiency according to the sensed ambient power levels.
The proposed technique is designed and simulated in 90nm CMOS with post-layout extraction. Under the same input and output conditions, the proposed system maintains at least 45% PCE at 4μW input power, as opposed to a conventional continuous system which requires at least 18.7μW to maintain the same PCE. In this technique, the input power harvesting range is extended by 1.5x.
The technique is applied to a WSN implementation utilizing the IEEE 802.15.4-compatible GreenNet communications protocol for industrial and wearable applications. This allows the node to meet specifications and achieve energy autonomy when deployed in harsher environments where the input power is 49% lower than what is required for conventional operation.
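The two-dimensional tracking idea, adjusting both switching frequency and clock non-overlap time, can be sketched as a simple hill climb over a grid of settings. Everything specific here is an assumption: the `toy_pce` function is a made-up smooth surface with a single peak rather than a converter model, and the candidate frequencies and non-overlap values are arbitrary. The dissertation's actual control loop is not described in the abstract.

```python
from typing import Callable, Sequence, Tuple

def track_max_efficiency(measure: Callable[[float, float], float],
                         freqs: Sequence[float],
                         non_overlaps: Sequence[float],
                         start: Tuple[int, int] = (0, 0)) -> Tuple[float, float, float]:
    """Greedy hill climb over (switching frequency, non-overlap time) settings.

    `measure` returns the efficiency observed at one setting. The search moves
    to a neighbouring grid point whenever that point measures higher, and stops
    when no neighbour improves on the current setting.
    """
    i, j = start
    best = measure(freqs[i], non_overlaps[j])
    improved = True
    while improved:
        improved = False
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < len(freqs) and 0 <= nj < len(non_overlaps):
                candidate = measure(freqs[ni], non_overlaps[nj])
                if candidate > best:
                    i, j, best = ni, nj, candidate
                    improved = True
    return freqs[i], non_overlaps[j], best

def toy_pce(freq_khz: float, non_overlap: float) -> float:
    """Hypothetical efficiency surface peaking at 40 kHz and 0.3 (NOT a converter model)."""
    return 0.6 - ((freq_khz - 40.0) / 100.0) ** 2 - (non_overlap - 0.3) ** 2

best_freq, best_non_overlap, best_pce = track_max_efficiency(
    toy_pce, freqs=[10.0, 20.0, 40.0, 80.0, 160.0],
    non_overlaps=[0.1, 0.2, 0.3, 0.4, 0.5])
print(best_freq, best_non_overlap, best_pce)
```

A greedy search like this only finds the global optimum when the efficiency surface has a single peak; a surface with several local maxima would need a different strategy.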
Soft-Error Resilience Framework For Reliable and Energy-Efficient CMOS Logic and Spintronic Memory Architectures
The revolution in chip manufacturing processes spanning five decades has proliferated high performance and energy-efficient nano-electronic devices across all aspects of daily life. In recent years, CMOS technology scaling has realized billions of transistors within large-scale VLSI chips to elevate performance. However, these advancements have also continually augmented the impact of Single-Event Transient (SET) and Single-Event Upset (SEU) occurrences, which precipitate a range of Soft-Error (SE) dependability issues. Consequently, soft-error mitigation techniques have become essential to improve systems' reliability. Herein, we first propose optimized soft-error resilience designs to improve the robustness of sub-micron computing systems. The proposed approaches were developed to deliver energy efficiency and tolerate double/multiple errors simultaneously while incurring acceptable speed degradation compared to prior work. Secondly, the impact of Process Variation (PV) in the Near-Threshold Voltage (NTV) region on redundancy-based SE-mitigation approaches for High-Performance Computing (HPC) systems was investigated to highlight the approach that can realize favorable attributes, such as reduced critical datapath delay variation and low speed degradation. Finally, spin-based devices have recently been widely used to design Non-Volatile (NV) elements such as NV latches and flip-flops, which can be leveraged in normally-off computing architectures for Internet-of-Things (IoT) and energy-harvesting-powered applications. Thus, in the last portion of this dissertation, we design and evaluate soft-error-resilient NV latching circuits that can achieve intriguing features, such as low energy consumption, high computing performance, and superior soft-error tolerance, i.e., the ability to concurrently tolerate Multiple Node Upsets (MNU), to potentially become a mainstream solution for aerospace and avionic nanoelectronics.
Together, these objectives cooperate to increase the energy efficiency and soft-error mitigation resilience of larger-scale emerging NV latching circuits within iso-energy constraints. In summary, addressing these reliability concerns is paramount to the successful deployment of future reliable and energy-efficient CMOS logic and spintronic memory architectures with deeply-scaled devices operating at low voltages.
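As a generic illustration of redundancy-based soft-error mitigation, the sketch below shows a bitwise majority voter over three copies of a stored word (triple modular redundancy). It is a textbook technique used here as an assumed example; the abstract does not specify which redundancy schemes the dissertation evaluates, and the word values are arbitrary.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)

stored = 0b1011_0010                    # arbitrary example word

# A single upset in one copy is outvoted by the other two copies.
single_upset = majority_vote(stored ^ 0b0000_1000, stored, stored)

# The same bit upset in two copies defeats simple majority voting,
# which is why multiple-node upsets need stronger schemes.
double_upset = majority_vote(stored ^ 0b0000_1000, stored ^ 0b0000_1000, stored)

print(bin(single_upset), bin(double_upset))
```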
Energy autonomous systems : future trends in devices, technology, and systems
The rapid evolution of electronic devices since the beginning of the nanoelectronics era has brought about exceptional computational power in an ever shrinking system footprint. This has enabled, among other things, the wealth of nomadic battery-powered wireless systems (smart phones, mp3 players, GPS, …) that society currently enjoys. Emerging integration technologies enabling even smaller volumes and the associated increased functional density may bring about a new revolution in systems targeting wearable healthcare, wellness, lifestyle and industrial monitoring applications.
The impact of soft errors in logic and its commercialisation in ARM IP
The significance of soft errors in logic has grown because of reduced memory
vulnerability and the shrinking dimensions of semiconductor technology coupled
with the increasing amount of logic integrated into a chip. Consequently, some
of ARM’s customers are concerned about how soft errors on the bus interconnect
will affect the dependability of their systems, since the interconnect is a critical
hub of communication in a SoC and represents a substantial and growing amount
of logic. With the rising complexity of their systems, the interconnect will
become larger and more complex in the future, adding to their concern. In this
work the impact of soft errors on the bus interconnect logic was investigated
and a product was developed to ameliorate the effects of such errors on ARM’s
customers’ products.
Methods to measure the SER of ARM IP were investigated by focusing on
logical masking, which is a component in the calculation of the SER. The effect
that the topology of a combinatorial logic circuit has on its logical masking rate
was considered by performing gate-level statistical fault injection on different
implementations of adder circuits. Significant variation in logical masking was
found ranging from a factor of 3.1 at a synthesis frequency of 100 MHz to a factor
of 2.1 at 900 MHz. This difference is explained in an original way by correlating
logical masking with the circuit’s path length and fan-out. These properties
could be used to create a static method of measuring the logical masking rather
than the current time-consuming method of dynamic simulation. Additionally,
nearly 30% of faults injected cause more than one error, which means that the
combinational SER will be underestimated if research does not take gate fan-out
into consideration. Using this methodology a circuit designer can now base their
choice or development of a circuit on its reliability as well as its performance,
power, and area. Studying the variation in the factors that affect the SER is
important to ensure accuracy in addressing customer requirements.
Although it is important to consider the rate of soft error occurrence, in this
work the impact of errors is demonstrated to be critical. Using protocol-level
fault injection it is shown that faults on the ARM AXI bus interconnect can have
a serious effect on the reliability of the entire SoC such as deadlock, memory
corruption, or undefined behaviour. Using a fault-path traversal algorithm,
it is demonstrated that traditional error detection codes are not sufficient to
prevent these failures when faults occur on certain AXI bus signals. This led
to the development of novel fault tolerant methods that provide protection for
these identified signals. Based on these developments, an add-on product for
the AXI bus interconnect was proposed that can detect, correct, and report logic
soft errors without changing the AMBA standard or the customer’s connecting
IP.
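The notion of logical masking measured by fault injection can be illustrated on a very small circuit. The sketch below inverts one internal node of a gate-level full adder at a time, for every input vector, and counts how often the fault is masked and how often it corrupts more than one output. It is a toy under stated assumptions: the circuit and node names are hypothetical, and the injection is exhaustive rather than the statistical gate-level injection on synthesized adders described in the thesis.

```python
from itertools import product
from typing import Optional, Tuple

def full_adder(a: int, b: int, cin: int, flip: Optional[str] = None) -> Tuple[int, int]:
    """Gate-level full adder; `flip` names one internal node whose value is inverted."""
    def node(name: str, value: int) -> int:
        return value ^ 1 if name == flip else value

    x1 = node("x1", a ^ b)       # fans out to both the sum and the carry logic
    a1 = node("a1", a & b)
    a2 = node("a2", x1 & cin)
    return x1 ^ cin, a1 | a2     # (sum, carry out)

def masking_stats(nodes=("x1", "a1", "a2")) -> Tuple[int, int, int]:
    """Return (injections, masked faults, faults corrupting more than one output)."""
    total = masked = multi = 0
    for name in nodes:
        for a, b, cin in product((0, 1), repeat=3):
            golden = full_adder(a, b, cin)
            faulty = full_adder(a, b, cin, flip=name)
            wrong = sum(g != f for g, f in zip(golden, faulty))
            total += 1
            masked += wrong == 0
            multi += wrong > 1
    return total, masked, multi

total, masked, multi = masking_stats()
print(f"masking rate: {masked / total:.3f}, multi-output errors: {multi / total:.3f}")
```

In this toy circuit only the node with fan-out to both outputs produces multi-output errors, which echoes the point in the abstract that ignoring fan-out leads to an underestimated error rate.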