63 research outputs found
New Design Techniques for Dynamic Reconfigurable Architectures
The abstract is in the attachment.
Embedded electronic systems driven by run-time reconfigurable hardware
Abstract
This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology (available through SRAM-based FPGA/SoC devices), aimed at enhancing human quality of life. This work investigates the conception of the system architecture and the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration, in order to synthesize, by means of hardware/software co-design, a given application partitioned into processing tasks that are multiplexed in time and space, thus optimizing its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost, and power consumption) in comparison with alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated through the prototyping of several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a level of maturity high enough for industrial exploitation.
MANAGING AND LEVERAGING VARIATIONS AND NOISE IN NANOMETER CMOS
Advanced CMOS technologies have enabled high-density designs at the cost of a complex fabrication process. Variation in oxide thickness and Random Dopant Fluctuation (RDF) lead to variation in the transistor threshold voltage Vth. The photolithography processes currently used for printing ever-smaller critical dimensions result in variation in transistor channel length and width. A related challenge in nanometer CMOS is on-chip random noise: with decreasing threshold and operating voltages and increasing operating temperature, CMOS devices in advanced technologies are more sensitive to random on-chip noise.
In this thesis, we explore novel circuit techniques to manage the impact of process variation in nanometer CMOS technologies. We also analyze the impact of on-chip noise on CMOS circuits and propose techniques to leverage or manage that noise depending on the application. The True Random Number Generator (TRNG) is an interesting cryptographic primitive that leverages on-chip noise to generate random bits; however, it is highly sensitive to process variation. We explore novel metastability circuits that alleviate the impact of variations while leveraging on-chip noise sources such as random thermal noise and Random Telegraph Noise (RTN) to generate high-quality random bits. We develop stochastic models for metastability-based TRNG circuits to analyze the impact of variation and noise. These stochastic models are used to analyze and compare low-power, energy-efficient, lightweight post-processing techniques targeted at low-power applications such as System on Chip (SoC) and RFID designs. We also propose variation-aware circuit calibration techniques to increase reliability, and extend this technique to the more generic application of designing Post-Si Tunable (PST) clock buffers to increase parametric yield in the presence of process variation.

Apart from one-time variation due to the fabrication process, transistors undergo constant change in threshold voltage due to aging/wear-out effects and RTN. Process variation affects conventional sensors and introduces inaccuracies during measurement. We present a lightweight wear-out sensor that is tolerant to process variation and provides fine-grained wear-out sensing; a similar circuit is designed to sense fluctuation in transistor threshold voltage due to RTN. Although thermal noise and RTN are leveraged in applications like TRNGs, they affect the stability of sensitive circuits such as Static Random Access Memory (SRAM). We analyze the impact of on-chip noise on the Bit Error Rate (BER) and post-Si test coverage of SRAM cells.
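As an illustration of the kind of lightweight post-processing such a TRNG might employ, the sketch below shows a classic Von Neumann corrector. This particular scheme is an assumption for illustration, not necessarily the technique developed in the thesis:

```python
def von_neumann_debias(raw_bits):
    """Von Neumann corrector: consume raw bits in non-overlapping pairs.

    A biased-but-independent bit stream is mapped to an unbiased one:
    the pair (0, 1) emits 0, the pair (1, 0) emits 1, and equal pairs
    (0, 0) or (1, 1) are discarded. Throughput drops, but any constant
    bias introduced by process variation is removed.
    """
    out = []
    for i in range(0, len(raw_bits) - 1, 2):
        a, b = raw_bits[i], raw_bits[i + 1]
        if a != b:
            out.append(a)  # emit the first bit of an unequal pair
    return out
```

For example, the stream [1, 0, 1, 1, 0, 1] yields [1, 0]: the pair (1, 0) emits 1, (1, 1) is discarded, and (0, 1) emits 0.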
Design of Variation-Tolerant Circuits for Nanometer CMOS Technology: Circuits and Architecture Co-Design
Aggressive scaling of CMOS technology in sub-90nm nodes has created huge challenges. Variations due to fundamental physical limits, such as random dopant fluctuation (RDF) and line edge roughness (LER), are increasing significantly with technology scaling. In addition, manufacturing tolerances in process technology are not scaling at the same pace as the transistor channel length due to process control limitations (e.g., sub-wavelength lithography). Therefore, within-die process variations worsen with successive technology generations. These variations have a strong impact on the maximum clock frequency and leakage power of any digital circuit, and can also cause functional yield losses in variation-sensitive digital circuits (such as SRAM). Moreover, in nanometer technologies, digital circuits show an increased sensitivity to process variations due to low-voltage operation requirements, which are aggravated by the strong demand for lower power consumption and cost alongside higher performance and density. It is therefore not surprising that the International Technology Roadmap for Semiconductors (ITRS) lists variability as one of the most challenging obstacles for IC design in the nanometer regime.
To facilitate variation-tolerant design, we study the impact of random variations on the delay variability of a logic gate and derive simple, scalable statistical models to evaluate delay variations in the presence of within-die variations. This work provides new design insight and highlights the importance of accounting for the effect of input slew on delay variations, especially at lower supply voltages. The derived models are simple, scalable, and bias-dependent, and only require knowledge of easily measurable parameters. This makes them useful in early design exploration, circuit/architecture optimization, and technology prediction (especially for low-power and low-voltage operation). The derived models are verified by Monte Carlo SPICE simulations of an industrial 90nm technology.
Random variations in nanometer technologies are considered one of the most important design challenges. This is especially true for SRAM, due to the large variations in bitcell characteristics. SRAM bitcells typically have the smallest device sizes on a chip and therefore show the largest sensitivity to the different sources of variation. With the drastic increase in memory densities, lower supply voltages, and higher variations, statistical simulation methodologies become imperative to estimate memory yield and to optimize performance and power. In this research, we present a methodology for the statistical simulation of SRAM read-access yield, which is tightly related to SRAM performance and power consumption. The proposed flow accounts for the impact of bitcell read-current variation, sense-amplifier offset distribution, timing-window variation, and leakage variation on functional yield. The methodology overcomes the pessimism of the conventional worst-case design techniques used in SRAM design. The proposed statistical yield estimation methodology allows early yield prediction in the design cycle, which can be used to trade off performance and power requirements for SRAM. The methodology is verified using measured silicon yield data from a 1Mb memory fabricated in an industrial 45nm technology.
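A statistical read-yield flow of this kind can be caricatured in a few lines. The sketch below is a hypothetical Monte Carlo model (the distributions, parameter values, and pass criterion are illustrative assumptions, not the thesis's actual flow): a read passes when the bitline differential developed by the cell read current within the timing window exceeds the sense-amplifier offset.

```python
import random

def mc_read_yield(n_trials=100_000, seed=0,
                  i_read_mu=20e-6, i_read_sigma=4e-6,   # bitcell read current (A)
                  sa_offset_sigma=10e-3,                # sense-amp input offset (V)
                  t_window=1e-9, t_sigma=0.1e-9,        # sensing window (s)
                  c_bitline=50e-15):                    # bitline capacitance (F)
    """Toy Monte Carlo estimate of SRAM read-access yield.

    Each trial draws a cell read current, a sense-amp offset, and a
    timing window; the read passes if the bitline differential
    I * t / C developed within the window exceeds the offset magnitude.
    """
    rng = random.Random(seed)
    passes = 0
    for _ in range(n_trials):
        i_read = rng.gauss(i_read_mu, i_read_sigma)
        offset = rng.gauss(0.0, sa_offset_sigma)
        t = rng.gauss(t_window, t_sigma)
        v_diff = max(i_read, 0.0) * max(t, 0.0) / c_bitline
        if v_diff > abs(offset):
            passes += 1
    return passes / n_trials
```

With the nominal numbers above, the differential is about 400 mV against a 10 mV-sigma offset, so the estimated yield is close to 1; inflating `sa_offset_sigma` or shrinking `t_window` exposes the yield cliff that worst-case design pessimistically guards against.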
Embedded SRAM dominates modern SoCs, and there is a strong demand for SRAM with lower power consumption that still achieves high performance and high density. However, in the presence of large process variations, SRAMs are expected to consume more power to ensure correct read operation and meet yield targets. We propose a new architecture that significantly reduces array switching power for SRAM. The proposed architecture combines built-in self-test (BIST) and digitally controlled delay elements to reduce the wordline pulse width for memories while ensuring correct read operation, hence reducing switching power. A new statistical simulation flow was developed to evaluate the power savings of the proposed architecture. Monte Carlo simulations using a 1Mb SRAM macro from an industrial 45nm technology were used to examine the power reduction achieved by the system. The proposed architecture can reduce the array switching power significantly and shows large power savings, especially as the chip-level memory density increases. For a 48Mb memory density, a 27% reduction in array switching power can be achieved for a read-access yield target of 95%. In addition, the proposed system provides larger power savings as process variations increase, which makes it a very attractive solution for 45nm and below technologies.
In addition to its impact on bitcell read current, the increase of local variations in nanometer technologies strongly affects SRAM cell stability. In this research, we propose a novel single-supply-voltage read-assist technique to improve the SRAM static noise margin (SNM). The proposed technique precharges different parts of the bitlines to VDD and GND and uses charge sharing to precisely control the bitline voltage, which improves bitcell stability. In addition to improving SNM, the technique also reduces memory access time. Moreover, it requires only one supply voltage, hence eliminating the need for large-area voltage shifters. The proposed technique has been implemented in the design of a 512kb memory fabricated in a 45nm technology. Results show improvements in SNM and in the read operation window, which confirms the effectiveness and robustness of this technique.
Design of Multi-Gigabit Network Interconnect Elements and Protocols for a Data Acquisition System in Radiation Environments
Modern High Energy Physics experiments (HEP) explore the fundamental nature
of matter in more depth than ever before and thereby benefit greatly from the
advances in the field of communication technology. The huge data volumes
generated by the increasingly precise detector setups pose severe problems for
the Data Acquisition Systems (DAQ), which are used to process and store this
information. In addition, detector setups and their read-out electronics need
to be synchronized precisely to allow a later correlation of experiment events
accurately in time. Moreover, the substantial presence of charged particles from
accelerator-generated beams results in strong ionizing radiation levels, which have
a severe impact on the electronic systems.
This thesis recommends an architecture for unified network protocol IP cores
with custom-developed physical interfaces, for use in reliable data acquisition
systems in strong-radiation environments. Specially configured serial bidirectional
point-to-point interconnects are proposed to realize high-speed data transmission,
slow-control access, synchronization, and global clock distribution over unified
links, reducing costs and yielding compact and efficient read-out setups. Special
features are the radiation-hardened functional units developed to protect against
single and multiple bit upsets, and the common interface for statistical error and
diagnosis information, which integrates well into the protocol capabilities and
eases error handling in large experiment setups. Many innovative designs for
several custom FPGA and ASIC platforms have been implemented and are described
in detail. Special focus is placed on the physical layers and network interface
elements, from high-speed serial LVDS interconnects up to 20 Gb/s SSTL links in
state-of-the-art process technology.
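Protection against the bit upsets mentioned above is commonly built on error-correcting codes. The sketch below is an illustrative Hamming(7,4) encoder and corrector in software; the thesis's hardened units are hardware designs, so this code is only a conceptual assumption showing how single-bit-upset correction works:

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword.

    Codeword layout (1-indexed): p1 p2 d1 p3 d2 d3 d4, with parity bits
    chosen so that any single flipped bit can be located and corrected.
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4      # covers codeword positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # covers codeword positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4      # covers codeword positions 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(c):
    """Return (corrected data bits, error position or 0 if none)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # parity check over positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # parity check over positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]   # parity check over positions 4, 5, 6, 7
    pos = s1 + 2 * s2 + 4 * s3       # syndrome = 1-indexed error position
    if pos:
        c[pos - 1] ^= 1              # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], pos
```

Flipping any one of the seven bits of an encoded word and passing it through `hamming74_correct` recovers the original data; a radiation-tolerant link applies the same idea, typically with a wider code and in hardware, on every transferred word.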
The developed IP cores are fully tested, both in an adapted verification
environment for electronic design automation tools and in live application. They
are available in a global repository, allowing broad usage within further HEP
experiments.
Dependable Embedded Systems
This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. It introduces the most prominent reliability concerns from today's point of view and briefly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as the circuit level or the system level alone, this book deals with reliability challenges across different levels, from the physical level all the way up to the system level (cross-layer approaches). It aims to demonstrate how new hardware/software co-design solutions can be proposed to effectively mitigate reliability degradation such as transistor aging, process variation, temperature effects, and soft errors. Provides readers with the latest insights into novel cross-layer methods and models with respect to the dependability of embedded systems; describes cross-layer approaches that can improve reliability through techniques pro-actively designed with respect to techniques at other layers; explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many-core systems.
Production accompanying testing of the ATLAS Pixel module
The ATLAS Pixel detector, the innermost sub-detector of the ATLAS experiment at the LHC, CERN, can be tested in its entirety for the first time only after its installation in 2006. Because of the poor accessibility of the Pixel detector (probably once per year) and the tight schedule, the replacement of damaged modules after integration, as well as during operation, will become a highly delicate business. Therefore, and to ensure that no defective parts are used in subsequent production steps, each production step must be accompanied by testing of the components before assembly and verification of their operativeness afterwards. Roughly 300 of about 2000 semiconductor hybrid pixel detector modules in total will be built at the Universität Dortmund. Thus, a production test setup has been built and examined before the start of serial production. These tests comprise the characterization and inspection of the module components and of the module itself under different environmental conditions and diverse operating parameters. Once a module is assembled, its operativeness is tested with a radioactive source, and long-term stability is assured by a burn-in. A full electrical characterization is the basis for module selection and sorting for the ATLAS Pixel detector. Additionally, the charge collection behavior of irradiated and non-irradiated modules has been investigated in the H8 beamline with 180 GeV pions.
Physically-Adaptive Computing via Introspection and Self-Optimization in Reconfigurable Systems.
Digital electronic systems typically must compute precise and deterministic results, but in principle have flexibility in how they compute. Despite this potential flexibility, the overriding paradigm for more than 50 years has been based on fixed, non-adaptive integrated circuits. This one-size-fits-all approach is rapidly losing effectiveness now that technology is advancing into the nanoscale. Physical variation and uncertainty in component behavior are emerging as fundamental constraints, leading to increasingly sub-optimal fault rates, power consumption, chip costs, and lifetimes. This dissertation proposes methods of physically-adaptive computing (PAC), in which reconfigurable electronic systems sense and learn their own physical parameters and adapt with fine granularity in the field, leading to higher reliability and efficiency.
We formulate the PAC problem and provide a conceptual framework built around two major themes: introspection and self-optimization. We investigate how systems can efficiently acquire useful information about their physical state and related parameters, and how systems can feasibly re-implement their designs on-the-fly using the information learned. We study the role not only of self-adaptation—where the above two tasks are performed by an adaptive system itself—but also of assisted adaptation using a remote server or peer.
We introduce low-cost methods for sensing regional variations in a system, including a flexible, ultra-compact sensor that can be embedded in an application and implemented on field-programmable gate arrays (FPGAs). An array of such sensors, with only 1% total overhead, can be employed to gain useful information about circuit delays, voltage noise, and even leakage variations. We present complementary methods of regional self-optimization, such as finding the design alternative that best fits a given system region.
We propose a novel approach to characterizing local, uncorrelated variations. Through in-system emulation of noise, previously hidden variations in transient fault susceptibility are uncovered. Correspondingly, we demonstrate practical methods of self-optimization, such as local re-placement, informed by the introspection data.
Forms of physically-adaptive computing are strongly needed in areas such as communications infrastructure, data centers, and space systems. This dissertation contributes practical methods for improving PAC costs and benefits, and promotes a vision of resourceful, dependable digital systems at unimaginably fine physical scales.
Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/78922/1/kzick_1.pd
Towards the development of flexible, reliable, reconfigurable, and high-performance imaging systems
Current FPGAs can implement large systems because of the high density of reconfigurable logic resources in a single chip. FPGAs are comprehensive devices that combine flexibility and high performance in the same platform, compared to other platforms such as General-Purpose Processors (GPPs) and Application-Specific Integrated Circuits (ASICs). The flexibility of modern FPGAs is further enhanced by the Dynamic Partial Reconfiguration (DPR) feature, which allows the functionality of part of the system to be changed while other parts keep functioning.

Because of these features, FPGAs have become an important platform for digital image processing applications. They can fulfil the need for efficient and flexible platforms that execute imaging tasks both efficiently and reliably, with low power, high performance, and high flexibility. The use of FPGAs as accelerators for image processing outperforms most current solutions. Current FPGA solutions can load the parts of an imaging application that need high computational power onto dedicated reconfigurable hardware accelerators while other parts run on the conventional platform, increasing overall system performance. Moreover, the DPR feature enhances the flexibility of image processing further by swapping accelerators in and out at run-time. The use of fault-mitigation techniques enables FPGA-based imaging applications to operate in harsh environments, given that FPGAs are sensitive to radiation and extreme conditions.
The aim of this thesis is to present a platform for efficient implementations of imaging tasks. The research uses FPGAs as the key component of this platform and uses the concept of DPR to increase performance and flexibility, to reduce power dissipation, and to expand the range of possible imaging applications. In this context, it proposes the use of FPGAs to accelerate the Image Processing Pipeline (IPP) stages, the core part of most imaging devices. The thesis contains a number of novel concepts. The first is the use of the FPGA hardware environment and the DPR feature to increase parallelism and achieve high flexibility, while also increasing performance and reducing power consumption and area utilisation. Based on this concept, the following implementations are presented in this thesis: an implementation of the Adams-Hamilton demosaicing algorithm for camera colour interpolation, which exploits FPGA parallelism to outperform equivalent designs, and an implementation of Automatic White Balance (AWB), another IPP stage, which employs the DPR feature to demonstrate the novelty aspects mentioned above. Another novel concept, presented in chapter 6, uses the DPR feature to develop a flexible imaging system that requires less logic and can be implemented in small FPGAs. The system can be employed as a template for any imaging application without limitation. Moreover, this thesis discusses a novel reliable version of the imaging system that adopts techniques including scrubbing, Built-In Self-Test (BIST), and Triple Modular Redundancy (TMR) to detect and correct errors using the Internal Configuration Access Port (ICAP) primitive. These techniques exploit the datapath-based nature of the implemented imaging system to improve its overall reliability. The thesis presents a proposal for integrating the imaging system with the Robust Reliable Reconfigurable Real-Time Heterogeneous Operating System (R4THOS) to get the best out of the system. The proposal shows the suitability of the proposed DPR imaging system for use as part of the core system of autonomous cars because of its unbounded flexibility. These novel works are presented in a number of publications, as listed in section 1.3 of this thesis.
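The TMR technique mentioned in this abstract can be illustrated with a minimal software sketch. The thesis implements redundancy in FPGA logic; the Python below is only a conceptual assumption showing the majority-vote principle:

```python
def tmr_vote(a, b, c):
    """Bitwise 2-of-3 majority vote over three redundant copies.

    In a TMR datapath, three identical modules compute the same word;
    a single upset corrupts at most one copy, so the majority of the
    three bits in every position is still the correct value.
    """
    return (a & b) | (a & c) | (b & c)
```

For example, with two good copies 0b1010 and one upset copy 0b1110, `tmr_vote(0b1010, 0b1110, 0b1010)` returns 0b1010; scrubbing then repairs the corrupted copy so that a second upset cannot accumulate on top of the first.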