
    An Adaptive Modular Redundancy Technique to Self-regulate Availability, Area, and Energy Consumption in Mission-critical Applications

    As reconfigurable devices' capacities and the complexity of applications that use them increase, the need for self-reliance of deployed systems becomes increasingly prominent. A Sustainable Modular Adaptive Redundancy Technique (SMART) composed of a dual-layered organic system is proposed, analyzed, implemented, and experimentally evaluated. SMART relies upon a variety of self-regulating properties to control availability, energy consumption, and area used in dynamically-changing environments that require a high degree of adaptation. The hardware layer is implemented on a Xilinx Virtex-4 Field Programmable Gate Array (FPGA) to provide self-repair using a novel approach called a Reconfigurable Adaptive Redundancy System (RARS). The software layer supervises the organic activities within the FPGA and extends the self-healing capabilities through application-independent, intrinsic, evolutionary repair techniques to leverage the benefits of dynamic Partial Reconfiguration (PR). A SMART prototype is evaluated using a Sobel edge detection application. This prototype is shown to provide sustainability under stressful transient and permanent fault injection procedures while still reducing energy consumption and area requirements. An Organic Genetic Algorithm (OGA) technique is shown capable of consistently repairing hard faults while maintaining correct edge detector outputs, by exploiting spatial redundancy in the reconfigurable hardware. A Monte Carlo-driven Continuous-Time Markov Chain (CTMC) simulation is conducted to compare SMART's availability to industry-standard Triple Modular Redundancy (TMR) techniques. Based on nine use cases, parameterized with realistic fault and repair rates acquired from publicly available sources, the results indicate that availability is significantly enhanced by the adoption of fast repair techniques targeting aging-related hard faults. Under harsh environments, SMART is shown to improve system availability from 36.02% with lengthy repair techniques to 98.84% with fast ones. This value increases to five nines (99.9998%) under relatively more favorable conditions. Lastly, SMART is compared to twenty-eight standard TMR benchmarks generated by the widely-accepted BL-TMR tools. Results show that in seven out of nine use cases, SMART is the recommended technique, with power savings ranging from 22% to 29% and area savings ranging from 17% to 24%, while still maintaining the same level of availability.
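
    As a loose illustration of the availability analysis described above, the Python sketch below runs a Monte Carlo simulation of a two-state (up/down) continuous-time Markov chain with exponentially distributed fault arrivals and repairs. The rates and the two-state model are illustrative assumptions, not SMART's actual parameters; they only demonstrate the trend the abstract reports, namely that faster repair drives availability toward one.

```python
import random

def simulate_availability(fault_rate, repair_rate, horizon, runs=1_000):
    """Monte Carlo estimate of availability for a two-state (up/down)
    continuous-time Markov chain: faults arrive with rate `fault_rate`,
    repairs complete with rate `repair_rate` (both exponential).
    Returns the average fraction of time spent in the up state."""
    total_up = 0.0
    for _ in range(runs):
        t, up_time, state_up = 0.0, 0.0, True
        while t < horizon:
            rate = fault_rate if state_up else repair_rate
            dwell = min(random.expovariate(rate), horizon - t)
            if state_up:
                up_time += dwell
            t += dwell
            state_up = not state_up
        total_up += up_time / horizon
    return total_up / runs

# Illustrative only: the same fault rate under a slow vs. a fast repair
# technique. The analytical steady state is mu / (lambda + mu), so the
# slow case sits near 0.83 and the fast case near 0.998.
print(simulate_availability(fault_rate=0.01, repair_rate=0.05, horizon=10_000.0))
print(simulate_availability(fault_rate=0.01, repair_rate=5.0, horizon=10_000.0))
```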

    From MARTE to dynamically reconfigurable FPGAs: Introduction of a control extension in a model based design flow

    System-on-Chip (SoC) can be considered a particular case of embedded systems and has rapidly become a de facto solution for implementing these complex systems. However, due to the continuous exponential rise in SoC design complexity, there is a critical need to find new seamless methodologies and tools to handle the SoC co-design aspects. This paper addresses this issue and proposes a novel SoC co-design methodology based on Model Driven Engineering (MDE) and the MARTE (Modeling and Analysis of Real-Time and Embedded Systems) standard proposed by the OMG (Object Management Group), in order to raise the design abstraction levels. Extensions of this standard have enabled us to move from high-level specifications to execution platforms such as reconfigurable FPGAs, and allow us to implement the notion of Partial Dynamic Reconfiguration supported by current FPGAs. The overall objective is to carry out system modeling at a high abstraction level expressed in UML (Unified Modeling Language), and afterwards to transform these high-level models into detailed, enriched lower-level models in order to automatically generate the necessary code for final FPGA synthesis.
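
    The end point of such a flow is automatic code generation from lowered models. The sketch below is a minimal, hypothetical model-to-text transformation in that spirit, not the MARTE toolchain itself: a simple component description, standing in for a lowered model, is rendered as a VHDL entity skeleton. The model schema and all names are assumptions made for illustration.

```python
# Hypothetical model-to-text step in the spirit of an MDE flow: a
# high-level component description is lowered to an HDL skeleton.
# The schema and names below are illustrative assumptions, not part
# of the MARTE standard or its actual transformation chain.
component = {
    "name": "sobel_filter",
    "ports": [("clk", "in", "std_logic"),
              ("pixel_in", "in", "std_logic_vector(7 downto 0)"),
              ("pixel_out", "out", "std_logic_vector(7 downto 0)")],
}

def to_vhdl_entity(model):
    """Render a component model as a VHDL entity declaration."""
    ports = ";\n        ".join(
        f"{name} : {direction} {vhdl_type}"
        for name, direction, vhdl_type in model["ports"])
    return (f"entity {model['name']} is\n"
            f"    port (\n        {ports}\n    );\n"
            f"end entity {model['name']};")

print(to_vhdl_entity(component))
```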

    A Model based design flow for Dynamic Reconfigurable FPGAs

    As System-on-Chip (SoC) based embedded systems have become a de facto industry standard, their overall design complexity has increased exponentially in recent years, necessitating the introduction of new seamless methodologies and tools to handle the SoC co-design aspects. This paper presents a novel SoC co-design methodology based on Model Driven Engineering and the MARTE (Modeling and Analysis of Real-Time and Embedded Systems) standard, permitting us to raise the abstraction levels and to model fine-grained reconfigurable architectures such as FPGAs. Extensions of this methodology have enabled us to integrate new features such as Partial Dynamic Reconfiguration supported by modern FPGAs. The overall objective is to carry out system modeling at a high abstraction level expressed in a graphical language like UML (Unified Modeling Language), and afterwards to transform these models in order to automatically generate the necessary code for FPGA synthesis.

    Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010, Karlsruhe, Germany. (KIT Scientific Reports; 7551)

    ReCoSoC is intended to be an annual meeting to expose and discuss gathered expertise as well as state-of-the-art research around SoC-related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion-transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability.

    Ant colony optimization on runtime reconfigurable architectures


    The Impact of Single Event Effect Reliability of Convolution Neural Network Architectures and Hardening Approaches Implemented on SRAM FPGA

    Convolutional neural networks (CNNs) have powerful data processing and learning capabilities, which have been widely applied to image processing applications, especially in autonomous driving, medical image classification, space exploration, and military applications. Due to the low power consumption, high flexibility, and parallel characteristics of modern field-programmable gate arrays (FPGAs), they are frequently used as hardware acceleration platforms for CNN implementation. Two architectures are mainly used to implement CNNs on FPGAs: the streaming architecture and the single computation engines (SCEs) architecture. In the streaming architecture of a CNN, each layer is implemented with one distinct hardware block and each block can be optimized separately. On the other hand, the single computation engine architecture uses a systolic array of processing elements or a matrix multiplication unit as a computation engine to execute the CNN layers sequentially. The control of the hardware and the scheduling of operations are performed by a control unit and associated software. The advantage of this design paradigm is that it consists of a fixed architectural template that can be scaled based on the input CNN and the available FPGA resources. Therefore, it is suitable for implementing modern complex CNNs that may not fit into the streaming architecture. SRAM-based FPGAs are sensitive to radiation effects, which can generate single event effects (SEEs) in the system. Designs are required to reduce the radiation effects in FPGA-based CNNs for many applications. Previous radiation effects studies mainly focused on the streaming architecture and explored triple-modular redundancy (TMR) or selective hardening techniques. As far as the authors know, there are very few radiation effects studies on CNNs implemented with the SCEs architecture on FPGAs, and no radiation effects evaluation between the two architectures with proton irradiation. In this thesis, we implement a Modified National Institute of Standards and Technology (MNIST) CNN with both mainstream architectures, the streaming architecture and the SCEs architecture, on a Xilinx Zynq UltraScale+ multiprocessor system-on-chip (MPSoC) ZCU-102 evaluation kit. We then evaluate their error, hang, and total failure rates with a proton irradiation test at the Tri-University Meson Facility (TRIUMF). The cross-section results showed that the SCEs design has higher error cross-sections and total failure cross-sections than those of the streaming architecture, even though the SCEs architecture uses far fewer hardware resources in the FPGA. In addition, two resilience techniques for the SCEs architecture, named spatial TMR and temporal TMR, are designed and adopted with the same hardware structure and utilization, by reusing processing elements (PEs) or using multiple PEs to carry out each calculation. As a result, the cross-sections of the spatial TMR and temporal TMR SCEs designs are reduced by 34.9% and 59.2%, with execution time overheads of 14.2% and 21.4% compared with the non-hardened design, respectively. The study thus shows that the SCEs architecture for FPGA acceleration has excellent potential for applications in a radiation environment with minimal overhead, owing to its scalability and flexibility, and that spatial TMR and temporal TMR can effectively reduce the error rate and total failure rate with no extra hardware resources. This suggests that the spatial TMR and temporal TMR proposed in this project are generic to the SCEs architecture and could be a better redundancy choice for complex CNNs implemented under tight hardware resource constraints.
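
    A minimal sketch of the two voting schemes, assuming a processing element can be modelled as a plain Python function: temporal TMR repeats the computation on one PE and votes across time, while spatial TMR votes across three parallel PEs. The real designs vote inside the FPGA fabric, and the figures above come from irradiation testing rather than software voting.

```python
from collections import Counter

def majority_vote(results):
    """Return the 2-of-3 majority among redundant results, or None
    if no majority exists (an unrecoverable disagreement)."""
    value, count = Counter(results).most_common(1)[0]
    return value if count >= 2 else None

def temporal_tmr(pe, operand):
    """Temporal TMR: the *same* processing element repeats the
    computation three times; a transient upset in one pass is
    outvoted, at the cost of extra execution time."""
    return majority_vote([pe(operand) for _ in range(3)])

def spatial_tmr(pes, operand):
    """Spatial TMR: three *different* processing elements compute in
    parallel; an upset confined to one PE is outvoted, with little
    time overhead but three PEs occupied per operation."""
    return majority_vote([pe(operand) for pe in pes])

def flaky_pe_factory():
    """A PE whose first invocation suffers a simulated upset."""
    calls = {"n": 0}
    def pe(x):
        calls["n"] += 1
        return 999 if calls["n"] == 1 else x + 1
    return pe

print(temporal_tmr(flaky_pe_factory(), 41))  # -> 42 despite the upset
```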

    Recent Application in Biometrics

    In recent years, a number of recognition and authentication systems based on biometric measurements have been proposed. Algorithms and sensors have been developed to acquire and process many different biometric traits. Moreover, biometric technology is being used in novel ways, with potential commercial and practical implications for our daily activities. The key objective of the book is to provide a collection of comprehensive references on recent theoretical developments as well as novel applications in biometrics. The topics covered in this book reflect both aspects of development well. They include biometric sample quality, privacy-preserving and cancelable biometrics, contactless biometrics, novel and unconventional biometrics, and the technical challenges of implementing the technology in portable devices. The book consists of 15 chapters. It is divided into four sections, namely biometric applications on mobile platforms, cancelable biometrics, biometric encryption, and other applications. The book was reviewed by editors Dr. Jucheng Yang and Dr. Norman Poh. We deeply appreciate the efforts of our guest editors, Dr. Girija Chetty, Dr. Loris Nanni, Dr. Jianjiang Feng, Dr. Dongsun Park, and Dr. Sook Yoon, as well as a number of anonymous reviewers.

    Applications and Techniques for Fast Machine Learning in Science

    In this community review report, we discuss applications and techniques for fast machine learning (ML) in science: the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. It is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.

    Self-adaptivity of applications on network on chip multiprocessors: the case of fault-tolerant Kahn process networks

    Technology scaling, accompanied by higher operating frequencies and the ability to integrate more functionality in the same chip, has been the driving force behind delivering higher-performance computing systems at lower costs. Embedded computing systems, which have been riding the same wave of success, have evolved into complex architectures encompassing a high number of cores interconnected by an on-chip network (usually identified as a Multiprocessor System-on-Chip). However, these trends are hindered by issues that arise as technology scaling continues towards deep submicron scales. Firstly, the growing complexity of these systems and the variability introduced by process technologies make it ever harder to perform a thorough optimization of the system at design time. Secondly, designers are faced with a reliability wall that emerges as age-related degradation reduces the lifetime of transistors, and as the probability of defects escaping post-manufacturing testing increases. In this thesis, we take on these challenges within the context of streaming applications running on network-on-chip based parallel (not necessarily homogeneous) systems-on-chip that adopt the no-remote-memory-access model. In particular, this thesis tackles two main problems: (1) fault-aware online task remapping, and (2) application-level self-adaptation for quality management. For the former, by viewing fault tolerance as a self-adaptation aspect, we adopt a cross-layer approach that aims at graceful performance degradation by addressing permanent faults in processing elements mostly at the system level, in particular by exploiting the redundancy available in multi-core platforms. We propose an optimal solution based on an integer linear programming formulation (suitable for design-time adoption) as well as heuristic-based solutions to be used at run time, and we assess the impact of our approach on lifetime reliability. We also propose two recovery schemes based on a checkpoint-and-rollback and a roll-forward technique. For the latter, we propose two variants of a monitor-controller-adapter loop that adapts application-level parameters to meet performance goals. We demonstrate not only that fault tolerance and self-adaptivity can be achieved in embedded platforms, but also that they can be achieved without incurring large overheads. In addressing these problems, we present techniques which have been realized (depending on their characteristics) in the form of a design tool, a run-time library, or a hardware core to be added to the basic architecture.
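
    The thesis pairs an ILP-optimal mapping with run-time heuristics; the sketch below is only a greedy illustration of the remapping idea, migrating the tasks of a failed core onto the least-loaded surviving cores. The load model, the greedy policy, and all names are assumptions for illustration, not the thesis's actual formulation.

```python
def remap_on_fault(mapping, task_load, failed_core, alive_cores):
    """Greedy fault-aware remapping sketch: move every task hosted on
    `failed_core` to the currently least-loaded surviving core.
    `mapping` is {task: core}; `task_load` is {task: cost}."""
    load = {core: 0.0 for core in alive_cores}
    for task, core in mapping.items():
        if core in load:
            load[core] += task_load[task]
    new_mapping = dict(mapping)
    # Re-place the orphaned tasks, heaviest first, to balance load.
    orphans = sorted((t for t, c in mapping.items() if c == failed_core),
                     key=lambda t: task_load[t], reverse=True)
    for task in orphans:
        target = min(load, key=load.get)
        new_mapping[task] = target
        load[target] += task_load[task]
    return new_mapping

# Example: core "c1" fails; its tasks migrate to the surviving cores.
mapping = {"t0": "c0", "t1": "c1", "t2": "c1", "t3": "c2"}
loads = {"t0": 3.0, "t1": 2.0, "t2": 1.0, "t3": 1.0}
print(remap_on_fault(mapping, loads, "c1", ["c0", "c2"]))
```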