347 research outputs found

    Analysis and Test of the Effects of Single Event Upsets Affecting the Configuration Memory of SRAM-based FPGAs

    Get PDF
    SRAM-based FPGAs are increasingly relevant in a growing number of safety-critical application fields, ranging from automotive to aerospace. These application fields are characterized by a harsh radiation environment that can cause the occurrence of Single Event Upsets (SEUs) in digital devices. These faults have particularly adverse effects on SRAM-based FPGA systems because not only can they temporarily affect the behaviour of the system by changing the contents of flip-flops or memories, but they can also permanently change the functionality implemented by the system itself, by changing the content of the configuration memory. Designing safety-critical applications requires accurate methodologies to evaluate the system’s sensitivity to SEUs as early as possible during the design process. Moreover it is necessary to detect the occurrence of SEUs during the system life-time. To this purpose test patterns should be generated during the design process, and then applied to the inputs of the system during its operation. In this thesis we propose a set of software tools that could be used by designers of SRAM-based FPGA safety-critical applications to assess the sensitivity to SEUs of the system and to generate test patterns for in-service testing. The main feature of these tools is that they implement a model of SEUs affecting the configuration bits controlling the logic and routing resources of an FPGA device that has been demonstrated to be much more accurate than the classical stuck-at and open/short models, that are commonly used in the analysis of faults in digital devices. By keeping this accurate fault model into account, the proposed tools are more accurate than similar academic and commercial tools today available for the analysis of faults in digital circuits, that do not take into account the features of the FPGA technology.. In particular three tools have been designed and developed: (i) ASSESS: Accurate Simulator of SEuS affecting the configuration memory of SRAM-based FPGAs, a simulator of SEUs affecting the configuration memory of an SRAM-based FPGA system for the early assessment of the sensitivity to SEUs; (ii) UA2TPG: Untestability Analyzer and Automatic Test Pattern Generator for SEUs Affecting the Configuration Memory of SRAM-based FPGAs, a static analysis tool for the identification of the untestable SEUs and for the automatic generation of test patterns for in-service testing of the 100% of the testable SEUs; and (iii) GABES: Genetic Algorithm Based Environment for SEU Testing in SRAM-FPGAs, a Genetic Algorithm-based Environment for the generation of an optimized set of test patterns for in-service testing of SEUs. The proposed tools have been applied to some circuits from the ITC’99 benchmark. The results obtained from these experiments have been compared with results obtained by similar experiments in which we considered the stuck-at fault model, instead of the more accurate model for SEUs. From the comparison of these experiments we have been able to verify that the proposed software tools are actually more accurate than similar tools today available. In particular the comparison between results obtained using ASSESS with those obtained by fault injection has shown that the proposed fault simulator has an average error of 0:1% and a maximum error of 0:5%, while using a stuck-at fault simulator the average error with respect of the fault injection experiment has been 15:1% with a maximum error of 56:2%. Similarly the comparison between the results obtained using UA2TPG for the accurate SEU model, with the results obtained for stuck-at faults has shown an average difference of untestability of 7:9% with a maximum of 37:4%. Finally the comparison between fault coverages obtained by test patterns generated for the accurate model of SEUs and the fault coverages obtained by test pattern designed for stuck-at faults, shows that the former detect the 100% of the testable faults, while the latter reach an average fault coverage of 78:9%, with a minimum of 54% and a maximum of 93:16%

    Intrinsically Evolvable Artificial Neural Networks

    Get PDF
    Dedicated hardware implementations of neural networks promise to provide faster, lower power operation when compared to software implementations executing on processors. Unfortunately, most custom hardware implementations do not support intrinsic training of these networks on-chip. The training is typically done using offline software simulations and the obtained network is synthesized and targeted to the hardware offline. The FPGA design presented here facilitates on-chip intrinsic training of artificial neural networks. Block-based neural networks (BbNN), the type of artificial neural networks implemented here, are grid-based networks neuron blocks. These networks are trained using genetic algorithms to simultaneously optimize the network structure and the internal synaptic parameters. The design supports online structure and parameter updates, and is an intrinsically evolvable BbNN platform supporting functional-level hardware evolution. Functional-level evolvable hardware (EHW) uses evolutionary algorithms to evolve interconnections and internal parameters of functional modules in reconfigurable computing systems such as FPGAs. Functional modules can be any hardware modules such as multipliers, adders, and trigonometric functions. In the implementation presented, the functional module is a neuron block. The designed platform is suitable for applications in dynamic environments, and can be adapted and retrained online. The online training capability has been demonstrated using a case study. A performance characterization model for RC implementations of BbNNs has also been presented

    FPGA acceleration of sequence analysis tools in bioinformatics

    Full text link
    Thesis (Ph.D.)--Boston UniversityWith advances in biotechnology and computing power, biological data are being produced at an exceptional rate. The purpose of this study is to analyze the application of FPGAs to accelerate high impact production biosequence analysis tools. Compared with other alternatives, FPGAs offer huge compute power, lower power consumption, and reasonable flexibility. BLAST has become the de facto standard in bioinformatic approximate string matching and so its acceleration is of fundamental importance. It is a complex highly-optimized system, consisting of tens of thousands of lines of code and a large number of heuristics. Our idea is to emulate the main phases of its algorithm on FPGA. Utilizing our FPGA engine, we quickly reduce the size of the database to a small fraction, and then use the original code to process the query. Using a standard FPGA-based system, we achieved 12x speedup over a highly optimized multithread reference code. Multiple Sequence Alignment (MSA)--the extension of pairwise Sequence Alignment to multiple Sequences--is critical to solve many biological problems. Previous attempts to accelerate Clustal-W, the most commonly used MSA code, have directly mapped a portion of the code to the FPGA. We use a new approach: we apply prefiltering of the kind commonly used in BLAST to perform the initial all-pairs alignments. This results in a speedup of from 8Ox to 190x over the CPU code (8 cores). The quality is comparable to the original according to a commonly used benchmark suite evaluated with respect to multiple distance metrics. The challenge in FPGA-based acceleration is finding a suitable application mapping. Unfortunately many software heuristics do not fall into this category and so other methods must be applied. One is restructuring: an entirely new algorithm is applied. Another is to analyze application utilization and develop accuracy/performance tradeoffs. Using our prefiltering approach and novel FPGA programming models we have achieved significant speedup over reference programs. We have applied approximation, seeding, and filtering to this end. The bulk of this study is to introduce the pros and cons of these acceleration models for biosequence analysis tools

    UA2TPG: An untestability analyzer and test pattern generator for SEUs in the configuration memory of SRAM-based FPGAs

    Get PDF
    This paper presents UA2TPG, a static analysis tool for the untestability proof and automatic test pattern generation for SEUs in the configuration memory of SRAM-based FPGA systems. The tool is based on the model-checking verification technique. An accurate fault model for both logic components and routing structures is adopted. Experimental results show that many circuits have a significant number of untestable faults, and their detection enables more efficient test pattern generation and on-line testing. The tool is mainly intended to support on-line testing of critical components in FPGA fault-tolerant systems

    Computer vision algorithms on reconfigurable logic arrays

    Full text link

    A genetic parallel programming based logic circuit synthesizer.

    Get PDF
    Lau, Wai Shing.Thesis submitted in: November 2006.Thesis (M.Phil.)--Chinese University of Hong Kong, 2007.Includes bibliographical references (leaves 85-94).Abstracts in English and Chinese.Abstract --- p.iAcknowledgement --- p.ivChapter 1 --- Introduction --- p.1Chapter 1.1 --- Field Programmable Gate Arrays --- p.2Chapter 1.2 --- FPGA technology mapping problem --- p.3Chapter 1.3 --- Motivations --- p.5Chapter 1.4 --- Contributions --- p.6Chapter 1.5 --- Thesis Organization --- p.9Chapter 2 --- Background Study --- p.11Chapter 2.1 --- Deterministic approach to technology mapping problem --- p.11Chapter 2.1.1 --- FlowMap --- p.12Chapter 2.1.2 --- DAOMap --- p.14Chapter 2.2 --- Stochastic approach --- p.15Chapter 2.2.1 --- Bio-Inspired Methods for Multi-Level Combinational Logic Circuit Design --- p.15Chapter 2.2.2 --- A Survey of Combinational Logic Circuit Representations in stochastic algorithms --- p.17Chapter 2.3 --- Genetic Parallel Programming --- p.20Chapter 2.3.1 --- Accelerating Phenomenon --- p.22Chapter 2.4 --- Chapter Summary --- p.23Chapter 3 --- A GPP based Logic Circuit Synthesizer --- p.24Chapter 3.1 --- Overall system architecture --- p.25Chapter 3.2 --- Multi-Logic-Unit Processor --- p.26Chapter 3.3 --- The Genotype of a MLP program --- p.28Chapter 3.4 --- The Phenotype of a MLP program --- p.31Chapter 3.5 --- The Evolution Engine --- p.33Chapter 3.5.1 --- The Dual-Phase Approach --- p.33Chapter 3.5.2 --- Genetic operators --- p.35Chapter 3.6 --- Chapter Summary --- p.38Chapter 4 --- MLP in hardware --- p.39Chapter 4.1 --- Motivation --- p.39Chapter 4.2 --- Hardware Design and Implementation --- p.40Chapter 4.3 --- Experimental Settings --- p.43Chapter 4.4 --- Experimental Results and Evaluations --- p.46Chapter 4.5 --- Chapter Summary --- p.50Chapter 5 --- Feasibility Study of Multi MLPs --- p.51Chapter 5.1 --- Motivation --- p.52Chapter 5.2 --- Overall Architecture --- p.53Chapter 5.3 --- Experimental settings --- p.55Chapter 5.4 --- Experimental results and evaluations --- p.59Chapter 5.5 --- Chapter Summary --- p.59Chapter 6 --- A Hybridized GPPLCS --- p.61Chapter 6.1 --- Motivation --- p.62Chapter 6.2 --- Overall system architecture --- p.62Chapter 6.3 --- Experimental settings --- p.64Chapter 6.4 --- Experimental results and evaluations --- p.66Chapter 6.5 --- Chapter Summary --- p.70Chapter 7 --- A Memetic GPPLCS --- p.71Chapter 7.1 --- Motivation --- p.72Chapter 7.2 --- Overall system architecture --- p.72Chapter 7.3 --- Experimental settings --- p.76Chapter 7.4 --- Experimental results and evaluations --- p.77Chapter 7.5 --- Chapter Summary --- p.80Chapter 8 --- Conclusion --- p.82Chapter 8.1 --- Future work --- p.83Bibliography --- p.8

    Swarm Intelligence for Digital Circuits Implementation on Field Programmable Gate Arrays Platforms

    Get PDF
    Field programmable gate arrays (FPGAs) are becoming increasingly important implementation platforms for digital circuits. One of the necessary requirements to effectively utilize the FPGA\u27s resources is an efficient placement and routing mechanism. This paper presents an optimization technique based on swarm intelligence for FPGA placement and routing. Mentor graphics technology mapping netlist file is used to generate initial FPGA placements and routings which are then optimized by particle swarm optimization (PSO). Results for the implementation of a binary coded decimal bidirectional counter and an arithmetic logic unit on a Xilinx FPGA show that PSO is a potential technique for solving the placement and routing problem

    Hardware acceleration of the trace transform for vision applications

    Get PDF
    Computer Vision is a rapidly developing field in which machines process visual data to extract meaningful information. Digitised images in their pixels and bits serve no purpose of their own. It is only by interpreting the data, and extracting higher level information that a scene can be understood. The algorithms that enable this process are often complex, and data-intensive, limiting the processing rate when implemented in software. Hardware-accelerated implementations provide a significant performance boost that can enable real- time processing. The Trace Transform is a newly proposed algorithm that has been proven effective in image categorisation and recognition tasks. It is flexibly defined allowing the mathematical details to be tailored to the target application. However, it is highly computationally intensive, which limits its applications. Modern heterogeneous FPGAs provide an ideal platform for accelerating the Trace transform for real-time performance, while also allowing an element of flexibility, which highly suits the generality of the Trace transform. This thesis details the implementation of an extensible Trace transform architecture for vision applications, before extending this architecture to a full flexible platform suited to the exploration of Trace transform applications. As part of the work presented, a general set of architectures for large-windowed median and weighted median filters are presented as required for a number of Trace transform implementations. Finally an acceleration of Pseudo 2-Dimensional Hidden Markov Model decoding, usable in a person detection system, is presented. Such a system can be used to extract frames of interest from a video sequence, to be subsequently processed by the Trace transform. All these architectures emphasise the need for considered, platform-driven design in achieving maximum performance through hardware acceleration

    MFPA: Mixed-Signal Field Programmable Array for Energy-Aware Compressive Signal Processing

    Get PDF
    Compressive Sensing (CS) is a signal processing technique which reduces the number of samples taken per frame to decrease energy, storage, and data transmission overheads, as well as reducing time taken for data acquisition in time-critical applications. The tradeoff in such an approach is increased complexity of signal reconstruction. While several algorithms have been developed for CS signal reconstruction, hardware implementation of these algorithms is still an area of active research. Prior work has sought to utilize parallelism available in reconstruction algorithms to minimize hardware overheads; however, such approaches are limited by the underlying limitations in CMOS technology. Herein, the MFPA (Mixed-signal Field Programmable Array) approach is presented as a hybrid spin-CMOS reconfigurable fabric specifically designed for implementation of CS data sampling and signal reconstruction. The resulting fabric consists of 1) slice-organized analog blocks providing amplifiers, transistors, capacitors, and Magnetic Tunnel Junctions (MTJs) which are configurable to achieving square/square root operations required for calculating vector norms, 2) digital functional blocks which feature 6-input clockless lookup tables for computation of matrix inverse, and 3) an MRAM-based nonvolatile crossbar array for carrying out low-energy matrix-vector multiplication operations. The various functional blocks are connected via a global interconnect and spin-based analog-to-digital converters. Simulation results demonstrate significant energy and area benefits compared to equivalent CMOS digital implementations for each of the functional blocks used: this includes an 80% reduction in energy and 97% reduction in transistor count for the nonvolatile crossbar array, 80% standby power reduction and 25% reduced area footprint for the clockless lookup tables, and roughly 97% reduction in transistor count for a multiplier built using components from the analog blocks. Moreover, the proposed fabric yields 77% energy reduction compared to CMOS when used to implement CS reconstruction, in addition to latency improvements

    Applications of reprogrammability in algorithm acceleration

    Get PDF
    This doctoral thesis consists of an introductory part and eight appended publications, which deal with hardware-based reprogrammability in algorithm acceleration with a specific emphasis on the possibilities offered by modern large-scale Field Programmable Gate Arrays (FPGAs) in computationally demanding applications. The historical evolution of both the theoretical and technological paths culminating in the introduction of reprogrammable logic devices is first outlined. This is followed by defining the commonly used terms in the thesis. The reprogrammable logic market is surveyed, and the architectural structures and the technological reasonings behind them are described in detail. As reprogrammable logic lies between Application Specific Integrated Circuits (ASICs) and general-purpose microprocessors in the implementation spectrum of electronics systems, special attention has been paid to differentiate these three implementation approaches. This has been done to emphasize, that reprogrammable logic offers much more than just a low-volume replacement for ASICs. Design systems for reprogrammable logic are investigated, as the learning curve associated with them is the main hurdle for software-oriented designers for using reprogrammable logic devices. The theoretically important topic of partial reprogrammability is described in detail, but it is concluded, that the practical problems in designing viable development platforms for partially reprogrammable systems will hinder its wide-spread adoption. The main technical, design-oriented, and economic applicability factors of reprogrammable logic are laid out. The main advantages of reprogrammable logic are their suitability for fine-grained bit-level parallelizable computing with a short time-to-market and low upfront costs. It is also concluded, that the main opportunities for reprogrammable logic lie in the potential of high-level design systems, and the ever-growing ASIC design gap. On the other hand, most power-conscious mass-market portable products do not seem to offer major new market potential for reprogrammable logic. The appended publications are examined and compared to contemporaneous research at other research institutions. The conclusion is that for relatively wide classes of well-defined computation problems, reprogrammable logic offers a more efficient solution than a software-centered approach, with a much shorter production cycle than is the case with ASICs.reviewe
    corecore