Search CORE

10 research outputs found

Sustainable Fault-handling Of Reconfigurable Logic Using Throughput-driven Assessment

Author: Sharma Carthik
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2008
Field of study

A sustainable Evolvable Hardware (EH) system is developed for SRAM-based reconfigurable Field Programmable Gate Arrays (FPGAs) using outlier detection and group testing-based assessment principles. The fault diagnosis methods presented herein leverage throughput-driven, relative fitness assessment to maintain resource viability autonomously. Group testing-based techniques are developed for adaptive input-driven fault isolation in FPGAs, without the need for exhaustive testing or coding-based evaluation. The techniques maintain the device operational, and when possible generate validated outputs throughout the repair process. Adaptive fault isolation methods based on discrepancy-enabled pair-wise comparisons are developed. By observing the discrepancy characteristics of multiple Concurrent Error Detection (CED) configurations, a method for robust detection of faults is developed based on pairwise parallel evaluation using Discrepancy Mirror logic. The results from the analytical FPGA model are demonstrated via a self-healing, self-organizing evolvable hardware system. Reconfigurability of the SRAM-based FPGA is leveraged to identify logic resource faults which are successively excluded by group testing using alternate device configurations. This simplifies the system architect\u27s role to definition of functionality using a high-level Hardware Description Language (HDL) and system-level performance versus availability operating point. System availability, throughput, and mean time to isolate faults are monitored and maintained using an Observer-Controller model. Results are demonstrated using a Data Encryption Standard (DES) core that occupies approximately 305 FPGA slices on a Xilinx Virtex-II Pro FPGA. With a single simulated stuck-at-fault, the system identifies a completely validated replacement configuration within three to five positive tests. The approach demonstrates a readily-implemented yet robust organic hardware application framework featuring a high degree of autonomous self-control

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

MFPA: Mixed-Signal Field Programmable Array for Energy-Aware Compressive Signal Processing

Author: Tatulian Adrian
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2020
Field of study

Compressive Sensing (CS) is a signal processing technique which reduces the number of samples taken per frame to decrease energy, storage, and data transmission overheads, as well as reducing time taken for data acquisition in time-critical applications. The tradeoff in such an approach is increased complexity of signal reconstruction. While several algorithms have been developed for CS signal reconstruction, hardware implementation of these algorithms is still an area of active research. Prior work has sought to utilize parallelism available in reconstruction algorithms to minimize hardware overheads; however, such approaches are limited by the underlying limitations in CMOS technology. Herein, the MFPA (Mixed-signal Field Programmable Array) approach is presented as a hybrid spin-CMOS reconfigurable fabric specifically designed for implementation of CS data sampling and signal reconstruction. The resulting fabric consists of 1) slice-organized analog blocks providing amplifiers, transistors, capacitors, and Magnetic Tunnel Junctions (MTJs) which are configurable to achieving square/square root operations required for calculating vector norms, 2) digital functional blocks which feature 6-input clockless lookup tables for computation of matrix inverse, and 3) an MRAM-based nonvolatile crossbar array for carrying out low-energy matrix-vector multiplication operations. The various functional blocks are connected via a global interconnect and spin-based analog-to-digital converters. Simulation results demonstrate significant energy and area benefits compared to equivalent CMOS digital implementations for each of the functional blocks used: this includes an 80% reduction in energy and 97% reduction in transistor count for the nonvolatile crossbar array, 80% standby power reduction and 25% reduced area footprint for the clockless lookup tables, and roughly 97% reduction in transistor count for a multiplier built using components from the analog blocks. Moreover, the proposed fabric yields 77% energy reduction compared to CMOS when used to implement CS reconstruction, in addition to latency improvements

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack

Author: Imran Naveed
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2013
Field of study

Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed which exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling which addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Recon- figurable Slack (FaDReS). Here an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime input. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, use of adaptive techniques are seen to provide several promising avenues for improving resilience. The scheme developed is demonstrated by hardware design of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, Motion Estimation (ME) engine, Finite Impulse Response (FIR) Filter, Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks in addition to MCNC benchmark circuits. A iii significant reduction in power consumption is achieved ranging from 83% for low motion-activity scenes to 12.5% for high motion activity video scenes in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32dB. The diagnosability, reconfiguration latency, and resource overhead of each approach is analyzed. Compared to previous alternatives, PURE maintains a PSNR within a difference of 4.02dB to 6.67dB from the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault-recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Self-Scaling Evolution of Analog Computation Circuits

Author: Pyle Steven
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2015
Field of study

Energy and performance improvements of continuous-time analog-based computation for selected applications offer an avenue to continue improving the computational ability of tomorrow*s electronic devices at current technology scaling limits. However, analog computation is plagued by the difficulty of designing complex computational circuits, programmability, as well as the inherent lack of accuracy and precision when compared to digital implementations. In this thesis, evolutionary algorithm-based techniques are utilized within a reconfigurable analog fabric to realize an automated method of designing analog-based computational circuits while adapting the functional range to improve performance. A Self-Scaling Genetic Algorithm is proposed to adapt solutions to computationally-tractable ranges in hardware-constrained analog reconfigurable fabrics. It operates by utilizing a Particle Swarm Optimization (PSO) algorithm that operates synergistically with a Genetic Algorithm (GA) to adaptively scale and translate the functional range of computational circuits composed of high-level or low-level Computational Analog Elements to improve performance and realize functionality otherwise unobtainable on the intrinsic platform. The technique is demonstrated by evolving square, square-root, cube, and cube-root analog computational circuits on the Cypress PSoC-5LP System-on-Chip. Results indicate that the Self-Scaling Genetic Algorithm improves our error metric on average 7.18-fold, up to 12.92-fold for computational circuits that produce outputs beyond device range. Results were also favorable compared to previous works, which utilized extrinsic evolution of circuits with much greater complexity than was possible on the PSoC-5LP

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Leveraging Signal Transfer Characteristics and Parasitics of Spintronic Circuits for Area and Energy-Optimized Hybrid Digital and Analog Arithmetic

Author: Tatulian Adrian
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2023
Field of study

While Internet of Things (IoT) sensors offer numerous benefits in diverse applications, they are limited by stringent constraints in energy, processing area and memory. These constraints are especially challenging within applications such as Compressive Sensing (CS) and Machine Learning (ML) via Deep Neural Networks (DNNs), which require dot product computations on large data sets. A solution to these challenges has been offered by the development of crossbar array architectures, enabled by recent advances in spintronic devices such as Magnetic Tunnel Junctions (MTJs). Crossbar arrays offer a compact, low-energy and in-memory approach to dot product computation in the analog domain by leveraging intrinsic signal-transfer characteristics of the embedded MTJ devices. The first phase of this dissertation research seeks to build on these benefits by optimizing resource allocation within spintronic crossbar arrays. A hardware approach to non-uniform CS is developed, which dynamically configures sampling rates by deriving necessary control signals using circuit parasitics. Next, an alternate approach to non-uniform CS based on adaptive quantization is developed, which reduces circuit area in addition to energy consumption. Adaptive quantization is then applied to DNNs by developing an architecture allowing for layer-wise quantization based on relative robustness levels. The second phase of this research focuses on extension of the analog computation paradigm by development of an operational amplifier-based arithmetic unit for generalized scalar operations. This approach allows for 95% area reduction in scalar multiplications, compared to the state-of-the-art digital alternative. Moreover, analog computation of enhanced activation functions allows for significant improvement in DNN accuracy, which can be harnessed through triple modular redundancy to yield 81.2% reduction in power at the cost of only 4% accuracy loss, compared to a larger network. Together these results substantiate promising approaches to several challenges facing the design of future IoT sensors within the targeted applications of CS and ML

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Heterogeneous Reconfigurable Fabrics for In-circuit Training and Evaluation of Neuromorphic Architectures

Author: Mohammadizand Ramtin
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/05/2019
Field of study

A heterogeneous device technology reconfigurable logic fabric is proposed which leverages the cooperating advantages of distinct magnetic random access memory (MRAM)-based look-up tables (LUTs) to realize sequential logic circuits, along with conventional SRAM-based LUTs to realize combinational logic paths. The resulting Hybrid Spin/Charge FPGA (HSC-FPGA) using magnetic tunnel junction (MTJ) devices within this topology demonstrates commensurate reductions in area and power consumption over fabrics having LUTs constructed with either individual technology alone. Herein, a hierarchical top-down design approach is used to develop the HSCFPGA starting from the configurable logic block (CLB) and slice structures down to LUT circuits and the corresponding device fabrication paradigms. This facilitates a novel architectural approach to reduce leakage energy, minimize communication occurrence and energy cost by eliminating unnecessary data transfer, and support auto-tuning for resilience. Furthermore, HSC-FPGA enables new advantages of technology co-design which trades off alternative mappings between emerging devices and transistors at runtime by allowing dynamic remapping to adaptively leverage the intrinsic computing features of each device technology. HSC-FPGA offers a platform for fine-grained Logic-In-Memory architectures and runtime adaptive hardware. An orthogonal dimension of fabric heterogeneity is also non-determinism enabled by either low-voltage CMOS or probabilistic emerging devices. It can be realized using probabilistic devices within a reconfigurable network to blend deterministic and probabilistic computational models. Herein, consider the probabilistic spin logic p-bit device as a fabric element comprising a crossbar-structured weighted array. The Programmability of the resistive network interconnecting p-bit devices can be achieved by modifying the resistive states of the array\u27s weighted connections. Thus, the programmable weighted array forms a CLB-scale macro co-processing element with bitstream programmability. This allows field programmability for a wide range of classification problems and recognition tasks to allow fluid mappings of probabilistic and deterministic computing approaches. In particular, a Deep Belief Network (DBN) is implemented in the field using recurrent layers of co-processing elements to form an n x m1 x m2 x ::: x mi weighted array as a configurable hardware circuit with an n-input layer followed by i ≥ 1 hidden layers. As neuromorphic architectures using post-CMOS devices increase in capability and network size, the utility and benefits of reconfigurable fabrics of neuromorphic modules can be anticipated to continue to accelerate

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Scalable Fpga Refurbishment Using Netlist-Driven Evolutionary Algorithms

Author: Ashraf Rizwan A.
Demara Ronald F.
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 22/07/2013
Field of study

In this work, Field-Programmable Gate Array (FPGA) reconfigurability is exploited to realize autonomous fault recovery in mission-critical applications at runtime. The proposed Netlist-Driven Evolutionary Refurbishment technique utilizes design-time information from the circuit netlist to constrain the search space of the algorithm by up to 98.1 percent in terms of the chromosome length representing reconfigurable logic elements. This facilitates refurbishment of relatively large-sized FPGA circuits as compared to previous works. Hence, the scalability issue associated with Evolvable Hardware-Based refurbishment is addressed and improved. Experiments are conducted with multiple circuits from the MCNC benchmark suite to validate the approach and assess its benefits and limitations. Successful refurbishment of the apex4 circuit having a total of 1,252 LUTs with 10 percent spares is achieved in as few as 633 generations on average when subjected to simulated randomly injected single stuck-at faults. Moreover, the use of design-time information about the circuit undergoing refurbishment is validated as means to increase the tractability of dynamic evolvable hardware techniques. © 1968-2012 IEEE

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Scalable FPGA Refurbishment Using Netlist-Driven Evolutionary Algorithms

Author
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2013
Field of study

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Scalable FPGA Refurbishment using Netlist-driven Evolutionary Algorithms

Author: R. F. Demara
Rizwan A. Ashraf
Senior Member
Publication venue
Publication date
Field of study

In this work, Field Programmable Gate Array (FPGA) reconfigurability is exploited to realize autonomous fault recovery in mission-critical applications at runtime. The proposed Netlist Driven Evolutionary Refurbishment (NDER) technique utilizes design-time information from the circuit netlist to constrain the search space of the algorithm by up to 98.1 % in terms of the chromosome length representing reconfigurable logic elements. This facilitates refurbishment of relatively large-sized FPGA circuits as compared to previous works. Hence, the scalability issue associated with Evolvable Hardware based refurbishment is addressed and improved. Experiments are conducted with multiple circuits from the MCNC benchmark suite to validate the approach and assess its benefits and limitations. Successful refurbishment of the apex4 circuit having a total of 1252 LUTs with 10 % spares is achieved in as few as 633 generations on average when subjected to simulated randomly-injected single stuck-at faults. Moreover, the use of design-time information about the circuit undergoing refurbishment is validated as means to increase the tractability of dynamic evolvable hardware techniques

CiteSeerX