258 research outputs found

    H-SIMD machine : configurable parallel computing for data-intensive applications

    Get PDF
    This dissertation presents a hierarchical single-instruction multiple-data (H-SLMD) configurable computing architecture to facilitate the efficient execution of data-intensive applications on field-programmable gate arrays (FPGAs). H-SIMD targets data-intensive applications for FPGA-based system designs. The H-SIMD machine is associated with a hierarchical instruction set architecture (HISA) which is developed for each application. The main objectives of this work are to facilitate ease of program development and high performance through ease of scheduling operations and overlapping communications with computations. The H-SIMD machine is composed of the host, FPGA and nano-processor layers. They execute host SIMD instructions (HSIs), FPGA SIMD instructions (FSIs) and nano-processor instructions (NPLs), respectively. A distinction between communication and computation instructions is intended for all the HISA layers. The H-SIMD machine also employs a memory switching scheme to bridge the omnipresent large bandwidth gaps in configurable systems. To showcase the proposed high-performance approach, the conditions to fully overlap communications with computations are investigated for important applications. The building blocks in the H-SLMD machine, such as high-performance and area-efficient register files, are presented in detail. The H-SLMD machine hierarchy is implemented on a host Dell workstation and the Annapolis Wildstar II FPGA board. Significant speedups have been achieved for matrix multiplication (MM), 2-dimensional discrete cosine transform (2D DCT) and 2-dimensional fast Fourier transform (2D FFT) which are used widely in science and engineering. In another FPGA-based programming paradigm, a high-level language (here ANSI C) can be used to program the FPGAs in a mode similar to that of the H-SIMD machine in terms of trying to minimize the effect of overheads. More specifically, a multi-threaded overlapping scheme is proposed to reduce as much as possible, or even completely hide, runtime FPGA reconfiguration overheads. Nevertheless, although the HLL-enabled reconfigurable machine allows software developers to customize FPGA functions easily, special architecture techniques are needed to achieve high-performance without significant penalty on area and clock frequency. Two important high-performance applications, matrix multiplication and image edge detection, are tested on the SRC-6 reconfigurable machine. The implemented algorithms are able to exploit the available data parallelism with independent functional units and application-specific cache support. Relevant performance and design tradeoffs are analyzed

    Network Interface Design for Network-on-Chip

    Get PDF
    In the culture of globalized integrated circuit (IC, a.k.a chip) production, the use of Intellectual Property (IP) cores, computer aided design tools (CAD) and testing services from un-trusted vendors are prevalent to reduce the time to market. Unfortunately, the globalized business model potentially creates opportunities for hardware tampering and modification from adversary, and this tampering is known as hardware Trojan (HT). Network-on-chip (NoC) has emerged as an efficient on-chip communication infrastructure. In this work, the security aspects of NoC network interface (NI), one of the most critical components in NoC will be investigated and presented. Particularly, the NI design, hardware attack models and countermeasures for NI in a NoC system are explored. An OCP compatible NI is implemented in an IBM0.18ìm CMOS technology. The synthesis results are presented and compared with existing literature. Second, comprehensive hardware attack models targeted for NI are presented from system level to circuit level. The impact of hardware Trojans on NoC functionality and performance are evaluated. Finally, a countermeasure method is proposed to address the hardware attacks in NIs

    Null Convention Logic applications of asynchronous design in nanotechnology and cryptographic security

    Get PDF
    This dissertation presents two Null Convention Logic (NCL) applications of asynchronous logic circuit design in nanotechnology and cryptographic security. The first application is the Asynchronous Nanowire Reconfigurable Crossbar Architecture (ANRCA); the second one is an asynchronous S-Box design for cryptographic system against Side-Channel Attacks (SCA). The following are the contributions of the first application: 1) Proposed a diode- and resistor-based ANRCA (DR-ANRCA). Three configurable logic block (CLB) structures were designed to efficiently reconfigure a given DR-PGMB as one of the 27 arbitrary NCL threshold gates. A hierarchical architecture was also proposed to implement the higher level logic that requires a large number of DR-PGMBs, such as multiple-bit NCL registers. 2) Proposed a memristor look-up-table based ANRCA (MLUT-ANRCA). An equivalent circuit simulation model has been presented in VHDL and simulated in Quartus II. Meanwhile, the comparison between these two ANRCAs have been analyzed numerically. 3) Presented the defect-tolerance and repair strategies for both DR-ANRCA and MLUT-ANRCA. The following are the contributions of the second application: 1) Designed an NCL based S-Box for Advanced Encryption Standard (AES). Functional verification has been done using Modelsim and Field-Programmable Gate Array (FPGA). 2) Implemented two different power analysis attacks on both NCL S-Box and conventional synchronous S-Box. 3) Developed a novel approach based on stochastic logics to enhance the resistance against DPA and CPA attacks. The functionality of the proposed design has been verified using an 8-bit AES S-box design. The effects of decision weight, bitstream length, and input repetition times on error rates have been also studied. Experimental results shows that the proposed approach enhances the resistance to against the CPA attack by successfully protecting the hidden key --Abstract, page iii

    inSense: A Variation and Fault Tolerant Architecture for Nanoscale Devices

    Get PDF
    Transistor technology scaling has been the driving force in improving the size, speed, and power consumption of digital systems. As devices approach atomic size, however, their reliability and performance are increasingly compromised due to reduced noise margins, difficulties in fabrication, and emergent nano-scale phenomena. Scaled CMOS devices, in particular, suffer from process variations such as random dopant fluctuation (RDF) and line edge roughness (LER), transistor degradation mechanisms such as negative-bias temperature instability (NBTI) and hot-carrier injection (HCI), and increased sensitivity to single event upsets (SEUs). Consequently, future devices may exhibit reduced performance, diminished lifetimes, and poor reliability. This research proposes a variation and fault tolerant architecture, the inSense architecture, as a circuit-level solution to the problems induced by the aforementioned phenomena. The inSense architecture entails augmenting circuits with introspective and sensory capabilities which are able to dynamically detect and compensate for process variations, transistor degradation, and soft errors. This approach creates ``smart\u27\u27 circuits able to function despite the use of unreliable devices and is applicable to current CMOS technology as well as next-generation devices using new materials and structures. Furthermore, this work presents an automated prototype implementation of the inSense architecture targeted to CMOS devices and is evaluated via implementation in ISCAS \u2785 benchmark circuits. The automated prototype implementation is functionally verified and characterized: it is found that error detection capability (with error windows from ≈\approx30-400ps) can be added for less than 2\% area overhead for circuits of non-trivial complexity. Single event transient (SET) detection capability (configurable with target set-points) is found to be functional, although it generally tracks the standard DMR implementation with respect to overheads

    Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 2: Army fault tolerant architecture design and analysis

    Get PDF
    Described here is the Army Fault Tolerant Architecture (AFTA) hardware architecture and components and the operating system. The architectural and operational theory of the AFTA Fault Tolerant Data Bus is discussed. The test and maintenance strategy developed for use in fielded AFTA installations is presented. An approach to be used in reducing the probability of AFTA failure due to common mode faults is described. Analytical models for AFTA performance, reliability, availability, life cycle cost, weight, power, and volume are developed. An approach is presented for using VHSIC Hardware Description Language (VHDL) to describe and design AFTA's developmental hardware. A plan is described for verifying and validating key AFTA concepts during the Dem/Val phase. Analytical models and partial mission requirements are used to generate AFTA configurations for the TF/TA/NOE and Ground Vehicle missions

    ReDO: Cross-Layer Multi-Objective Design-Exploration Framework for Efficient Soft Error Resilient Systems

    Get PDF
    Designing soft errors resilient systems is a complex engineering task, which nowadays follows a cross-layer approach. It requires a careful planning for different fault-tolerance mechanisms at different system's layers: starting from the technology up to the software domain. While these design decisions have a positive effect on the reliability of the system, they usually have a detrimental effect on its size, power consumption, performance and cost. Design space exploration for cross-layer reliability is therefore a multi-objective search problem in which reliability must be traded-off with other design dimensions. This paper proposes a cross-layer multi-objective design space exploration algorithm developed to help designers when building soft error resilient electronic systems. The algorithm exploits a system-level Bayesian reliability estimation model to analyze the effect of different cross-layer combinations of protection mechanisms on the reliability of the full system. A new heuristic based on the extremal optimization theory is used to efficiently explore the design space. An extended set of simulations shows the capability of this framework when applied both to benchmark applications and realistic systems, providing optimized systems that outperform those obtained by applying state-of-the-art cross-layer reliability techniques

    Reconfigurable microarchitectures at the programmable logic interface

    Get PDF

    The detector control system of the ATLAS experiment

    Get PDF
    The ATLAS experiment is one of the experiments at the Large Hadron Collider, constructed to study elementary particle interactions in collisions of high-energy proton beams. The individual detector components as well as the common experimental infrastructure are supervised by the Detector Control System (DCS). The DCS enables equipment supervision using operator commands, reads, processes and archives the operational parameters of the detector, allows for error recognition and handling, manages the communication with external control systems, and provides a synchronization mechanism with the physics data acquisition system. Given the enormous size and complexity of ATLAS, special emphasis was put on the use of standardized hardware and software components enabling efficient development and long-term maintainability of the DCS over the lifetime of the experiment. Currently, the DCS is being used successfully during the experiment commissioning phase

    Verification of the StarT-voyager bus interface units

    Get PDF
    Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998.Includes bibliographical references (p. 64).by Wing-Chung Ho.M.S
    • …
    corecore