
    A comprehensive comparison between design for testability techniques for total dose testing of flash-based FPGAs

    Radiation sources exist in many environments where electronic devices operate, and radiation usually degrades correct device operation. Its effects manifest in several forms depending on the operating environment, such as the total ionizing dose (TID) effect, or single event effects (SEEs) like single event upset (SEU), single event gate rupture (SEGR), and single event latch-up (SEL). CMOS and floating-gate MOS circuits suffer increased delay and leakage current due to the TID effect, which can compromise the proper operation of an integrated circuit. Exhaustive testing is needed for devices operating in harsh conditions, such as space and military applications, to ensure correct operation under worst-case circumstances. The use of worst case test vectors (WCTVs) for testing is strongly recommended by MIL-STD-883, method 1019, the standard describing the procedure for testing electronic devices under radiation. However, the difficulty of generating these test vectors hinders their use in radiation testing. Digital circuits in industry are nowadays usually tested using design for testability (DFT) techniques, as these are very mature and can be relied on. DFT techniques include, but are not limited to, ad-hoc techniques, built-in self test (BIST), muxed-D scan, clocked scan, and enhanced scan. DFT is usually combined with automatic test pattern generation (ATPG) software to generate test vectors for application specific integrated circuits (ASICs), especially sequential circuits, against faults such as stuck-at faults and path delay faults. Despite these recommendations, radiation testing has not yet benefited from this reliable technology. Moreover, given the wide variation among DFT techniques, choosing the right technique is the bottleneck in achieving the best results for TID testing. In this thesis, a comprehensive comparison between different DFT techniques for TID testing of flash-based FPGAs is made to help designers choose the technique best suited to their application. The comparison covers the muxed-D scan, clocked scan, and enhanced scan techniques, and is carried out using the ISCAS-89 benchmark circuits. Points of comparison include FPGA resource utilization, difficulty of design bring-up, delay added by the DFT logic, and robust testable paths in each technique
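    For readers unfamiliar with the scan styles being compared, a minimal behavioural sketch of a muxed-D scan flip-flop (written in Python rather than an HDL, with invented signal and class names) illustrates the basic idea assumed throughout: in functional mode the cell acts as an ordinary D flip-flop, while in scan mode the cells chain into a shift register through which test vectors are loaded.

```python
class MuxedDScanFF:
    """Behavioural sketch of a muxed-D scan flip-flop (illustrative only)."""

    def __init__(self):
        self.q = 0  # flip-flop state / output

    def clock(self, d, scan_in, scan_enable):
        """On a clock edge, capture scan_in when scan_enable is high, else d."""
        self.q = scan_in if scan_enable else d
        return self.q


def shift_in_vector(chain, vector):
    """Shift a test vector into a scan chain, one bit per clock edge."""
    for bit in vector:
        prev = [ff.q for ff in chain]          # sample all outputs before the edge
        chain[0].clock(d=0, scan_in=bit, scan_enable=1)
        for i in range(1, len(chain)):
            chain[i].clock(d=0, scan_in=prev[i - 1], scan_enable=1)


chain = [MuxedDScanFF() for _ in range(4)]
shift_in_vector(chain, [1, 0, 1, 1])
print([ff.q for ff in chain])  # chain contents after four shift clocks
```

    Clocked scan and enhanced scan differ mainly in how the scan path is clocked and in the extra shadow storage that enhanced scan adds for delay testing; the mux-before-D structure above is the simplest of the three.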

    High level behavioural modelling of boundary scan architecture.

    This project involves the development of a software tool which automatically integrates the IEEE 1149.1/JTAG Boundary Scan Test Architecture into an ASIC (Application Specific Integrated Circuit) design. The tool requires the original design (the ASIC) to be described in the VHDL-IEEE 1076 Hardware Description Language. The tool consists of two major elements: i) a parsing and insertion algorithm developed and implemented in 'C'; ii) a high level model of the Boundary Scan Test Architecture implemented in 'VHDL'. The parsing and insertion algorithm identifies the design's input/output (I/O) terminals, their types, and the order in which they appear in the ASIC design. It then attaches a suitable boundary scan cell to each I/O, except power and ground, and inserts the high level models of the full Boundary Scan Architecture into the ASIC without altering the design core structure
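    To make the inserted structure concrete, here is a behavioural Python sketch of a single boundary scan cell, reduced to the CaptureDR, ShiftDR, and UpdateDR operations defined by the IEEE 1149.1 TAP state machine; the class and method names are invented for illustration and simplify the real cell designs.

```python
class BoundaryScanCell:
    """Behavioural sketch of an IEEE 1149.1 boundary scan cell (simplified)."""

    def __init__(self):
        self.capture_ff = 0    # capture/shift stage on the scan path
        self.update_latch = 0  # update stage that drives the output mux

    def capture(self, pin_value):
        """CaptureDR: sample the functional signal into the shift stage."""
        self.capture_ff = pin_value

    def shift(self, scan_in):
        """ShiftDR: shift one bit; returns the bit pushed out toward TDO."""
        out = self.capture_ff
        self.capture_ff = scan_in
        return out

    def update(self):
        """UpdateDR: transfer the shifted-in bit to the update latch."""
        self.update_latch = self.capture_ff

    def output(self, pin_value, mode):
        """mode=0: pass the functional signal; mode=1: drive the test value."""
        return self.update_latch if mode else pin_value


cell = BoundaryScanCell()
cell.capture(1)            # sample the pin
bit_out = cell.shift(0)    # shift the sample out, shift a test bit in
cell.update()              # latch the test bit
print(bit_out, cell.output(pin_value=1, mode=1))  # -> 1 0
```

    The insertion tool's job is essentially to instantiate one such cell per functional I/O (skipping power and ground) and to stitch their scan paths into a ring between TDI and TDO.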

    The Efficient Design of Time-to-Digital Converters


    Testing of leakage current failure in ASIC devices exposed to total ionizing dose environment using design for testability techniques

    Due to advancements in technology, electronic devices are relied upon to operate under harsh conditions. Radiation is one of the main causes of failure of electronic devices. Depending on the operating environment, the radiation sources can be terrestrial or extraterrestrial. Terrestrially, devices may be used in nuclear reactors or biomedical applications, where the radiation is man-made; extraterrestrially, devices may be used in satellites, the International Space Station, or spacecraft, where the radiation comes from sources such as the Sun. The effects of radiation differ with the operating environment and fall under two categories: total ionizing dose (TID) effects and single event effects (SEEs). TID effects can negatively affect the delay and leakage current of CMOS circuits and can therefore hinder the integrated circuits' operation. Before such circuits are used, particularly in radiation-heavy critical applications such as military and space, they must be tested under radiation to avoid failures during operation. The standard approach to testing electronic devices is to generate worst case test vectors (WCTVs) and test the circuits under radiation using these vectors. However, generating WCTVs has been very challenging, so this approach is rarely used for TID effects. Design for testability (DFT) has been widely used in industry for digital circuit testing. DFT is usually used with automatic test pattern generation software to generate test vectors against fault models of manufacturing defects in application specific integrated circuits (ASICs). However, it has never been used to generate test vectors for leakage current testing induced in ASICs exposed to a TID radiation environment. The purpose of this thesis is to use DFT to identify WCTVs for leakage current failures in sequential circuits of ASIC devices exposed to TID. A novel methodology was devised to identify these test vectors. The methodology is validated and compared against previous non-DFT methods, and is shown to overcome their limitations
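    The abstract does not detail the thesis's actual methodology, but the search loop it implies can be sketched. The following Python fragment is purely illustrative: `load_via_scan` and `measure_leakage` are hypothetical stand-ins for scan-chain loading on a tester and a quiescent-current (IDDQ-style) measurement, and the candidate vectors would in practice come from ATPG.

```python
# Illustrative sketch only: the helpers below are hypothetical stand-ins
# for automatic test equipment / bench instrumentation.

def load_via_scan(device, vector):
    """Hypothetical: shift `vector` into the device's scan chain."""
    device["state"] = vector

def measure_leakage(device):
    """Hypothetical: return quiescent supply current for the loaded state."""
    return sum(device["state"])  # placeholder for a real current measurement

def find_worst_case_vector(device, candidate_vectors):
    """Keep the candidate state that maximises quiescent (leakage) current."""
    worst_vector, worst_current = None, float("-inf")
    for vector in candidate_vectors:
        load_via_scan(device, vector)
        current = measure_leakage(device)
        if current > worst_current:
            worst_vector, worst_current = vector, current
    return worst_vector, worst_current

device = {"state": []}
candidates = [[0, 1, 0, 1], [1, 1, 1, 0], [0, 0, 1, 1]]
print(find_worst_case_vector(device, candidates))
```

    The value of scan-based DFT here is that it makes every internal flip-flop state directly controllable, so worst-case internal states can be applied without solving a sequential justification problem.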

    Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 2: Army fault tolerant architecture design and analysis

    Described here are the Army Fault Tolerant Architecture (AFTA) hardware architecture and components, and the operating system. The architectural and operational theory of the AFTA Fault Tolerant Data Bus is discussed. The test and maintenance strategy developed for use in fielded AFTA installations is presented. An approach to be used in reducing the probability of AFTA failure due to common mode faults is described. Analytical models for AFTA performance, reliability, availability, life cycle cost, weight, power, and volume are developed. An approach is presented for using VHSIC Hardware Description Language (VHDL) to describe and design AFTA's developmental hardware. A plan is described for verifying and validating key AFTA concepts during the Dem/Val phase. Analytical models and partial mission requirements are used to generate AFTA configurations for the TF/TA/NOE and Ground Vehicle missions
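    AFTA's own analytical models are not reproduced in the abstract; as a generic illustration of the kind of reliability arithmetic such studies rest on (not AFTA's actual models), the classic majority-voting calculation below shows how redundancy trades hardware for mission reliability, assuming exponentially distributed channel failures and a perfect voter.

```python
import math

def simplex_reliability(failure_rate, hours):
    """Reliability of a single channel with a constant failure rate."""
    return math.exp(-failure_rate * hours)

def tmr_reliability(failure_rate, hours):
    """Majority-voted triplex: survives if at least 2 of 3 channels survive.
    R_TMR = 3R^2 - 2R^3 (perfect voter assumed)."""
    r = simplex_reliability(failure_rate, hours)
    return 3 * r**2 - 2 * r**3

lam = 1e-4  # assumed failures per hour per channel (invented figure)
for t in (10, 100, 1000):
    print(t, simplex_reliability(lam, t), tmr_reliability(lam, t))
```

    For short missions the voted configuration is far more reliable than a single channel, while for very long missions the advantage inverts; this is why mission-time assumptions drive redundant-architecture sizing.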

    Energy-Efficient Recurrent Neural Network Accelerators for Real-Time Inference

    Over the past decade, Deep Learning (DL) and Deep Neural Networks (DNNs) have gone through rapid development. They are now applied to a vast range of applications and have profoundly changed human life. As an essential class of DNN, Recurrent Neural Networks (RNNs) are well suited to processing time-sequential data and are widely used in applications such as speech recognition and machine translation. RNNs are difficult to compute because of their massive arithmetic operations and large memory footprint. RNN inference workloads used to be executed on conventional general-purpose processors, including Central Processing Units (CPUs) and Graphics Processing Units (GPUs); however, these contain hardware blocks that are unnecessary for RNN computation, such as the branch predictor and caching system, making them suboptimal for RNN processing. To accelerate RNN computation beyond the performance of conventional processors, previous work focused on optimization methods in both software and hardware. On the software side, previous works mainly used model compression to reduce the memory footprint and the arithmetic operations of RNNs. On the hardware side, previous works designed domain-specific hardware accelerators based on Field Programmable Gate Arrays (FPGAs) or Application Specific Integrated Circuits (ASICs), with customized hardware pipelines optimized for efficient processing of RNNs. By following this software-hardware co-design strategy, previous works achieved at least 10X speedup over conventional processors. Many previous works focused on achieving high throughput with a large batch of input streams. However, in real-time applications such as gaming Artificial Intelligence (AI) and dynamical system control, low latency is more critical. Moreover, there is a trend of offloading neural network workloads to edge devices to provide a better user experience and privacy protection. Edge devices, such as mobile phones and wearable devices, are usually resource-constrained with a tight power budget. They require RNN hardware that is more energy-efficient, to realize both low-latency inference and long battery life. Brain neurons exhibit sparsity in both the spatial domain and the time domain. Inspired by this, previous work mainly explored model compression to induce spatial sparsity in RNNs. The delta network algorithm instead induces temporal sparsity in RNNs and, as previous works have shown, can save over 10X arithmetic operations. In this work, we have proposed customized hardware accelerators that exploit temporal sparsity in Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) RNNs to achieve energy-efficient real-time RNN inference. First, we proposed DeltaRNN, the first RNN accelerator to exploit temporal sparsity in GRU-RNNs. DeltaRNN achieved 1.2 TOp/s effective throughput with a batch size of 1, which is 15X higher than related works. Second, we designed EdgeDRNN to accelerate GRU-RNN edge inference. Compared to DeltaRNN, EdgeDRNN does not rely on on-chip memory to store RNN weights and focuses on reducing off-chip Dynamic Random Access Memory (DRAM) data traffic using a more scalable architecture. EdgeDRNN realized real-time inference of large GRU-RNNs with sub-millisecond latency and only 2.3 W wall-plug power consumption, achieving 4X higher energy efficiency than commercial edge AI platforms such as the NVIDIA Jetson Nano. Third, we used DeltaRNN to realize the first continuous speech recognition system with the Dynamic Audio Sensor (DAS) as the front-end. The DAS is a neuromorphic event-driven sensor that produces a stream of asynchronous events instead of audio data sampled at a fixed sample rate. We also showcased how an RNN accelerator can be integrated with an event-driven sensor on the same chip to realize ultra-low-power Keyword Spotting (KWS) on the extreme edge. Fourth, we used EdgeDRNN to control a powered robotic prosthesis, replacing a conventional proportional-derivative (PD) controller with an RNN controller. EdgeDRNN achieved 21 μs latency running the RNN controller and maintained stable control of the prosthesis. These applications of DeltaRNN and EdgeDRNN demonstrate their value in solving real-world problems. Finally, we applied the delta network algorithm to LSTM-RNNs and combined it with a customized structured pruning method, called Column-Balanced Targeted Dropout (CBTD), to induce spatio-temporal sparsity in LSTM-RNNs. We then proposed another FPGA-based accelerator called Spartus, the first RNN accelerator to exploit spatio-temporal sparsity. Spartus achieved 9.4 TOp/s effective throughput with a batch size of 1, the highest among current FPGA-based RNN accelerators with a power budget of around 10 W, and can complete the inference of an LSTM layer with 5 million parameters within 1 μs
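    For readers unfamiliar with the delta network algorithm these accelerators build on, a minimal NumPy sketch of the delta update for a single matrix-vector product (the core kernel of GRU and LSTM gates) shows where the savings come from. The threshold, dimensions, and names here are invented for illustration.

```python
import numpy as np

def delta_matvec(W, x, x_prev_ref, y_accum, theta):
    """One delta-network update for y = W @ x (illustrative sketch).

    Components of x that changed by less than `theta` since their last
    propagated value are skipped entirely, saving their column's
    multiply-accumulates; larger changes update the running accumulator.
    """
    delta = x - x_prev_ref
    active = np.abs(delta) >= theta          # temporal sparsity mask
    y_accum += W[:, active] @ delta[active]  # only active columns contribute
    x_prev_ref[active] = x[active]           # remember last propagated values
    return y_accum, active.mean()            # result and fraction of work done

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
x_prev_ref = np.zeros(16)
y = np.zeros(8)
for step in range(5):
    x = np.sin(0.1 * step + np.arange(16))   # slowly varying input
    y, activity = delta_matvec(W, x, x_prev_ref, y, theta=0.5)
    print(f"step {step}: {activity:.0%} of columns computed")
```

    Because neighbouring time steps of speech or sensor data change slowly, most components fall below the threshold after the first step, and the accelerator can skip the corresponding weight columns and their memory fetches entirely.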

    Unified Synchronized Data Acquisition Networks

    The continuously evolving field of communication technology, together with ever more precise sensors and detectors, enables new options and solutions to challenges in science and industry. In high-energy physics, for example, accurate measurements make it possible to observe particles moving at almost the speed of light within small-sized dimensions. The enormous amounts of gathered data require modern high performance communication networks, and the efficient implementation of future readout chains will depend on new concepts and mechanisms. The main goals of this dissertation are to create new efficient synchronization mechanisms and to evolve readout systems for the optimization of future sensor and detector systems. This work takes place in the context of the Compressed Baryonic Matter experiment, part of the Facility for Antiproton and Ion Research, an international accelerator facility that extends the accelerator complex of the GSI Helmholtzzentrum für Schwerionenforschung GmbH in Darmstadt. Initially, the challenges are specified and an analysis of the state of the art is presented; the resulting constraints and requirements influenced the design and development described within this dissertation. Subsequently, the different design and implementation tasks are discussed, starting with the basic detector readout system requirements and the definition of an efficient communication protocol. This protocol delivers all features needed for building compact and efficient readout systems. It is advantageous to use a single unified connection for all communication traffic: not only data, control, and synchronization messages, but also clock distribution. Furthermore, all links in this system have a deterministic latency, and this deterministic behavior enables the establishment of a synchronous network. Emerging problems were solved, and the concept was successfully implemented and tested during several test beam campaigns. In addition, the implementation and integration of this communication methodology into different network devices is described. To this end, a generic modular approach was created, which supports ASIC development with proven hardware IPs, reducing design time and risk of failure, and which delivers flexibility concerning data rate and structure for the network system. Additionally, the design and prototyping of a data aggregation and concentrator ASIC is described. In conjunction with dense electrical-to-optical conversion, this ASIC enables communication with flexible readout structures for the experiment and delivers the planned capacities and bandwidth. In the last part of the work, the analysis and transfer of the created synchronization mechanism into the area of high performance computing is discussed. Finally, a conclusion covering all achieved results and an outlook on possible future activities and research tasks within the Compressed Baryonic Matter experiment are presented
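    The dissertation's own protocol is not reproduced in the abstract, but a generic two-way time-transfer calculation (in the style of PTP-like schemes) illustrates why deterministic link latency matters for building a synchronous network: the offset estimate is exact only if the forward and return latencies are equal and repeatable, which is precisely what a dedicated deterministic-latency link provides. Names and numbers below are invented.

```python
def link_delay_and_offset(t1, t2, t3, t4):
    """Estimate one-way delay and clock offset from a two-way exchange,
    assuming the link latency is deterministic and symmetric.

    t1: master send, t2: slave receive, t3: slave reply, t4: master receive.
    """
    delay = ((t4 - t1) - (t3 - t2)) / 2   # one-way link latency
    offset = ((t2 - t1) + (t3 - t4)) / 2  # slave clock minus master clock
    return delay, offset

# Example: slave clock runs 250 time units ahead; true one-way delay is 100.
t1 = 1000
t2 = t1 + 100 + 250   # arrives after 100, read on a clock offset by +250
t3 = t2 + 40          # slave processing time
t4 = t3 + 100 - 250   # return trip, read back on the master clock
print(link_delay_and_offset(t1, t2, t3, t4))  # -> (100.0, 250.0)
```

    With non-deterministic (jittering or asymmetric) links, the same arithmetic yields a biased offset, which is why commodity networks need statistical filtering where a deterministic readout link does not.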

    Cost modelling and concurrent engineering for testable design

    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. As integrated circuits and printed circuit boards increase in complexity, testing becomes a major cost factor in the design and production of complex devices. Testability has to be considered during the design of complex electronic systems, and automatic test systems have to be used in order to facilitate the test; this fact is now widely accepted in industry. Both design for testability and the use of automatic test systems aim at reducing the cost of production testing or, sometimes, at making it possible at all. Many design for testability methods and test systems are available which can be configured into a production test strategy in order to achieve high quality of the final product. The designer has to select from the various options for creating a test strategy, maximising the quality and minimising the total cost of the electronic system. This thesis presents a methodology for test strategy generation which is based on consideration of the economics of the electronic system over its life cycle. This methodology is a concurrent engineering approach which takes into account all effects of a test strategy on the electronic system during its life cycle by evaluating its related cost. This objective methodology is used in an original test strategy planning advisory system, which allows test strategy planning for VLSI circuits as well as for digital electronic systems. The cost models used for evaluating the economics of test strategies are described in detail, and the test strategy planning system is presented. A methodology for making decisions based on estimated costing data is presented, together with results of using the cost models and the test strategy planning system to evaluate the economics of test strategies for selected industrial designs
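    The thesis's cost models are described in its full text; as a toy illustration of the life-cycle trade-off such models capture, the sketch below combines up-front test development cost, per-unit test execution cost, and the expected cost of defective units escaping to the field, using the well-known Williams-Brown defect-level relation. All numeric values are invented.

```python
def defect_level(yield_, coverage):
    """Williams-Brown model: shipped defect level as a function of
    process yield Y and fault coverage T, DL = 1 - Y**(1 - T)."""
    return 1 - yield_ ** (1 - coverage)

def life_cycle_test_cost(volume, dev_cost, test_cost_per_unit,
                         field_repair_cost, yield_, coverage):
    """Toy life-cycle cost: development + production test
    + expected cost of field escapes."""
    escapes = defect_level(yield_, coverage) * volume
    return dev_cost + test_cost_per_unit * volume + field_repair_cost * escapes

# Compare two hypothetical strategies on a 100k-unit product.
cheap = life_cycle_test_cost(100_000, dev_cost=50_000, test_cost_per_unit=0.5,
                             field_repair_cost=200, yield_=0.9, coverage=0.80)
thorough = life_cycle_test_cost(100_000, dev_cost=250_000, test_cost_per_unit=2.0,
                                field_repair_cost=200, yield_=0.9, coverage=0.99)
print(f"cheap: {cheap:,.0f}  thorough: {thorough:,.0f}")
```

    With these invented numbers the more expensive, higher-coverage strategy is cheaper over the product's life because field escapes dominate, which is exactly the kind of conclusion a life-cycle cost model makes visible to the designer.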

    Microchannel cooling for the LHCb VELO Upgrade I

    The LHCb VELO Upgrade I, currently being installed for the 2022 start of LHC Run 3, uses silicon microchannel coolers with internally circulating bi-phase CO₂ for thermal control of hybrid pixel modules operating in vacuum. This is the largest scale application of this technology to date. Production of the microchannel coolers was completed in July 2019, and their assembly into cooling structures was completed in September 2021. This paper describes the R&D path supporting the microchannel production and assembly and the motivation for the design choices. The microchannel coolers have excellent thermal performance, low and uniform mass, no thermal expansion mismatch with the ASICs, and are radiation hard. The fluidic and thermal performance is presented