2 research outputs found

    Investigation of radiation-hardened design of electronic systems with applications to post-accident monitoring for nuclear power plants

    Get PDF
    This research aims at improving the robustness of electronic systems used-in high level radiation environments by combining with radiation-hardened (rad-hardened) design and fault-tolerant techniques based on commercial off-the-shelf (COTS) components. A specific of the research is to use such systems for wireless post-accident monitoring in nuclear power plants (NPPs). More specifically, the following methods and systems are developed and investigated to accomplish expected research objectives: analysis of radiation responses, design of a radiation-tolerant system, implementation of a wireless post-accident monitoring system for NPPs, performance evaluation without repeat physical tests, and experimental validation in a radiation environment. A method is developed to analyze ionizing radiation responses of COTS-based devices and circuits in various radiation conditions, which can be applied to design circuits robust to ionizing radiation effects without repeated destructive tests in a physical radiation environment. Some mathematical models of semiconductor devices for post-irradiation conditions are investigated, and their radiation responses are analyzed using Technology Computer Aided Design (TCAD) simulator. Those models are then used in the analysis of circuits and systems under radiation condition. Based on the simulation results, method of rapid power off may be effectively to protect electronic systems under ionizing radiation. It can be a potential solution to mitigate damages of electronic components caused by radiation. With simulation studies of photocurrent responses of semiconductor devices, two methods are presented to mitigate the damages of total ionizing dose: component selection and radiation shielding protection. According to the investigation of radiation-tolerance of regular COTS components, most COTS-based semiconductor components may experience performance degradation and radiation damages when the total dose is greater than 20 K Rad (Si). A principle of component selection is given to obtain the suitable components, as well as a method is proposed to assess the component reliability under radiation environments, which uses radiation degradation factors, instead of the usual failure rate data in the reliability model. Radiation degradation factor is as the input to describe the radiation response of a component under a total radiation dose. In addition, a number of typical semiconductor components are also selected as the candidate components for the application of wireless monitoring in nuclear power plants. On the other hand, a multi-layer shielding protection is used to reduce the total dose to be less than 20 K Rad (Si) for a given radiation condition; the selected semiconductor devices can then survive in the radiation condition with the reduced total dose. The calculation method of required shielding thickness is also proposed to achieve the design objectives. Several shielding solutions are also developed and compared for applications in wireless monitoring system in nuclear power plants. A radiation-tolerant architecture is proposed to allow COTS-based electronic systems to be used in high-level radiation environments without using rad-hardened components. Regular COTS components are used with some fault-tolerant techniques to mitigate damages of the system through redundancy, online fault detection, real-time preventive remedial actions, and rapid power off. The functions of measurement, processing, communication, and fault-tolerance are integrated locally within all channels without additional detection units. A hardware emulation bench with redundant channels is constructed to verify the effectiveness of the developed radiation-tolerant architecture. Experimental results have shown that the developed architecture works effectively and redundant channels can switch smoothly in 500 milliseconds or less when a single fault or multiple faults occur. An online mechanism is also investigated to timely detect and diagnose radiation damages in the developed redundant architecture for its radiation tolerance enhancement. This is implemented by the built-in-test technique. A number of tests by using fault injection techniques have been carried out in the developed hardware emulation bench to validate the proposed detection mechanism. The test results have shown that faults and errors can be effectively detected and diagnosed. For the developed redundant wireless devices under given radiation dose (20 K Rad (Si)), the fault detection coverage is about 62.11%. This level of protection could be improved further by putting more resources (CPU consumption, etc.) into the function of fault detection, but the cost will increase. To apply the above investigated techniques and systems, under a severe accident condition in a nuclear power plant, a prototype of wireless post-accident monitoring system (WPAMS) is designed and constructed. Specifically, the radiation-tolerant wireless device is implemented with redundant and diversified channels. The developed system operates effectively to measure up-to-date information from a specific area/process and to transmit that information to remote monitoring station wirelessly. Hence, the correctness of the proposed architecture and approaches in this research has been successfully validated. In the design phase, an assessment method without performing repeated destructive physical tests is investigated to evaluate the radiation-tolerance of electronic systems by combining the evaluation of radiation protection and the analysis of the system reliability under the given radiation conditions. The results of the assessment studies have shown that, under given radiation conditions, the reliability of the developed radiation-tolerant wireless system can be much higher than those of non-redundant channels; and it can work in high-level radiation environments with total dose up to 1 M Rad (Si). Finally, a number of total dose tests are performed to investigate radiation effects induced by gamma radiation on distinct modern wireless monitoring devices. An experimental setup is developed to monitor the performance of signal measurement online and transmission of the developed distinct wireless electronic devices directly under gamma radiator at The Ohio State University Nuclear Reactor Lab (OSU-NRL). The gamma irradiator generates dose rates of 20 K Rad/h and 200 Rad/h on the samples, respectively. It was found that both measurement and transmission functions of distinct wireless measurement and transmission devices work well under gamma radiation conditions before the devices permanently damage. The experimental results have also shown that the developed radiation-tolerant design can be applied to effectively extend the lifespan of COTS-based electronic systems in the high-level radiation environment, as well as to improve the performance of wireless communication systems. According to testing results, the developed radiation-tolerant wireless device with a shielding protection can work at least 21 hours under the highest dose rate (20 K Rad/h). In summary, this research has addressed important issues on the design of radiation-tolerant systems without using rad-hardened electronic components. The proposed methods and systems provide an effective and economical solution to implement monitoring systems for obtaining up-to-date information in high-level radiation environments. The reported contributions are of significance both academically and in practice

    Dynamic reconfiguration frameworks for high-performance reliable real-time reconfigurable computing

    Get PDF
    The sheer hardware-based computational performance and programming flexibility offered by reconfigurable hardware like Field-Programmable Gate Arrays (FPGAs) make them attractive for computing in applications that require high performance, availability, reliability, real-time processing, and high efficiency. Fueled by fabrication process scaling, modern reconfigurable devices come with ever greater quantities of on-chip resources, allowing a more complex variety of applications to be developed. Thus, the trend is that technology giants like Microsoft, Amazon, and Baidu now embrace reconfigurable computing devices likes FPGAs to meet their critical computing needs. In addition, the capability to autonomously reprogramme these devices in the field is being exploited for reliability in application domains like aerospace, defence, military, and nuclear power stations. In such applications, real-time computing is important and is often a necessity for reliability. As such, applications and algorithms resident on these devices must be implemented with sufficient considerations for real-time processing and reliability. Often, to manage a reconfigurable hardware device as a computing platform for a multiplicity of homogenous and heterogeneous tasks, reconfigurable operating systems (ROSes) have been proposed to give a software look to hardware-based computation. The key requirements of a ROS include partitioning, task scheduling and allocation, task configuration or loading, and inter-task communication and synchronization. Existing ROSes have met these requirements to varied extents. However, they are limited in reliability, especially regarding the flexibility of placing the hardware circuits of tasks on device’s chip area, the problem arising more from the partitioning approaches used. Indeed, this problem is deeply rooted in the static nature of the on-chip inter-communication among tasks, hampering the flexibility of runtime task relocation for reliability. This thesis proposes the enabling frameworks for reliable, available, real-time, efficient, secure, and high-performance reconfigurable computing by providing techniques and mechanisms for reliable runtime reconfiguration, and dynamic inter-circuit communication and synchronization for circuits on reconfigurable hardware. This work provides task configuration infrastructures for reliable reconfigurable computing. Key features, especially reliability-enabling functionalities, which have been given little or no attention in state-of-the-art are implemented. These features include internal register read and write for device diagnosis; configuration operation abort mechanism, and tightly integrated selective-area scanning, which aims to optimize access to the device’s reconfiguration port for both task loading and error mitigation. In addition, this thesis proposes a novel reliability-aware inter-task communication framework that exploits the availability of dedicated clocking infrastructures in a typical FPGA to provide inter-task communication and synchronization. The clock buffers and networks of an FPGA use dedicated routing resources, which are distinct from the general routing resources. As such, deploying these dedicated resources for communication sidesteps the restriction of static routes and allows a better relocation of circuits for reliability purposes. For evaluation, a case study that uses a NASA/JPL spectrometer data processing application is employed to demonstrate the improved reliability brought about by the implemented configuration controller and the reliability-aware dynamic communication infrastructure. It is observed that up to 74% time saving can be achieved for selective-area error mitigation when compared to state-of-the-art vendor implementations. Moreover, an improvement in overall system reliability is observed when the proposed dynamic communication scheme is deployed in the data processing application. Finally, one area of reconfigurable computing that has received insufficient attention is security. Meanwhile, considering the nature of applications which now turn to reconfigurable computing for accelerating compute-intensive processes, a high premium is now placed on security, not only of the device but also of the applications, from loading to runtime execution. To address security concerns, a novel secure and efficient task configuration technique for task relocation is also investigated, providing configuration time savings of up to 32% or 83%, depending on the device; and resource usage savings in excess of 90% compared to state-of-the-art
    corecore