1,039 research outputs found

    Experimental analysis of computer system dependability

    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software dependability, and fault diagnosis. The discussion involves several important issues studied in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance.
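The abstract above names importance sampling as a way to accelerate Monte Carlo simulation. A minimal sketch of the idea follows, assuming a toy rare event (a standard normal variable exceeding a threshold) rather than any model from the paper; the function names and the shifted-normal proposal are illustrative choices, not the paper's.

```python
import math
import random


def rare_tail_probability(threshold, samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # Sample from the proposal N(threshold, 1), under which the
        # rare event happens about half of the time.
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Likelihood ratio of the N(0, 1) density to the
            # N(threshold, 1) density, evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / samples


def exact_tail_probability(threshold):
    """Closed-form reference value, used only to check the estimate."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    print(rare_tail_probability(4.0, 20000), exact_tail_probability(4.0))
```

Naive sampling from N(0, 1) would need millions of draws to observe even a handful of such events, whereas the reweighted estimate settles with a few thousand draws.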

    Model-based Code Generation Framework for Parallel and Distributed Embedded Systems

    Doctoral thesis (Ph.D.), Seoul National University Graduate School, College of Engineering, Department of Computer Science and Engineering, February 2020. Advisor: Soonhoi Ha.
While various software development methodologies have been proposed to increase the design productivity and maintainability of software, they usually focus on application software running on a single processing element and do not address the non-functional requirements of an embedded system, such as latency and resource requirements. In this thesis, we present a model-based software development method for parallel and distributed embedded systems. An application is specified, independently of the hardware platform, as a hierarchical set of tasks that follow given rules for communication and synchronization. These rules enable static analysis that detects some software errors at compile time, which reduces the verification difficulty. A platform-specific program is synthesized automatically once the mapping of tasks onto processing elements is determined. We also propose the program synthesizer, which generates code that satisfies the platform requirements of parallel and distributed embedded systems. Because multiple models that express dynamic behaviors can be composed hierarchically, the synthesizer manages multiple task graphs at different levels of the hierarchy to run tasks in parallel. The synthesizer also provides methods for managing code for heterogeneous platforms and for generating various communication methods. The viability of the proposed method is verified with a real-life surveillance application that runs on six processing elements with three remote communication methods, and with a remote deep learning example that uses heterogeneous multiprocessing components on distributed systems. Measured and estimated development costs also show that supporting a new platform or network requires little effort.
Since tolerance to unexpected errors is a required feature of many embedded systems, we also support automatic fault-tolerant code generation. Fault tolerance is applied by modifying the task graph according to the selected fault tolerance configuration, so an application developer can easily adopt this non-functional requirement. To compare the effort of supporting fault tolerance, we also implemented it manually. The fault tolerance method is tested with a fault injection tool that emulates fault scenarios and injects faults randomly. This fault injection tool, which was used to test our fault tolerance method, is another contribution of this thesis. Emulating fault scenarios by intentionally injecting faults is commonly used to test and verify the robustness of a system. To emulate faults on an embedded system, we present a run-time fault injection framework that can inject faults into both the kernel and application layers of Linux-based systems. For the kernel layer, two complementary fault injection techniques are used: one is based on the Kernel GNU Debugger, and the other uses a hardware breakpoint supported by the ARM architecture. For application-level fault injection, a GDB-based method is used to inject faults into an application on the same system or on a remote system.
The viability of the proposed fault injection tool is demonstrated by real-life experiments on an ODROID-XU4 system.
    Contents:
    Chapter 1 Introduction: 1.1 Motivation; 1.2 Contribution; 1.3 Dissertation Organization
    Chapter 2 Background: 2.1 HOPES: Hope of Parallel Embedded Software (2.1.1 Software Development Procedure; 2.1.2 Components of HOPES); 2.2 Universal Execution Model (2.2.1 Task Graph Specification; 2.2.2 Dataflow specification of an Application; 2.2.3 Task Code Specification and Generic APIs; 2.2.4 Meta-data Specification)
    Chapter 3 Program Synthesis for Parallel and Distributed Embedded Systems: 3.1 Motivational Example; 3.2 Program Synthesis Overview; 3.3 Program Synthesis from Hierarchically-mixed Models; 3.4 Platform Code Synthesis; 3.5 Communication Code Synthesis; 3.6 Experiments (3.6.1 Development Cost of Supporting New Platforms and Networks; 3.6.2 Program Synthesis for the Surveillance System Example; 3.6.3 Remote GPU-accelerated Deep Learning Example); 3.7 Document Generation; 3.8 Related Works
    Chapter 4 Model Transformation for Fault-tolerant Code Synthesis: 4.1 Fault-tolerant Code Synthesis Techniques; 4.2 Applying Fault Tolerance Techniques in HOPES; 4.3 Experiments (4.3.1 Development Cost of Applying Fault Tolerance; 4.3.2 Fault Tolerance Experiments); 4.4 Random Fault Injection Experiments; 4.5 Related Works
    Chapter 5 Fault Injection Framework for Linux-based Embedded Systems: 5.1 Background (5.1.1 Fault Injection Techniques; 5.1.2 Kernel GNU Debugger; 5.1.3 ARM Hardware Breakpoint); 5.2 Fault Injection Framework (5.2.1 Overview; 5.2.2 Architecture; 5.2.3 Fault Injection Techniques; 5.2.4 Implementation); 5.3 Experiments (5.3.1 Experiment Setup; 5.3.2 Performance Comparison of Two Fault Injection Methods; 5.3.3 Bit-flip Fault Experiments; 5.3.4 eMMC Controller Fault Experiments)
    Chapter 6 Conclusion
    Bibliography
    Abstract (in Korean)
    Docto
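The thesis record above mentions bit-flip fault experiments and fault tolerance obtained by transforming the task graph. The sketch below illustrates the general idea only: it assumes triple replication with majority voting as the fault tolerance technique (the abstract does not say which techniques HOPES applies), and `flip_bit`, `checksum_task`, and `run_replicated` are hypothetical names, not part of HOPES or of the thesis's fault injection tool.

```python
from collections import Counter


def flip_bit(value, bit, width=32):
    """Return value with one bit inverted, emulating a single-bit upset."""
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def checksum_task(data):
    """Hypothetical task: 32-bit additive checksum of its input."""
    return sum(data) & 0xFFFFFFFF


def run_replicated(task, data, faulty_replica=None, bit=0):
    """Run three replicas of a task and majority-vote their outputs.

    If faulty_replica is 0, 1, or 2, a bit flip is injected into that
    replica's output before voting.
    """
    outputs = []
    for replica in range(3):
        result = task(data)
        if replica == faulty_replica:
            result = flip_bit(result, bit)
        outputs.append(result)
    value, votes = Counter(outputs).most_common(1)[0]
    if votes < 2:
        raise RuntimeError("no majority: fault not masked")
    return value


if __name__ == "__main__":
    # A single corrupted replica is outvoted by the two healthy ones.
    print(run_replicated(checksum_task, [1, 2, 3], faulty_replica=1, bit=5))
```

Injecting the fault at a chosen replica and bit position keeps each experiment reproducible, which mirrors the abstract's emphasis on replaying fault scenarios.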

    Establishment of a novel predictive reliability assessment strategy for ship machinery

    In recent years, the maritime industry has been moving towards novel and sophisticated inspection and maintenance practices. Maintenance is now regarded as an operational method that, through an enhanced Operation and Maintenance (O&M) strategy, can serve both as a profit-generating process and as a means of cost reduction. First, a flexible framework applicable at the level of complex machinery systems can be introduced for scheduling the maintenance of ship systems, subsystems and components. This holistic inspection and maintenance notion should be implemented by integrating different strategies, methodologies, technologies and tools, selected to fulfil the requirements of the chosen ship systems. In this thesis, an innovative maintenance strategy for ship machinery is proposed, namely the Probabilistic Machinery Reliability Assessment (PMRA) strategy, which focuses on enhancing the reliability and safety of main systems, subsystems, and maintainable units and components. To this end, PMRA combines a data mining method (k-means), manufacturer safety aspects, dynamic state modelling (Markov Chains), probabilistic predictive reliability assessment (Bayesian Belief Networks) and qualitative decision making (Failure Modes and Effects Analysis), bringing together the benefits of qualitative and quantitative reliability assessment. PMRA is demonstrated in two case studies, one on an offshore oil and gas platform and one on selected ship machinery. The results are used to identify the least reliable systems, subsystems and components, and to advise suitable practical inspection and maintenance activities. The proposed PMRA strategy is also tested in a flexible sensitivity analysis scheme.
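One ingredient named in the abstract above is dynamic state modelling with Markov chains. The following sketch shows how a discrete-time chain propagates a component's state probabilities over inspection intervals; the three states and every transition probability are hypothetical values chosen for illustration and are not taken from the thesis.

```python
def step(distribution, transition):
    """Advance a state distribution by one interval: d' = d * P."""
    n = len(distribution)
    return [
        sum(distribution[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]


def propagate(distribution, transition, steps):
    """Apply the transition matrix repeatedly and return the result."""
    for _ in range(steps):
        distribution = step(distribution, transition)
    return distribution


# Hypothetical states: 0 = healthy, 1 = degraded, 2 = failed (absorbing).
# Each row holds the probabilities of moving from that state in one interval.
TRANSITION = [
    [0.95, 0.04, 0.01],
    [0.00, 0.90, 0.10],
    [0.00, 0.00, 1.00],
]

if __name__ == "__main__":
    after_year = propagate([1.0, 0.0, 0.0], TRANSITION, 12)
    print("P(failed) after 12 intervals:", after_year[2])
```

In a strategy such as the one described, state probabilities of this kind could feed a downstream model; how PMRA actually couples its Markov chains to the Bayesian Belief Network is not stated in the abstract.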

    Integration of well data into dynamic reservoir interpretation using multiple seismic surveys

    This thesis develops and tests a new technique which integrates information from well production and 4D seismic data directly in the data domain. This method is of value when seismic data are acquired by multiple surveys over the same area of a hydrocarbon reservoir. Sequences of 4D seismic changes can then be extracted over different time intervals from multiply repeated seismic surveys and these are cross correlated with identical time sequences of cumulative fluid volumes produced or injected from wells. The technique is applied to frequently repeated seismic surveys from three North Sea fields, including two compartmentalised reservoirs: the Schiehallion and Norne fields, and a compacting reservoir: the Valhall field. Maps of well to seismic cross-correlations are proven to produce a strong, localised and stable signal in the connected neighbourhood of individual wells. The correlation signatures from the Schiehallion and Norne applications investigated in this thesis are the consequence of pressure performance due to reservoir compartmentalisation. In the Schiehallion study, the mapped results help identify the production signal related only to individual wells, thus leading to a better delineation of reservoir compartments. In the Norne study in particular, an extra reservoir volume connected to the original segment is highlighted by the technique. The reservoir simulation model is subsequently updated and a better match between the observed and simulated data can be achieved. The application to the compacting Valhall field involves using data from the Life of Field Seismic project, for which the 4D signature is dominated by compaction-assisted pressure depletion. For these data, both AI and time-shift attributes are found to have a remarkably consistent correlation with the well activity for selected groups of wells.
Further, maps of these results possess sufficient fine scale detail to resolve and disentangle interfering seismic responses generated by closely spaced wells and localised zones of gas breakout along long horizontal producers. These case studies illustrate our proposed methodology of uniting well data and 4D seismic data and confirm that it does indeed provide an insightful product for dynamic interpretation of the producing reservoir.
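The technique described above correlates, at each map location, a time sequence of 4D seismic changes with the matching sequence of cumulative well volumes. A minimal sketch follows, assuming the Pearson coefficient as the correlation measure (the abstract does not specify the normalisation used in the thesis); the cell labels and data values in the usage example are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        # A flat sequence carries no signal; report no correlation.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_map(seismic_changes, well_volumes):
    """Correlate every map cell's seismic-change sequence with one well.

    seismic_changes maps a cell label to its sequence of 4D attribute
    changes, one value per survey interval; well_volumes holds the
    cumulative volume change of the well over the same intervals.
    """
    return {
        cell: pearson(sequence, well_volumes)
        for cell, sequence in seismic_changes.items()
    }


if __name__ == "__main__":
    cells = {"near_well": [2.0, 4.0, 6.0, 8.0], "far_field": [1.0, 1.0, 1.0, 1.0]}
    print(correlation_map(cells, [1.0, 2.0, 3.0, 4.0]))
```

Cells that respond to the well show coefficients near plus or minus one, while unconnected cells stay near zero, which is what makes the mapped result useful for delineating compartments.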

    1992 NASA/ASEE Summer Faculty Fellowship Program

    For the 28th consecutive year, a NASA/ASEE Summer Faculty Fellowship Program was conducted at the Marshall Space Flight Center (MSFC). The program was conducted by the University of Alabama and MSFC during the period June 1, 1992 through August 7, 1992. Operated under the auspices of the American Society for Engineering Education, the MSFC program, as well as those at other centers, was sponsored by the Office of Educational Affairs, NASA Headquarters, Washington, DC. The basic objectives of the programs, which are in their 29th year of operation nationally, are (1) to further the professional knowledge of qualified engineering and science faculty members; (2) to stimulate and exchange ideas between participants and NASA; (3) to enrich and refresh the research and teaching activities of the participants' institutions; and (4) to contribute to the research objectives of the NASA centers.

    Loki: A State-Driven Fault Injector for Distributed Systems

    Coordinated Science Laboratory was formerly known as Control Systems Laboratory. Defense Advanced Research Projects Agency Information Technology Office (DARPA/ITO) / F30602-96-C-0315, F30602-97-C-0276, and F30602-98-C-0187. U of I Only. Restricted to UIUC community.

    Cross layer reliability estimation for digital systems

    Forthcoming manufacturing technologies hold the promise of increasing the performance and functionality of multifunctional computing systems thanks to a remarkable growth in device integration density. Despite the benefits introduced by these technology improvements, reliability is becoming a key challenge for the semiconductor industry. With transistor sizes reaching atomic dimensions, vulnerability to unavoidable fluctuations in the manufacturing process and to environmental stress rises dramatically. Failing to meet a reliability requirement may add excessive re-design cost and may have severe consequences for the success of a product. Worst-case design with large margins to guarantee reliable operation has been employed for a long time. However, it is reaching a limit that makes it economically unsustainable due to its performance, area, and power cost. One of the open challenges for future technologies is building "dependable" systems on top of unreliable components, which will degrade and even fail during the normal lifetime of the chip. Conventional design techniques are highly inefficient. They expend a significant amount of energy to tolerate device unpredictability by adding safety margins to a circuit's operating voltage, clock frequency or charge stored per bit. Unfortunately, the additional costs introduced to compensate for unreliability are rapidly becoming unacceptable in today's environment, where power consumption is often the limiting factor for integrated circuit performance and energy efficiency is a top concern. Attention should be paid to tailoring techniques that improve the reliability of a system on the basis of its requirements, leading to cost-effective solutions that favor the success of the product on the market. Cross-layer reliability is one of the most promising approaches to achieve this goal.
Cross-layer reliability techniques take into account the interactions between the layers composing a complex system (i.e., technology, hardware and software layers) to implement efficient cross-layer fault mitigation mechanisms. Fault tolerance mechanisms are implemented at different layers, from the technology up to the software layer, to optimize the system by exploiting the inherent capability of each layer to mask lower-level faults. For this purpose, cross-layer reliability design techniques need to be complemented with cross-layer reliability evaluation tools able to precisely assess the reliability level of a selected design early in the design cycle. Accurate and early reliability estimates would enable the exploration of the system design space and the optimization of multiple constraints such as performance, power consumption, cost and reliability. This Ph.D. thesis is devoted to the development of new methodologies and tools to evaluate and optimize the reliability of complex digital systems during the early design stages. More specifically, techniques addressing hardware accelerators (i.e., FPGAs and GPUs), microprocessors and full systems are discussed. All developed methodologies are presented in conjunction with their application to real-world use cases belonging to different computational domains.
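The abstract above relies on each layer masking part of the faults that arise below it. A deliberately simplified sketch of that reasoning follows: it assumes that masking at each layer is independent and can be summarised by one probability per layer, which is an illustrative simplification and not the estimation method developed in the thesis. The function name and all numbers are hypothetical.

```python
def system_failure_rate(raw_fault_rate, masking_by_layer):
    """Derate a raw fault rate by the masking probability of each layer.

    raw_fault_rate is the rate of faults at the lowest layer (any unit,
    for example failures per billion hours). masking_by_layer maps a layer
    name to the assumed probability that the layer masks a fault reaching it.
    """
    rate = raw_fault_rate
    for layer, masking in masking_by_layer.items():
        if not 0.0 <= masking <= 1.0:
            raise ValueError(f"masking probability for {layer} out of range")
        # Only the unmasked fraction propagates to the next layer up.
        rate *= 1.0 - masking
    return rate


if __name__ == "__main__":
    layers = {"technology": 0.5, "hardware": 0.8, "software": 0.5}
    print(system_failure_rate(1000.0, layers))
```

Even this crude model shows why early estimates matter: raising the masking probability of one layer changes how much protection the other layers still need to provide.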

    Propulsion Control Technology Development Needs to Address NASA Aeronautics Research Mission Goals for Thrusts 3a and 4

    The Commercial Aero-Propulsion Control Working Group (CAPCWG), consisting of propulsion control technology leads from The Boeing Company, GE Aviation, Honeywell, Pratt & Whitney, Rolls-Royce, and NASA (National Aeronautics and Space Administration) Glenn Research Center, has been working together over the past year to identify propulsion control technology areas of common interest that we believe are critical to achieving the challenging NASA Aeronautics Research goals for Thrust 3a: Ultra-Efficient Commercial Vehicles - Subsonic Transports, and Thrust 4: Transition to Alternative Propulsion and Energy. This paper describes the various propulsion control technology development areas identified by CAPCWG as most critical for NASA to invest in. For Thrust 3a these are: i) Integrated On-Board Model Based Engine Control and Health Management; ii) Flexible and Modular Networked Control Hardware and Software Architecture; iii) Intelligent Air/Fuel Control for Low Emissions Combustion; and iv) Active Clearance Control. For Thrust 4, the focus is on Hybrid Electric Propulsion (HEP) for single aisle commercial aircraft. The specific technology development areas include: i) Integrated Power and Propulsion System Dynamic Modeling for Control; ii) Control Architectures for HEP; iii) HEP Control Verification and Validation; and iv) Engine/Airplane Control Integration. For each of the technology areas, the discussion includes the problem to be solved, how it relates to NASA goals, and the challenges to be addressed in reducing risk.