8 research outputs found

    Dynamic Behavior Specification and Design Space Exploration Techniques for Real-Time Embedded Systems

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, August 2016. Soonhoi Ha. As the number of processors on a chip increases and more functions are integrated, the system state changes dynamically due to factors such as workload variation, QoS requirements, and unexpected component failures. At the same time, the computational complexity of user applications is steadily increasing: video and graphics applications are two major driving forces in smart mobile devices, which define the main application domain of interest in this dissertation. A systematic design methodology is therefore needed to implement such complex systems, which combine dynamically changing behavior with computation-intensive workloads that can be parallelized. A model-based approach is one of the representative approaches to parallel embedded software development. In particular, the HOPES framework has been proposed as a design environment for parallel embedded software that supports all design steps: system specification, performance estimation, design space exploration, and automatic code generation. Unlike other design environments, it introduces a novel programming-platform concept called CIC (Common Intermediate Code), which can be understood as a generic execution model of a heterogeneous multiprocessor architecture. 
The CIC task model is based on a process network model, but it can be refined to the SDF (Synchronous Data Flow) model, which has very desirable properties for static analyzability as well as parallel processing. However, the SDF model has limited expressiveness, especially for system-level specification and for the dynamically changing behavior of an application. To overcome this weakness, this dissertation proposes an extended CIC task model, based on dataflow and FSM models, that specifies the dynamic behavior of the system while distinguishing inter- and intra-application dynamism. At the top level, each application is specified as a dataflow task, and the dynamic behavior is modeled as a control task that supervises the execution of applications. Inside a dataflow task, the dynamic behavior is specified in a way similar to FSM-based SADF: an SDF task may have multiple behaviors, and a tabular specification of an FSM, called the MTM (Mode Transition Machine), describes the mode transition rules for the SDF graph. We call this the MTM-SDF model, which the dissertation classifies as a multi-mode dataflow model. It assumes that an application has a finite number of behaviors (or modes) and that each behavior (mode) is represented by an SDF graph. This makes it possible to perform compile-time scheduling of each graph to maximize throughput for varying numbers of allocated processors, and to store the scheduling information. A multiprocessor scheduling technique is also proposed for a multi-mode dataflow graph. While several scheduling techniques exist for multi-mode dataflow models, none allows task migration between modes. Observing that the resource requirement can be further reduced if task migration is allowed, we propose a multiprocessor scheduling technique for a multi-mode dataflow graph that considers task migration between modes. 
Based on a genetic algorithm, the proposed technique schedules all SDF graphs in all modes simultaneously to minimize the resource requirement. To satisfy the throughput constraint, the proposed technique calculates the actual throughput requirement of each mode and the output buffer size needed to tolerate throughput jitter. From the specified task graph and scheduling results, the CIC translator generates parallelized code for the target architecture. The CIC translator is therefore extended to support the extended features of the CIC task model. At the application level, it is extended to support multiprocessor code generation for an MTM-SDF graph according to the given static scheduling results. Multiprocessor code generation is also supported for four different scheduling policies for an MTM-SDF graph: fully-static, self-timed, static-assignment, and fully-dynamic. At the system level, the CIC translator is extended to generate code that implements the system request APIs and the data structures for the static scheduling results and configurable task parameters. 
Through preliminary experiments with a multi-mode multimedia terminal example, the viability of the proposed methodology is verified.

Chapter 1 Introduction
  1.1 Motivation
  1.2 Contribution
  1.3 Dissertation organization
Chapter 2 Background
  2.1 Related work
    2.1.1 Compiler-based approach
    2.1.2 Language-based approach
    2.1.3 Model-based approach
  2.2 HOPES framework
  2.3 Common Intermediate Code (CIC) Model
Chapter 3 Dynamic Behavior Specification
  3.1 Problem definition
    3.1.1 System-level dynamic behavior
    3.1.2 Application-level dynamic behavior
  3.2 Related work
  3.3 Motivational example
  3.4 Control task specification for system-level dynamism
    3.4.1 Internal specification
    3.4.2 Action scripts
  3.5 MTM-SDF specification for application-level dynamism
    3.5.1 MTM specification
    3.5.2 Task graph specification
    3.5.3 Execution semantic of an MTM-SDF graph
Chapter 4 Multiprocessor Scheduling of an Multi-mode Dataflow Graph
  4.1 Related work
  4.2 Motivational example
    4.2.1 Throughput requirement calculation considering mode transition delay
    4.2.2 Task migration between mode transition
  4.3 Problem definition
  4.4 Throughput requirement analysis
    4.4.1 Mode transition delay
    4.4.2 Arrival curves of the output buffer
    4.4.3 Buffer size determination
    4.4.4 Throughput requirement analysis
  4.5 Proposed MMDF scheduling framework
    4.5.1 Optimization problem
    4.5.2 GA configuration
    4.5.3 Fitness function
    4.5.4 Local optimization technique
  4.6 Experimental results
    4.6.1 MMDF scheduling technique
    4.6.2 Scalability of the Proposed Framework
Chapter 5 Multiprocessor Code Generation for the Extended CIC Model
  5.1 CIC translator
  5.2 Code generation for application-level dynamism
    5.2.1 Function call-style code generation (fully-static, self-timed)
    5.2.2 Thread-style code generation (static-assignment, fully-dynamic)
  5.3 Code generation for system-level dynamism
  5.4 Experimental results
Chapter 6 Conclusion and Future Work
Bibliography
Abstract (in Korean)
Docto
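
As a rough illustration of the MTM-SDF idea summarized in the abstract above, the following Python sketch pairs a tabular mode transition machine with one SDF graph per mode and computes each graph's repetition vector from the SDF balance equations. The mode names, events, actors, and token rates are hypothetical and are not taken from the dissertation, and the code is not the HOPES or CIC implementation.

```python
from fractions import Fraction
from math import gcd, lcm  # multi-argument gcd/lcm need Python 3.9+


def repetition_vector(actors, edges):
    """Solve the SDF balance equations r[src] * produce == r[dst] * consume.

    edges is a list of (src, produce, dst, consume) tuples.
    Returns the smallest positive integer firing count for every actor.
    """
    rate = {actors[0]: Fraction(1)}  # fix one actor, derive the others
    changed = True
    while changed:
        changed = False
        for src, produce, dst, consume in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produce / consume
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consume / produce
                changed = True
            elif src in rate and dst in rate:
                if rate[src] * produce != rate[dst] * consume:
                    raise ValueError("inconsistent rates: no valid schedule")
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    # Scale the rational rates to the smallest integer solution.
    scale = lcm(*(r.denominator for r in rate.values()))
    counts = {a: int(rate[a] * scale) for a in actors}
    common = gcd(*counts.values())
    return {a: n // common for a, n in counts.items()}


# Hypothetical two-mode application: each mode is its own SDF graph.
ACTORS = ["Read", "Decode", "Display"]
MODE_GRAPHS = {
    "LOW": [("Read", 1, "Decode", 1), ("Decode", 1, "Display", 1)],
    "HIGH": [("Read", 2, "Decode", 3), ("Decode", 1, "Display", 2)],
}

# Tabular mode transition machine: (current mode, event) -> next mode.
MTM = {
    ("LOW", "quality_up"): "HIGH",
    ("HIGH", "quality_down"): "LOW",
}


def next_mode(mode, event):
    """Look up the transition table; stay in the current mode if no rule matches."""
    return MTM.get((mode, event), mode)


# Every mode is a plain SDF graph, so it can be analyzed ahead of time.
SCHEDULE_INFO = {m: repetition_vector(ACTORS, e) for m, e in MODE_GRAPHS.items()}
```

In this toy setup the HIGH mode needs three firings of Read, two of Decode, and one of Display per iteration, while the LOW mode fires each actor once. A real flow would store a full multiprocessor schedule per mode, not just the firing counts.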

    Hierarchical Transactions for Hardware/Software Cosynthesis

    Modern heterogeneous devices provide a variety of computationally diverse components with tremendous performance and power capability. Hardware-software cosynthesis offers system-level synthesis and optimization opportunities to realize the potential of these evolving architectures. Efficiently coordinating high-throughput data to make use of available computational resources requires a myriad of distributed local memories, caching structures, and data motion resources. In fact, storage, caching, and data transfer components comprise the majority of silicon real estate. Conventional automated approaches, unfortunately, do not represent applications in a way that captures data motion and state management, which dictate the dominant system costs. Consequently, existing cosynthesis methods suffer from poor utilization of computational resources. Automated cosynthesis tailored towards memory-centric optimizations can address the challenge, adapting partitioning, scheduling, mapping, and binding techniques to maximize overall system utility. This research presents a novel hierarchical transaction model that formalizes state and control management through an abstract data/control encapsulation semantic. It is designed from the ground up to enable efficient synthesis across heterogeneous system components, with an emphasis on memory capacity constraints. It intrinsically encourages a high degree of concurrency and latency tolerance, and provides verification tools to ensure correctness. A unique data/execution hierarchical encapsulation framework guarantees scalable analysis, supporting a novel concept of state and control mobility. A front-end language allows concise expression of designer intent, and is structured with synthesis in mind. Designers express families of valid executions in a minimal format through high-level dependencies, type systems, and computational relationships, allowing synthesis tools to manage lower-level details. 
This dissertation introduces and exercises the model, discussing language construction, demonstrating control and data-dominated applications, and presenting a synthesis path that exhibits near-linear scalability with problem size.

    Adaptive Knobs for Resource Efficient Computing

    Performance demands of emerging domains such as artificial intelligence, machine learning and vision, and the Internet of Things continue to grow. Meeting such requirements on modern multi/many-core systems with higher power densities, fixed power and energy budgets, and thermal constraints exacerbates the run-time management challenge. This leaves the open problem of extracting the required performance within the power and energy limits while also ensuring thermal safety. Existing architectural solutions, including asymmetric and heterogeneous cores and custom acceleration, improve performance-per-watt in specific design-time and static scenarios. However, satisfying applications' performance requirements under dynamic and unknown workload scenarios, subject to varying system dynamics of power, temperature, and energy, requires intelligent run-time management. Adaptive strategies are necessary for maximizing resource efficiency, considering i) diverse requirements and characteristics of concurrent applications, ii) dynamic workload variation, iii) core-level heterogeneity, and iv) power, thermal, and energy constraints. This dissertation proposes such adaptive techniques for efficient run-time resource management to maximize performance within fixed budgets under unknown and dynamic workload scenarios. The resource management strategies proposed in this dissertation comprehensively consider application and workload characteristics and the variable effect of power actuation on performance for proactive and appropriate allocation decisions. 
Specific contributions include i) a run-time mapping approach to improve power budgets for higher throughput, ii) thermal-aware performance boosting for efficient utilization of the power budget and higher performance, iii) approximation as a run-time knob that exploits accuracy-performance trade-offs to maximize performance under power caps at minimal loss of accuracy, and iv) coordinated approximation for heterogeneous systems through joint actuation of dynamic approximation and power knobs for performance guarantees with minimal power consumption. The approaches presented in this dissertation focus on adapting existing mapping techniques, performance boosting strategies, and software and dynamic approximations to meet the performance requirements while simultaneously considering system constraints. The proposed strategies are compared against relevant state-of-the-art run-time management frameworks to qualitatively evaluate their efficacy.
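
To make the idea of approximation as a run-time knob more concrete, here is a minimal Python sketch that picks an operating point meeting a throughput target under a power cap with the least accuracy loss. The `OperatingPoint` fields, the selection rule, and all numbers are assumptions made for illustration; the dissertation's actual controllers are not described in the abstract and may work quite differently.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class OperatingPoint:
    """One hypothetical (approximation level, frequency) setting with profiled costs."""
    approx_level: int      # 0 means exact computation
    freq_mhz: int
    power_w: float
    throughput: float      # e.g. frames per second
    accuracy_loss: float   # fraction of accuracy given up, 0.0 for exact


def select_point(points: Sequence[OperatingPoint],
                 power_cap_w: float,
                 min_throughput: float) -> Optional[OperatingPoint]:
    """Return the feasible point with the least accuracy loss, then the least power.

    A point is feasible if it stays under the power cap and meets the
    throughput target. Returns None when no point is feasible.
    """
    feasible = [p for p in points
                if p.power_w <= power_cap_w and p.throughput >= min_throughput]
    if not feasible:
        return None
    return min(feasible, key=lambda p: (p.accuracy_loss, p.power_w))


# Illustrative profile table (made-up numbers).
POINTS = [
    OperatingPoint(0, 2000, 5.0, 30.0, 0.00),
    OperatingPoint(0, 1400, 3.0, 21.0, 0.00),
    OperatingPoint(1, 1400, 3.0, 27.0, 0.02),
    OperatingPoint(2, 1400, 3.0, 33.0, 0.05),
    OperatingPoint(2, 1000, 2.0, 24.0, 0.05),
]
```

With a 3.5 W cap and a target of 25 units of throughput, the exact settings are either too slow or too power-hungry, so the sketch falls back to approximation level 1. Raising the cap to 6 W makes the exact 2000 MHz setting feasible again.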

    Hardware/Software Codesign of Embedded Systems with Reconfigurable and Heterogeneous Platforms


    Cost-Efficient Soft-Error Resiliency for ASIP-based Embedded Systems

    Recent decades have witnessed the rapid growth of embedded systems. At present, embedded systems are widely applied in a broad range of critical applications, including automotive electronics, telecommunication, healthcare, industrial electronics, consumer electronics, military, and aerospace. Human society will continue to be greatly transformed by the pervasive deployment of embedded systems. Consequently, a substantial amount of effort from both industry and academia has gone into the research and development of embedded systems. The application-specific instruction-set processor (ASIP) is one of the key advances in embedded processor technology, and a crucial component in some embedded systems. Soft errors have been directly observed since the 1970s. As devices scale, the integration density of computing systems increases exponentially, which leads to a corresponding decrease in their reliability. Today, major research forums state that soft errors are one of the major design technology challenges at and beyond the 22 nm technology node. Therefore, a large number of soft-error solutions, including error detection and recovery, have been proposed from differing perspectives. Nonetheless, most of the existing solutions are designed for general-purpose or high-performance systems, which differ from embedded systems. For embedded systems, soft-error solutions must be cost-efficient, which requires tailoring the processor architecture to the features of the target application. This thesis embodies a series of explorations of cost-efficient soft-error solutions for ASIP-based embedded systems. In this exploration, five major solutions are proposed. The first proposed solution realizes checkpoint recovery in ASIPs. By generating customized instructions, ASIP-implemented checkpoint recovery can operate at a finer granularity than was previously possible. 
The fault-free performance overhead of this solution is only 1.45% on average. The recovery delay is only 62 cycles in the worst case. The area and leakage power overheads are 44.4% and 45.6% on average. The second solution explores using two primitive error recovery techniques jointly. This solution includes three application-specific optimization methodologies. It generates optimized error-resilient ASIPs based on the characteristics of the primitive error recovery techniques, static reliability analysis, and design constraints. The resultant ASIP can be configured to operate at runtime according to the optimized recovery scheme. This solution can strategically enhance cost-efficiency for error recovery. To guarantee cost-efficiency in unpredictable runtime situations, the third solution explores runtime adaptation for error recovery. This solution aims to budget and adapt the error recovery operations so as to spend resources intelligently and to tolerate the adverse influences of runtime variations. The resultant ASIP can decide at runtime whether to activate spatial and temporal redundancies, according to the runtime situation. In the best case, this solution can achieve almost 50x reliability gain over state-of-the-art solutions. Given the increasing demand for multi-core computing systems, the last two proposed solutions target error recovery in multi-core ASIPs. The first of these two explores ASIP-implemented fine-grained process migration. This solution is a key infrastructure that allows cost-efficient task management for realizing cost-efficient soft-error recovery in multi-core ASIPs. Process migration takes only 289 machine cycles on average. The last solution explores using dynamic and adaptive mapping to assign heterogeneous recovery operations to the tasks in the multi-core context. 
This solution allows each individual ASIP-based processing core to dynamically adapt its specific error recovery functionality according to the corresponding task's characteristics, in terms of soft-error vulnerability and execution time deadline. This solution can improve the reliability of the system by almost two times, with a graceful constraint penalty, in comparison to state-of-the-art counterparts.
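
The checkpoint recovery mentioned in this abstract can be pictured with a small software analogy. The Python sketch below saves processor state at segment boundaries and rolls back to re-execute a segment when a transient fault is flagged. The `CheckpointedCore` class, the injected bit flip, and the assumption that a separate detector reports the fault are all hypothetical; the thesis implements recovery with customized ASIP instructions in hardware, which this sketch does not model.

```python
import copy


class CheckpointedCore:
    """Toy processor state with checkpoint and rollback (software analogy only)."""

    def __init__(self):
        self.regs = {"acc": 0, "pc": 0}
        self._saved = copy.deepcopy(self.regs)

    def checkpoint(self):
        """Save a copy of the current register state."""
        self._saved = copy.deepcopy(self.regs)

    def rollback(self):
        """Restore the register state from the last checkpoint."""
        self.regs = copy.deepcopy(self._saved)

    def step(self, value, inject_fault=False):
        """Accumulate one input; optionally corrupt the result with a bit flip."""
        self.regs["acc"] += value
        self.regs["pc"] += 1
        if inject_fault:
            self.regs["acc"] ^= 1 << 3  # simulated soft error


def run_with_recovery(core, inputs, interval, faulty_steps=()):
    """Process inputs in segments of `interval` steps, re-executing faulty segments.

    faulty_steps lists input indices where a transient fault is injected once.
    Fault detection is assumed to come from a separate mechanism.
    """
    pending = set(faulty_steps)
    for start in range(0, len(inputs), interval):
        segment = inputs[start:start + interval]
        while True:
            core.checkpoint()
            detected = False
            for offset, value in enumerate(segment):
                index = start + offset
                fault = index in pending
                if fault:
                    pending.discard(index)  # transient: gone on re-execution
                    detected = True
                core.step(value, inject_fault=fault)
            if not detected:
                break
            core.rollback()  # discard the corrupted segment and retry
    return core.regs["acc"]
```

A shorter checkpoint interval means less work is repeated after a fault but more checkpoints are taken in fault-free runs, which is the kind of trade-off the overhead and recovery-delay figures in the abstract refer to.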

    A Generic Framework for Design Space Exploration
