6,550 research outputs found

    Run-time management for future MPSoC platforms

    In recent years, we have been witnessing the dawning of the Multi-Processor System-on-Chip (MPSoC) era. In essence, this era is triggered by the need to handle more complex applications while reducing the overall cost of embedded (handheld) devices. This cost will mainly be determined by the cost of the hardware platform and the cost of designing applications for that platform. The cost of a hardware platform will partly depend on its production volume. In turn, this means that flexible, (easily) programmable multi-purpose platforms will exhibit a lower cost. A multi-purpose platform not only requires flexibility, but should also combine high performance with low power consumption. To this end, MPSoC devices integrate computer architectural properties of various computing domains. Just like large-scale parallel and distributed systems, they contain multiple heterogeneous processing elements interconnected by a scalable, network-like structure. This helps in achieving scalable high performance. As in most mobile or portable embedded systems, there is a need for low-power operation and real-time behavior. The cost of designing applications is equally important. Indeed, the actual value of future MPSoC devices is not contained within the embedded multiprocessor IC, but in their capability to provide the user of the device with a number of services or experiences. So from an application viewpoint, MPSoCs are designed to efficiently process multimedia content in applications like video players, video conferencing, 3D gaming, augmented reality, etc. Such applications typically require a lot of processing power and a significant amount of memory. To keep up with ever-evolving user needs and with new application standards appearing at a fast pace, MPSoC platforms need to be easily programmable. Application scalability, i.e. the ability to use just enough platform resources according to the user requirements and with respect to the device capabilities, is also an important factor. Hence scalability, flexibility, real-time behavior, high performance, low power consumption and, finally, programmability are key components in realizing the success of MPSoC platforms. The run-time manager is logically located between the application layer and the platform layer. It has a crucial role in realizing these MPSoC requirements. As it abstracts the platform hardware, it improves platform programmability. By deciding on resource assignment at run-time, based on the performance requirements of the user, the needs of the application and the capabilities of the platform, it contributes to flexibility, scalability and low-power operation. As it has an arbiter function between different applications, it enables real-time behavior. This thesis details the key components of such an MPSoC run-time manager and provides a proof-of-concept implementation. These key components include application quality management algorithms linked to MPSoC resource management mechanisms and policies, adapted to the provided MPSoC platform services. First, we describe the role, the responsibilities and the boundary conditions of an MPSoC run-time manager in a generic way. This includes a definition of the multiprocessor run-time management design space, a description of the run-time manager design trade-offs and a brief discussion on how these trade-offs affect the key MPSoC requirements. This design space definition and the trade-offs are illustrated based on ongoing research and on existing commercial and academic multiprocessor run-time management solutions. Consequently, we introduce a fast and efficient resource allocation heuristic that considers FPGA fabric properties such as fragmentation. In addition, this thesis introduces a novel task assignment algorithm for handling soft IP cores, denoted as hierarchical configuration. Hierarchical configuration managed by the run-time manager enables easier application design and increases the run-time spatial mapping freedom. In turn, this improves the performance of the resource assignment algorithm. Furthermore, we introduce run-time task migration components. We detail a new run-time task migration policy closely coupled to the run-time resource assignment algorithm. In addition to detailing a design-environment supported mechanism that enables moving tasks between an ISP and fine-grained reconfigurable hardware, we also propose two novel task migration mechanisms tailored to the Network-on-Chip environment. Finally, we propose a novel mechanism for task migration initiation, based on reusing debug registers in modern embedded microprocessors. We propose a reactive on-chip communication management mechanism. We show that by exploiting an injection rate control mechanism it is possible to provide a communication management system capable of providing a soft (reactive) QoS in a NoC. We introduce a novel, platform-independent run-time algorithm to perform quality management, i.e. to select an application quality operating point at run-time based on the user requirements and the available platform resources, as reported by the resource manager. This contribution also proposes a novel way to manage the interaction between the quality manager and the resource manager. In order to have a realistic, reproducible and flexible run-time manager testbench with respect to applications with multiple quality levels and implementation trade-offs, we have created an input data generation tool denoted Pareto Surfaces For Free (PSFF). The PSFF tool is, to the best of our knowledge, the first tool that generates multiple realistic application operating points, either based on profiling information of a real-life application or based on a designer-controlled random generator. Finally, we provide a proof-of-concept demonstrator that combines these concepts and shows how these mechanisms and policies can operate in real-life situations. In addition, we show that the proposed solutions can be integrated into existing platform operating systems.
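    The quality-management step described in this abstract (choosing an application operating point from the user requirements and the resources reported as available) can be illustrated with a minimal sketch. It is not the thesis's algorithm: the `OperatingPoint` fields, the `tiles` resource unit and the selection rule (highest quality that fits, ties broken by lower power) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OperatingPoint:
    """One hypothetical application operating point."""
    name: str
    quality: float    # higher is better (assumed quality score)
    tiles: int        # processing elements needed (assumed resource unit)
    power_mw: float   # estimated power draw


def dominates(a: OperatingPoint, b: OperatingPoint) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    no_worse = (a.quality >= b.quality and a.tiles <= b.tiles
                and a.power_mw <= b.power_mw)
    better = (a.quality > b.quality or a.tiles < b.tiles
              or a.power_mw < b.power_mw)
    return no_worse and better


def pareto_front(points: List[OperatingPoint]) -> List[OperatingPoint]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def select_point(points: List[OperatingPoint], free_tiles: int,
                 min_quality: float) -> Optional[OperatingPoint]:
    """Pick the best feasible Pareto point, or None if nothing fits."""
    feasible = [p for p in pareto_front(points)
                if p.tiles <= free_tiles and p.quality >= min_quality]
    if not feasible:
        return None
    # Assumed policy: maximize quality, then prefer the lower-power point.
    return max(feasible, key=lambda p: (p.quality, -p.power_mw))


# Hypothetical operating points for one application.
POINTS = [
    OperatingPoint("low", 1.0, 1, 100.0),
    OperatingPoint("mid", 2.0, 2, 180.0),
    OperatingPoint("mid_wasteful", 2.0, 3, 200.0),  # dominated by "mid"
    OperatingPoint("high", 3.0, 4, 400.0),
]
```

    With three free tiles and a minimum quality of 1.5, `select_point(POINTS, 3, 1.5)` returns the "mid" point; with no free tiles it returns `None`, which a real run-time manager would have to handle, for example by renegotiating with other applications.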

    Can the UNAIDS 90-90-90 target be achieved? A systematic analysis of national HIV treatment cascades

    Background: In 2014, the Joint United Nations Programme on HIV and AIDS (UNAIDS) and partners set the ‘90-90-90 targets’, aiming to diagnose 90% of all HIV positive people, provide antiretroviral therapy (ART) for 90% of those diagnosed and achieve viral suppression for 90% of those treated, by 2020. This results in 81% of all HIV positive people on treatment and 73% of all HIV positive people achieving viral suppression. We aimed to analyse how effective national HIV treatment programmes are at meeting these targets, using HIV care continuums or cascades. Methods: We searched for HIV treatment cascades for 196 countries in published papers, conference presentations, UNAIDS databases and national reports. Cascades were constructed using reliable, generalisable, recent data from national, cross-sectional and longitudinal study cohorts. Data were collected for four stages: total HIV positive people, diagnosed, on treatment and virally suppressed. The cascades were categorised as complete (four stages) or partial (three stages), and analysed for ‘break points’, defined as a drop >10% in coverage between consecutive 90-90-90 targets. Results: 69 country cascades were analysed (32 complete, 37 partial). Diagnosis (target one: 90%) ranged from 87% (the Netherlands) to 11% (Yemen). Treatment coverage (target two: 81% on ART) ranged from 71% (Switzerland) to 3% (Afghanistan). Viral suppression (target three: 73% virally suppressed) was between 68% (Switzerland) and 7% (China). Conclusions: No country analysed met the 90-90-90 targets. Diagnosis was the greatest break point globally, but the most frequent key break point for individual countries was providing ART to those diagnosed. Large disparities were identified between countries. Without commitment to standardised reporting methodologies, international comparisons are complex.
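    The 81% and 73% figures in this abstract follow from compounding the three 90% targets (0.9 × 0.9 = 0.81 and 0.9 × 0.9 × 0.9 ≈ 0.73). The sketch below reproduces that arithmetic and shows how a four-stage cascade can be turned into stage-by-stage coverage. The function names and the example counts are hypothetical, and the shortfall measure is an illustrative assumption rather than the paper's break-point definition.

```python
from typing import Dict, List


def cumulative_targets(step: float = 0.90, stages: int = 3) -> List[float]:
    """Share of ALL HIV positive people expected at each stage if every step hits `step`."""
    return [round(step ** k, 4) for k in range(1, stages + 1)]


def conditional_coverage(total: int, diagnosed: int,
                         on_art: int, suppressed: int) -> Dict[str, float]:
    """Coverage of each stage relative to the stage before it."""
    return {
        "diagnosed": diagnosed / total,
        "on_art": on_art / diagnosed,
        "suppressed": suppressed / on_art,
    }


def shortfalls(coverage: Dict[str, float], step: float = 0.90) -> Dict[str, float]:
    """How far each stage falls below the per-step target (0.0 if the target is met).

    Assumption: this is a simple gap measure, not the break-point rule used in the paper.
    """
    return {stage: round(max(0.0, step - value), 4)
            for stage, value in coverage.items()}


# Hypothetical counts, not taken from any country in the study.
EXAMPLE = conditional_coverage(total=1000, diagnosed=800, on_art=600, suppressed=540)
```

    Calling `cumulative_targets()` gives 0.9, 0.81 and 0.729, which matches the 90%, 81% and 73% quoted above after rounding. For the hypothetical cascade, `shortfalls(EXAMPLE)` points at the treatment step as the largest gap.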

    NSSDC Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications, volume 2

    This report contains copies of nearly all of the technical papers and viewgraphs presented at the NSSDC Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications. This conference served as a broad forum for the discussion of a number of important issues in the field of mass storage systems. Topics include the following: magnetic disk and tape technologies; optical disk and tape; software storage and file management systems; and experiences with the use of a large, distributed storage system. The technical presentations describe, among other things, integrated mass storage systems that are expected to be available commercially. Also included is a series of presentations from Federal Government organizations and research institutions covering their mass storage requirements for the 1990s.

    The mechanisms of leukocyte removal by filtration


    On-chip network design: mapping, management, and routing

    Thesis (Ph.D.), Seoul National University Graduate School, Department of Electrical and Information Engineering, 2016. 2. 최기영.

    For decades, advances in semiconductor technology have led us to the era of many-core systems. Today's desktop computers already have multi-core processors, and chips with more than a hundred cores are commercially available. As a communication medium for such a large number of cores, the network-on-chip (NoC) has emerged and is now used by many researchers and companies. Adopting an NoC for a many-core system raises many problems, and this thesis tries to solve some of them. The second chapter of this thesis is on mapping and scheduling of tasks on NoC-based CMP architectures. Although many papers have been published on mapping for NoCs, our work reveals that selecting the communication type, shared memory or message passing, can help improve performance and energy efficiency. Additionally, our framework supports scheduling applications containing backward dependencies with the help of modified modulo scheduling. Evolving SoCs through 3D stacking brings a number of new problems, and the thermal problem caused by increased power density is one of them. In the third chapter of this thesis, we try to mitigate the hotspot problem using DVFS techniques. Assuming that all routers as well as cores can control their voltage and frequency individually, we find voltage-frequency pairs for all cores and routers that yield the best performance within the given thermal constraint. The fourth and fifth chapters of this thesis address a different aspect. In 3D stacking, inter-layer interconnections are implemented using through-silicon vias (TSVs). TSVs usually take much more area than normal wires. Furthermore, they consume silicon area as well as metal area. For this reason, designers want to limit the number of TSVs used in their network. To limit the TSV count, there are two options: the first is to reduce the width of each vertical link, and the other is to use fewer vertical links, which results in a partially connected network. We present a routing methodology for each case. For the network with reduced-bandwidth vertical links, we propose using deflection routing to mitigate the long latency of vertical links. By balancing the vertical traffic properly, the algorithm provides improved latency. Also, a large reduction in area and energy can be obtained by removing the router buffers. For partially connected networks, we introduce a set of routing rules for selecting the vertical links. At the expense of some routing freedom, the proposed algorithm removes the virtual channel requirement for avoiding deadlock. As a result, performance can be improved or energy consumption reduced, at the designer's choice.

    Contents:
    Chapter 1 Introduction: 1.1 Task Mapping and Scheduling; 1.2 Thermal Management; 1.3 Routing for 3D Networks
    Chapter 2 Mapping and Scheduling: 2.1 Introduction; 2.2 Motivation; 2.3 Background; 2.4 Related Work; 2.5 Platform Description (2.5.1 Architecture Description; 2.5.2 Energy Model; 2.5.3 Communication Delay Model); 2.6 Problem Formulation; 2.7 Proposed Solution (2.7.1 Task and Communication Mapping; 2.7.2 Communication Type Optimization; 2.7.3 Design Space Pruning via Pre-evaluation; 2.7.4 Scheduling); 2.8 Experimental Results (2.8.1 Experiments with Coarse-grained Iterative Modulo Scheduling; 2.8.2 Comparison with Different Mapping Algorithms; 2.8.3 Experiments with Overall Algorithms; 2.8.4 Experiments with Various Local Memory Sizes; 2.8.5 Experiments with Various Placements of Shared Memory)
    Chapter 3 Thermal Management: 3.1 Introduction; 3.2 Background (3.2.1 Thermal Modeling; 3.2.2 Heterogeneity in Thermal Propagation); 3.3 Motivation and Problem Definition; 3.4 Related Work; 3.5 Orchestrated Voltage-Frequency Assignment (3.5.1 Individual PI Control Method; 3.5.2 PI Controlled Weighted-Power Budgeting; 3.5.3 Performance/Power Estimation; 3.5.4 Frequency Assignment; 3.5.5 Algorithm Overview; 3.5.6 Stability Conditions for PI Controller); 3.6 Experimental Result (3.6.1 Experimental Setup; 3.6.2 Overall Algorithm Performance; 3.6.3 Accuracy of the Estimation Model; 3.6.4 Performance of the Frequency Assignment Algorithm)
    Chapter 4 Routing for Limited Bandwidth 3D NoC: 4.1 Introduction; 4.2 Motivation; 4.3 Background; 4.4 Related Work; 4.5 3D Deflection Routing (4.5.1 Serialized TSV Model; 4.5.2 TSV Link Injection/ejection Scheme; 4.5.3 Deadlock Avoidance; 4.5.4 Livelock Avoidance; 4.5.5 Router Architecture: Putting It All Together; 4.5.6 System Level Consideration); 4.6 Experimental Results (4.6.1 Experimental Setup; 4.6.2 Results on Synthetic Traffic Patterns; 4.6.3 Results on Realistic Traffic Patterns; 4.6.4 Results on Real Application Benchmarks; 4.6.5 Fairness Issue; 4.6.6 Area Cost Comparison)
    Chapter 5 Routing for Partially Connected 3D NoC: 5.1 Introduction; 5.2 Background; 5.3 Related Work; 5.4 Proposed Algorithm (5.4.1 Preliminary; 5.4.2 Routing Algorithm for 3-D Stacked Meshes with Regular Partial Vertical Connections; 5.4.3 Routing Algorithm for 3-D Stacked Meshes with Irregular Partial Vertical Connections; 5.4.4 Extension to Heterogeneous Mesh Layers); 5.5 Experimental Results (5.5.1 Experimental Setup; 5.5.2 Experiments on Synthetic Traffics; 5.5.3 Experiments on Application Benchmarks; 5.5.4 Comparison with Reduced Bandwidth Mesh)
    Chapter 6 Conclusion
    Bibliography
    Abstract (in Korean)

    Theories and quantification of thymic selection

    The peripheral T cell repertoire is sculpted from prototypic T cells in the thymus bearing randomly generated T cell receptors (TCR) and by a series of developmental and selection steps that remove cells that are unresponsive or overly reactive to self-peptide–MHC complexes. The challenge of understanding how the kinetics of T cell development and the statistics of the selection processes combine to provide a diverse but self-tolerant T cell repertoire has invited quantitative modeling approaches, which are reviewed here

    TB STIGMA – MEASUREMENT GUIDANCE

    TB is the most deadly infectious disease in the world, and stigma continues to play a significant role in worsening the epidemic. Stigma and discrimination not only stop people from seeking care but also make it more difficult for those on treatment to continue, both of which make the disease more difficult to treat in the long-term and mean those infected are more likely to transmit the disease to those around them. TB Stigma – Measurement Guidance is a manual to help generate enough information about stigma issues to design and monitor and evaluate efforts to reduce TB stigma. It can help in planning TB stigma baseline measurements and monitoring trends to capture the outcomes of TB stigma reduction efforts. This manual is designed for health workers, professional or management staff, people who advocate for those with TB, and all who need to understand and respond to TB stigma

    Review of the occupational health and safety of Britain’s ethnic minorities

    This report sets out an evidence-based review on work-related health and safety issues relating to black and minority ethnic groups. Data included available statistical materials and a systematic review of published research and practice-based reports. UK South Asians are generally under-represented within the most hazardous occupational groups. They have lower accident rates overall, while Black Caribbean workers' rates are similar to the general population; Bangladeshi and Chinese workers report the lowest workplace injury rates. UK South Asian people exhibit higher levels of limiting long-term illness (LLI) and self-reported poor health than the general population, while Black Africans and Chinese report lower levels. Ethnic minority workers with LLI are more likely than whites to withdraw from the workforce, or to experience lower wage rates. Some of these findings conflict with evidence of differentials from USA, Europe and Australasia, but there is a dearth of effective primary research or reliable monitoring data from UK sources. There remains a need to improve monitoring and data collection relating to black and ethnic minority populations and migrant workers. Suggestions are made relating to workshops on occupational health promotion programmes for ethnic minorities, and ethnic minority health and safety 'Beacon' sites.

    Dynamic behavior specification and design space exploration techniques for real-time embedded systems

    Thesis (Ph.D.), Seoul National University Graduate School, Department of Electrical and Computer Engineering, 2016. 8. 하순회.

    As the number of processors in a chip increases and more functions are integrated, the system status changes dynamically due to various factors such as workload variation, QoS requirements, and unexpected component failure. At the same time, the computational complexity of user applications is steadily increasing: video and graphics applications are two major driving forces in smart mobile devices, which define the main application domain of interest in this dissertation. A systematic design methodology is therefore highly required to implement such complex systems, which contain dynamically changing behavior as well as computation-intensive workloads that can be parallelized. A model-based approach is one of the representative approaches to parallel embedded software development. In particular, the HOPES framework has been proposed as a design environment for parallel embedded software that supports all design steps: system specification, performance estimation, design space exploration, and automatic code generation. Unlike other design environments, it introduces a novel concept of programming platform, called CIC (Common Intermediate Code), which can be understood as a generic execution model of a heterogeneous multiprocessor architecture. The CIC task model is based on a process network model, but it can be refined to the SDF (Synchronous Data Flow) model, which has very desirable features for static analyzability as well as parallel processing. However, the SDF model has a typical weakness in expression capability, especially for system-level specification and for the dynamically changing behavior of an application. To overcome this weakness, in this dissertation we propose an extended CIC task model based on dataflow and FSM models to specify the dynamic behavior of the system, distinguishing inter- and intra-application dynamism. At the top level, each application is specified by a dataflow task, and the dynamic behavior is modeled as a control task that supervises the execution of applications. Inside a dataflow task, the dynamic behavior is specified in a way similar to FSM-based SADF: an SDF task may have multiple behaviors, and a tabular specification of an FSM, called an MTM (Mode Transition Machine), describes the mode transition rules for the SDF graph. We call this the MTM-SDF model, which is classified as a multi-mode dataflow model in the dissertation. It assumes that an application has a finite number of behaviors (or modes) and that each behavior (mode) is represented by an SDF graph. This enables us to perform compile-time scheduling of each graph to maximize the throughput for varying numbers of allocated processors, and to store the scheduling information. A multiprocessor scheduling technique is also proposed for a multi-mode dataflow graph. While several scheduling techniques exist for multi-mode dataflow models, none allows task migration between modes. Observing that the resource requirement can be further reduced if task migration is allowed, we propose a multiprocessor scheduling technique for a multi-mode dataflow graph that considers task migration between modes. Based on a genetic algorithm, the proposed technique schedules all SDF graphs in all modes simultaneously to minimize the resource requirement. To satisfy the throughput constraint, the proposed technique calculates the actual throughput requirement of each mode and the output buffer size for tolerating throughput jitter. From the specified task graph and scheduling results, the CIC translator generates parallelized code for the target architecture. The CIC translator is therefore extended to support the extended features of the CIC task model. At the application level, it is extended to support multiprocessor code generation for an MTM-SDF graph according to the given static scheduling results. Multiprocessor code generation for an MTM-SDF graph is supported for four different scheduling policies: fully-static, self-timed, static-assignment, and fully-dynamic. At the system level, the CIC translator is extended to support code generation for the implementation of system request APIs and of data structures for the static scheduling results and configurable task parameters. Through preliminary experiments with a multi-mode multimedia terminal example, the viability of the proposed methodology is verified.

    Contents:
    Chapter 1 Introduction: 1.1 Motivation; 1.2 Contribution; 1.3 Dissertation organization
    Chapter 2 Background: 2.1 Related work (2.1.1 Compiler-based approach; 2.1.2 Language-based approach; 2.1.3 Model-based approach); 2.2 HOPES framework; 2.3 Common Intermediate Code (CIC) Model
    Chapter 3 Dynamic Behavior Specification: 3.1 Problem definition (3.1.1 System-level dynamic behavior; 3.1.2 Application-level dynamic behavior); 3.2 Related work; 3.3 Motivational example; 3.4 Control task specification for system-level dynamism (3.4.1 Internal specification; 3.4.2 Action scripts); 3.5 MTM-SDF specification for application-level dynamism (3.5.1 MTM specification; 3.5.2 Task graph specification; 3.5.3 Execution semantic of an MTM-SDF graph)
    Chapter 4 Multiprocessor Scheduling of a Multi-mode Dataflow Graph: 4.1 Related work; 4.2 Motivational example (4.2.1 Throughput requirement calculation considering mode transition delay; 4.2.2 Task migration between mode transition); 4.3 Problem definition; 4.4 Throughput requirement analysis (4.4.1 Mode transition delay; 4.4.2 Arrival curves of the output buffer; 4.4.3 Buffer size determination; 4.4.4 Throughput requirement analysis); 4.5 Proposed MMDF scheduling framework (4.5.1 Optimization problem; 4.5.2 GA configuration; 4.5.3 Fitness function; 4.5.4 Local optimization technique); 4.6 Experimental results (4.6.1 MMDF scheduling technique; 4.6.2 Scalability of the Proposed Framework)
    Chapter 5 Multiprocessor Code Generation for the Extended CIC Model: 5.1 CIC translator; 5.2 Code generation for application-level dynamism (5.2.1 Function call-style code generation (fully-static, self-timed); 5.2.2 Thread-style code generation (static-assignment, fully-dynamic)); 5.3 Code generation for system-level dynamism; 5.4 Experimental results
    Chapter 6 Conclusion and Future Work
    Bibliography
    Abstract (in Korean)
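    The tabular mode-transition idea in this abstract (a finite set of modes, a table of transition rules, and a stored compile-time schedule per mode and processor count) can be sketched roughly as follows. This is not the HOPES or CIC implementation: the class, the mode and event names, the task names and the schedule layout are all hypothetical and chosen only to illustrate the concept.

```python
from typing import Dict, List, Tuple

Schedule = List[List[str]]  # one ordered task list per processor


class ModeTransitionMachine:
    """Tabular FSM: (current mode, event) -> next mode. Unlisted pairs keep the mode."""

    def __init__(self, initial_mode: str, table: Dict[Tuple[str, str], str]) -> None:
        self.mode = initial_mode
        self.table = dict(table)

    def on_event(self, event: str) -> str:
        """Apply one event and return the (possibly unchanged) current mode."""
        self.mode = self.table.get((self.mode, event), self.mode)
        return self.mode


# Hypothetical schedules computed offline, stored per mode and processor count.
SCHEDULES: Dict[str, Dict[int, Schedule]] = {
    "playback": {
        1: [["decode", "scale", "display"]],
        2: [["decode"], ["scale", "display"]],
    },
    "call": {
        1: [["capture", "encode", "decode", "display"]],
        2: [["capture", "encode"], ["decode", "display"]],
    },
}


def schedule_for(mode: str, processors: int) -> Schedule:
    """Look up the stored schedule that uses the most processors within the budget."""
    options = SCHEDULES[mode]
    usable = [n for n in options if n <= processors]
    if not usable:
        raise ValueError(f"no stored schedule for {mode!r} with {processors} processors")
    return options[max(usable)]


# Hypothetical transition table for a two-mode multimedia terminal.
MTM_TABLE = {
    ("playback", "incoming_call"): "call",
    ("call", "hang_up"): "playback",
}
```

    A run-time layer would call `on_event` when a control task reports an event and then fetch the matching schedule with `schedule_for`. Because the schedules are looked up rather than computed, a mode switch in this sketch costs only a table access.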