32 research outputs found

    Resource Management in Mobile Edge Computing for Compute-intensive Application

    With current and future mobile applications (e.g., healthcare, connected vehicles, and smart grids) becoming increasingly compute-intensive for many mission-critical use cases, the energy and computing capacities of embedded mobile devices are proving insufficient to handle all in-device computation. To address these shortages, mobile edge computing (MEC) has emerged as a major distributed computing paradigm. Compared to traditional cloud-based computing, MEC integrates network control, distributed computing, and storage to provide customizable, fast, reliable, and secure edge services that are closer to the user and the data sites. However, the diversity of applications and the variety of user-specified requirements (viz., latency, scalability, availability, and reliability) further complicate the system- and application-level optimization problems in terms of resource management. In this dissertation, we aim to develop the customized and intelligent placement and provisioning strategies needed to handle edge resource-management problems for several challenging use cases. i) First, we propose an energy-efficient framework to address the resource-allocation problem of generic compute-intensive applications, such as Directed Acyclic Graph (DAG) based applications. We design partial task offloading and server selection strategies that minimize the transmission cost. Our experimental and simulation results indicate that partial task offloading provides considerable energy savings, especially for resource-constrained edge systems. ii) Second, to address the dynamism of edge environments, we propose solutions that integrate Dynamic Spectrum Access (DSA) and Cooperative Spectrum Sensing (CSS) with fine-grained task-offloading schemes. We show that the proposed strategy is highly effective at capturing dynamic channel states and enforcing intelligent channel-sensing and task-offloading decisions. iii) Finally, application-specific long-term optimization frameworks are proposed for two representative applications: a) multi-view 3D reconstruction and b) Deep Neural Network (DNN) inference. To eliminate redundant and unnecessary reconstruction processing, we introduce key-frame and resolution selection combined with task assignment, quality prediction, and pipeline parallelization. The proposed framework provides a flexible balance between reconstruction time and quality satisfaction. For DNN inference, a joint resource-allocation and DNN-partitioning framework is proposed. The outcomes of this research seek to help the distributed computing, smart application, and data-intensive science communities build effective, efficient, and robust MEC environments.
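
    As a rough illustration of the offload-or-not decision such a framework has to make, the following sketch compares an estimated local-execution energy against the transmission energy of offloading to each candidate edge server and picks the cheaper option. All names and parameters here (Task, EdgeServer, the cycle count, the effective switched capacitance, the transmit power, the link rates) are hypothetical stand-ins rather than values or interfaces from the dissertation; a partial-offloading variant would apply the same comparison per DAG subtask rather than per application.

# Minimal sketch of an energy-aware offload-or-not decision for one task.
# All model parameters are illustrative assumptions, not the dissertation's.

from dataclasses import dataclass

@dataclass
class Task:
    cycles: float        # CPU cycles required
    input_bits: float    # data that must be uploaded if offloaded

@dataclass
class EdgeServer:
    uplink_bps: float    # achievable uplink rate to this server
    name: str = "edge-0"

def local_energy(task: Task, f_local_hz: float = 1.0e9,
                 kappa: float = 1e-27) -> float:
    """Dynamic energy of local execution: kappa * f^2 * cycles."""
    return kappa * f_local_hz ** 2 * task.cycles

def offload_energy(task: Task, server: EdgeServer,
                   tx_power_w: float = 0.5) -> float:
    """Transmission energy: P_tx * (bits / rate); edge-side energy ignored."""
    return tx_power_w * task.input_bits / server.uplink_bps

def choose_placement(task: Task, servers: list[EdgeServer]) -> str:
    """Greedy choice: cheapest offload target vs. local execution."""
    best = min(servers, key=lambda s: offload_energy(task, s))
    if offload_energy(task, best) < local_energy(task):
        return best.name
    return "local"

if __name__ == "__main__":
    t = Task(cycles=2e9, input_bits=8e6)
    servers = [EdgeServer(uplink_bps=2e7, name="edge-A"),
               EdgeServer(uplink_bps=5e7, name="edge-B")]
    print(choose_placement(t, servers))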

    WCET and Priority Assignment Analysis of Real-Time Systems using Search and Machine Learning

    Real-time systems have become indispensable to human life, as they are used in numerous industries such as vehicles, medical devices, and satellite systems. These systems are very sensitive to violations of their time constraints (deadlines), which can have catastrophic consequences. To verify whether a system meets its time constraints, engineers perform schedulability analysis from the early stages of development onward. However, obtaining precise results from schedulability analysis is challenging because worst-case execution times (WCETs) must be estimated and optimal priorities must be assigned to tasks. Estimating WCET is an important activity at early design stages of real-time systems: based on such WCET estimates, engineers make design and implementation decisions to ensure that task executions always complete before their specified deadlines. In practice, however, engineers often cannot provide a precise point estimate of WCET and prefer to provide plausible WCET ranges. Task priority assignment is an important decision, as it determines the order of task executions and has a substantial impact on schedulability results. It therefore requires finding priority assignments with which tasks not only complete their execution but also maximize the safety margins from their deadlines. Optimal priority values increase the tolerance of real-time systems to unexpected overheads in task executions so that they can still meet their deadlines. However, finding optimal priority assignments is hard because their evaluation relies on uncertain WCET values and must account for complex engineering constraints. This dissertation proposes three approaches to estimate WCETs and assign optimal priorities at design stages. Combining a genetic algorithm and logistic regression, we first propose an automated approach to infer safe WCET ranges with a probabilistic guarantee, based on worst-case scheduling scenarios. We then introduce an extended approach that accounts for weakly hard real-time systems using an industrial schedule simulator. We evaluate our approaches by applying them to industrial systems from different domains and to several synthetic systems. The results suggest that our approaches can estimate probabilistically safe WCET ranges efficiently and accurately, so that the deadline constraints are likely to be satisfied with a high degree of confidence. Moreover, we propose an automated technique that aims to identify the best possible priority assignments in real-time systems. The approach handles multiple objectives regarding safety margins and engineering constraints using a coevolutionary algorithm. Evaluation with synthetic and industrial systems shows that the approach significantly outperforms both a baseline approach and solutions defined by practitioners. All the solutions in this dissertation scale to complex industrial systems for offline analysis within an acceptable time, i.e., at most 27 hours.
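
    The combination of search and learning described above can be pictured, very roughly, as follows: sample candidate WCET values, label each one schedulable or not with some schedulability check, fit a logistic-regression model, and read off the largest WCET whose predicted probability of schedulability still meets a target confidence. In the sketch below the check is a deliberately toy stand-in, and the sampling range and the 0.99 threshold are invented; it only illustrates the overall shape, not the dissertation's actual method.

# Rough sketch: learn a probabilistically "safe" WCET upper bound.
# The schedulability check below is a toy stand-in, not a real analysis.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def toy_schedulable(wcet_ms: float) -> bool:
    """Stand-in check: a 100 ms budget with noisy interference from other tasks."""
    interference = rng.normal(loc=30.0, scale=5.0)
    return wcet_ms + interference <= 100.0

# 1) Sample candidate WCET values and label them via the (simulated) analysis.
wcets = rng.uniform(10.0, 120.0, size=2000)
labels = np.array([toy_schedulable(w) for w in wcets], dtype=int)

# 2) Fit logistic regression: P(schedulable | WCET).
model = LogisticRegression(max_iter=1000)
model.fit(wcets.reshape(-1, 1), labels)

# 3) Find the largest WCET whose predicted schedulability probability
#    still meets the target confidence (here 0.99).
grid = np.linspace(10.0, 120.0, 1000)
probs = model.predict_proba(grid.reshape(-1, 1))[:, 1]
safe = grid[probs >= 0.99]
print(f"Safe WCET range (toy model): up to {safe.max():.1f} ms" if safe.size
      else "No WCET value meets the confidence target")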

    Conditionally Optimal Parallelization of Real-Time Tasks for Global EDF Scheduling

    Ph.D. dissertation, Department of Computer Science and Engineering, College of Engineering, Seoul National University, February 2022 (advisor: Chang-Gun Lee). Real-time applications are rapidly growing in size and complexity. This trend is apparent even in extremely deadline-critical sectors such as autonomous driving and artificial intelligence. Such complex programs are described by a directed acyclic graph (DAG), where each node of the graph represents a task and the edges represent precedence relations between tasks. With this intricate structure and intensive computation requirements, extensive system-wide optimization is required to ensure stable real-time execution. On the other hand, recent advances in parallel computing frameworks, such as OpenCL and OpenMP, allow us to parallelize a real-time task into many different versions, which is called "parallelization freedom." Depending on the degree of parallelization, the thread execution times can vary significantly; more parallelization tends to reduce each thread's execution time but increase the total execution time due to parallelization overhead. By carefully selecting a "parallelization option" for each task, i.e., the number of threads the task is parallelized into, we can maximize system schedulability while satisfying real-time constraints. Because of this benefit, parallelization freedom has drawn recent attention. However, for global EDF scheduling (G-EDF for short), the concept of parallelization freedom has not yet received much attention. To this end, this dissertation proposes a way of optimally assigning parallelization options to real-time tasks under G-EDF on a multi-core system. Moreover, we aim for a polynomial-time algorithm that can be used in online situations where tasks dynamically join and leave. To achieve this, we formalize a monotonically increasing property of both tolerance and interference with respect to the parallelization option. Using these properties, we develop a uni-directional search algorithm that assigns parallelization options in polynomial time, and we formally prove its optimality. With the optimal parallelization, we observe a significant improvement of schedulability in simulation experiments, and in a follow-up implementation experiment we demonstrate that the algorithm is practically applicable to real-world use cases. The dissertation first focuses on the traditional multi-thread task model, then extends the approach to the multi-segment (MS) task model, and finally discusses the more general DAG task model to accommodate a wide range of real-world computing models.
    Table of contents: 1 Introduction: 1.1 Motivation and Objective; 1.2 Approach; 1.3 Organization. 2 Related Work: 2.1 Real-Time Multi-Core Scheduling; 2.2 Real-Time Multi-Core Task Model; 2.3 Real-Time Multi-Core Schedulability Analysis. 3 Optimal Parallelization of Multi-Thread Tasks: 3.1 Introduction; 3.2 Problem Description; 3.3 Extension of BCL Schedulability Analysis (3.3.1 Overview of BCL Schedulability Analysis; 3.3.2 Properties of Parallelization Freedom); 3.4 Optimal Assignment of Parallelization Options (3.4.1 Optimal Parallelization Assignment Algorithm; 3.4.2 Optimality of Algorithm 1; 3.4.3 Time Complexity of Algorithm 1); 3.5 Experiment Results (3.5.1 Simulation Results; 3.5.2 Simulated Schedule Results; 3.5.3 Survey on the Boundary Condition of Parallelization Freedom; 3.5.4 Autonomous Driving Task Implementation Results). 4 Conditionally Optimal Parallelization of Multi-Segment and DAG Tasks: 4.1 Introduction; 4.2 Multi-Segment Task Model; 4.3 Extension of Chwa-MS Schedulability Analysis (4.3.1 Chwa-MS Schedulability Analysis; 4.3.2 Tolerance and Interference of Multi-Segment Tasks); 4.4 Assigning Parallelization Options to Multi-Segments (4.4.1 Parallelization Route; 4.4.2 Assigning Parallelization Options to Multi-Segment Tasks; 4.4.3 Time Complexity of Algorithm 2); 4.5 DAG (Directed Acyclic Graph) Task Model; 4.6 Extension of Chwa-DAG Schedulability Analysis (4.6.1 Chwa-DAG Schedulability Analysis; 4.6.2 Tolerance and Interference of DAG Tasks); 4.7 Assigning Parallelization Options to DAG Tasks (4.7.1 Parallelization Route for the DAG Task Model; 4.7.2 Assigning Parallelization Options to DAG Tasks; 4.7.3 Time Complexity of Algorithm 3); 4.8 Experiment Results: Multi-Segment Task Model; 4.9 Experiment Results: DAG Task Model (4.9.1 Simulation Results; 4.9.2 Implementation Results). 5 Conclusion: 5.1 Summary; 5.2 Future Work. 6 References.
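
    The monotonicity property above is what makes a uni-directional search work: if a task's tolerance and the interference it imposes change monotonically with its parallelization option, options can be swept in one direction without revisiting earlier choices. The sketch below shows that shape with a crude placeholder schedulability test; the actual algorithm plugs in the extended BCL/Chwa-style G-EDF analyses listed in the table of contents, and all task parameters here are invented.

# Sketch of a uni-directional sweep over parallelization options.
# The schedulability test is a deliberately crude placeholder; the
# dissertation's algorithm uses an extended BCL / Chwa-style G-EDF analysis.

from dataclasses import dataclass

@dataclass
class Task:
    name: str
    work: float          # total sequential work (ms)
    deadline: float      # relative deadline (ms)
    overhead: float      # extra work added per additional thread (ms)
    threads: int = 1     # current parallelization option

def thread_length(t: Task) -> float:
    return t.work / t.threads + t.overhead

def total_work(t: Task) -> float:
    return t.work + t.overhead * (t.threads - 1)

def schedulable(tasks: list[Task], cores: int) -> bool:
    """Placeholder: every thread fits its deadline and total density fits the cores."""
    fits = all(thread_length(t) <= t.deadline for t in tasks)
    density = sum(total_work(t) / t.deadline for t in tasks)
    return fits and density <= cores

def assign(tasks: list[Task], cores: int, max_threads: int = 8) -> bool:
    """Sweep each task's option upward only until its own threads fit; never go back."""
    for t in tasks:
        while thread_length(t) > t.deadline and t.threads < max_threads:
            t.threads += 1
    return schedulable(tasks, cores)

if __name__ == "__main__":
    ts = [Task("perception", work=40.0, deadline=20.0, overhead=1.0),
          Task("planning",   work=15.0, deadline=20.0, overhead=1.0)]
    print(assign(ts, cores=4), [(t.name, t.threads) for t in ts])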

    Schedulability analysis and optimization of time-partitioned distributed real-time systems

    The increasing complexity of modern control systems leads many companies to have to resize or redesign their solutions to adapt them to new functionalities and requirements. A paradigmatic case of this situation has occurred in the railway sector, where signaling applications have been implemented using traditional techniques that, although they currently meet the basic requirements, leave substantial room for improvement in timing performance and functional scalability. Besides contributing to the assessment of systems that require functional-safety certification, the solutions proposed in this thesis also provide the base technology for schedulability analysis and optimization of general as well as time-partitioned distributed real-time systems, which can be applied in different environments where cyber-physical systems play a key role, for example Industry 4.0 applications, where similar problems may arise in the future.

    Computer Aided Verification

    The open-access two-volume set LNCS 11561 and 11562 constitutes the refereed proceedings of the 31st International Conference on Computer Aided Verification, CAV 2019, held in New York City, USA, in July 2019. The 52 full papers presented together with 13 tool papers and 2 case studies were carefully reviewed and selected from 258 submissions. The papers were organized in the following topical sections. Part I: automata and timed systems; security and hyperproperties; synthesis; model checking; cyber-physical systems and machine learning; probabilistic systems; runtime techniques; dynamical, hybrid, and reactive systems. Part II: logics, decision procedures, and solvers; numerical programs; verification; distributed systems and networks; verification and invariants; and concurrency.

    Sharing GPUs for Real-Time Autonomous-Driving Systems

    Autonomous vehicles at mass-market scale are on the horizon. Cameras are the least expensive of the common sensor types and can preserve features such as color and texture that other sensors cannot. Realizing full autonomy in vehicles at a reasonable cost is therefore expected to rely on computer-vision techniques. These computer-vision applications require the massive parallelism provided by shared accelerators, such as graphics processing units (GPUs), to function "in real time." However, when computer-vision researchers and GPU vendors say "real time," they usually mean "real fast"; in contrast, certifiable automotive systems must be "real time" in the sense of being predictable. This dissertation addresses the challenging problem of how GPUs can be shared predictably and efficiently in real-time autonomous-driving systems. We tackle this challenge in four steps. First, we investigate NVIDIA GPUs with respect to scheduling, synchronization, and execution. We conduct an extensive set of experiments to infer NVIDIA GPU scheduling rules, which are unfortunately undisclosed by NVIDIA and inaccessible owing to its closed-source software stack. We also expose a list of pitfalls pertaining to CPU-GPU synchronization that can result in unbounded response times for GPU-using applications. Lastly, we examine a fundamental trade-off in designing real-time tasks under different execution options. Overall, our investigation provides an essential understanding of NVIDIA GPUs, allowing us to further model and analyze GPU tasks. Second, we develop a new model and conduct schedulability analysis for GPU tasks. We extend the well-studied sporadic task model with additional parameters that characterize the parallel execution of GPU tasks. We show that the NVIDIA scheduling rules are subject to fundamental capacity loss, which implies a necessary total-utilization bound. We derive response-time bounds for GPU task systems that satisfy our schedulability conditions. Third, we address the industrial challenge of providing enough throughput in computer-vision frameworks to support the coverage and redundancy offered by an array of cameras. We rethink the design of convolutional neural network (CNN) software to better utilize hardware resources and achieve increased throughput (number of simultaneous camera streams) without any appreciable increase in per-frame latency (camera to CNN output) or reduction in per-stream accuracy. Fourth, we apply our analysis to finer-grained graph scheduling of a computer-vision standard, OpenVX, which explicitly targets embedded and real-time systems. We evaluate both the analytical and empirical real-time performance of our approach.
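
    To make the idea of an "extended sporadic task model" concrete, the sketch below describes a GPU-using task by its per-job CPU and GPU demands and applies only the obvious necessary condition that neither resource may be over-utilized. The GpuTask class, the camera parameters, and the check itself are illustrative assumptions, not the dissertation's schedulability analysis, which additionally models NVIDIA's scheduling rules and the resulting capacity loss.

# Illustrative sporadic-task model extended with a GPU execution segment,
# plus a necessary (not sufficient) utilization check. This is not the
# dissertation's actual schedulability test.

from dataclasses import dataclass

@dataclass
class GpuTask:
    name: str
    period_ms: float
    cpu_wcet_ms: float   # worst-case CPU execution per job
    gpu_wcet_ms: float   # worst-case GPU kernel execution per job

    @property
    def cpu_util(self) -> float:
        return self.cpu_wcet_ms / self.period_ms

    @property
    def gpu_util(self) -> float:
        return self.gpu_wcet_ms / self.period_ms

def passes_necessary_condition(tasks: list[GpuTask],
                               cpu_cores: int, gpus: int) -> bool:
    """Reject task sets that overload either resource; passing proves nothing."""
    return (sum(t.cpu_util for t in tasks) <= cpu_cores and
            sum(t.gpu_util for t in tasks) <= gpus)

if __name__ == "__main__":
    cams = [GpuTask(f"camera-{i}", period_ms=33.3, cpu_wcet_ms=4.0,
                    gpu_wcet_ms=9.0) for i in range(6)]
    print(passes_necessary_condition(cams, cpu_cores=6, gpus=2))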

    Response-Time Analysis of Limited-Preemptive Parallel DAG Tasks Under Global Scheduling

    Most recurrent real-time applications can be modeled as a set of sequential code segments (or blocks) that must be (repeatedly) executed in a specific order. This paper provides a schedulability analysis for such systems, modeled as a set of parallel DAG tasks executed under any limited-preemptive global job-level fixed-priority scheduling policy. More precisely, we derive response-time bounds for a set of jobs subject to precedence constraints, release jitter, and execution-time uncertainty, which enables support for a wide variety of parallel, limited-preemptive execution models (e.g., periodic DAG tasks, transactional tasks, generalized multi-frame tasks, etc.). Our analysis explores the space of all possible schedules using a powerful new state abstraction and state-pruning technique. An empirical evaluation shows that the analysis identifies between 10 and 90 percentage points more schedulable task sets than the state-of-the-art schedulability test for limited-preemptive sporadic DAG tasks. It scales to systems of up to 64 cores with 20 DAG tasks. Moreover, while our analysis is almost as accurate as the state-of-the-art exact schedulability test based on model checking (for sequential non-preemptive tasks), it is three orders of magnitude faster and hence capable of analyzing task sets with more than 60 tasks on 8 cores in a few seconds.
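
    For a sense of what a coarse DAG response-time bound looks like, the sketch below computes the classic work-and-span bound len(G) + (vol(G) - len(G)) / m for a single DAG job under work-conserving scheduling on m cores. This baseline is exactly the kind of bound that schedule-space exploration is designed to improve upon; the DAG and its WCETs are made up, and nothing here reproduces the paper's analysis.

# Classic coarse bound for one DAG job on m work-conserving cores:
#   R <= len(G) + (vol(G) - len(G)) / m
# Shown only as a baseline; the paper's schedule-space exploration is far tighter.

from functools import lru_cache

# Node -> WCET, and node -> list of successors (a toy DAG, made up for illustration).
wcet = {"src": 2, "a": 5, "b": 7, "c": 3, "sink": 1}
succ = {"src": ["a", "b"], "a": ["c"], "b": ["sink"], "c": ["sink"], "sink": []}

def volume() -> int:
    """Total work of the DAG (sum of all node WCETs)."""
    return sum(wcet.values())

@lru_cache(maxsize=None)
def longest_from(node: str) -> int:
    """Length of the longest path (critical path) starting at `node`."""
    return wcet[node] + max((longest_from(s) for s in succ[node]), default=0)

def graham_bound(m: int) -> float:
    span = longest_from("src")
    return span + (volume() - span) / m

if __name__ == "__main__":
    for m in (1, 2, 4):
        print(f"m={m}: response-time bound {graham_bound(m):.1f}")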

    Fault-tolerant satellite computing with modern semiconductors

    Miniaturized satellites enable a variety of space missions that were previously infeasible, impractical, or uneconomical with traditionally designed, heavier spacecraft. CubeSats in particular can be manufactured and launched rapidly at low cost from commercial components, even in academic environments. However, due to their low reliability and brief lifetimes, they are usually not considered suitable for life- and safety-critical services, complex multi-phased solar-system-exploration missions, or longer-duration missions. Commercial electronics are key to satellite miniaturization, but are also responsible for the low reliability: until 2019, there existed no reliable or fault-tolerant computer architectures suitable for very small satellites. To overcome this deficit, a novel on-board-computer architecture is described in this thesis. Robustness is assured not by radiation hardening but through software measures implemented within a robust-by-design multiprocessor system-on-chip. This fault-tolerant architecture is component-wise simple and can dynamically adapt to changing performance requirements throughout a mission. It can support graceful aging by exploiting FPGA reconfiguration and mixed-criticality. Experimentally, we achieve 1.94 W power consumption at 300 MHz with a Xilinx Kintex UltraScale+ proof of concept, which is well within the power-budget range of current 2U CubeSats. To our knowledge, this is the first COTS-based, reproducible on-board-computer architecture that can offer strong fault coverage even for small CubeSats.
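
    A standard software-implemented fault-tolerance building block in this design space is majority voting across redundant replicas. The sketch below shows a minimal triple-modular-redundancy vote; it illustrates the general technique under assumed inputs and is not a description of the thesis's MPSoC architecture or its specific software measures.

# Minimal software triple-modular-redundancy (TMR) vote.
# Illustrates the general technique only; the thesis's architecture is
# considerably more involved (MPSoC partitions, FPGA reconfiguration, etc.).

from collections import Counter
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")

def tmr_vote(replica_outputs: Sequence[T]) -> Optional[T]:
    """Return the majority value among replica outputs, or None on disagreement.

    With three replicas, any single corrupted output (e.g., from a radiation-
    induced bit flip) is out-voted by the two healthy ones.
    """
    counts = Counter(replica_outputs)
    value, votes = counts.most_common(1)[0]
    return value if votes > len(replica_outputs) // 2 else None

if __name__ == "__main__":
    print(tmr_vote([42, 42, 42]))   # healthy: 42
    print(tmr_vote([42, 7, 42]))    # one faulty replica: still 42
    print(tmr_vote([1, 2, 3]))      # no majority: None (triggers recovery)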

    Effective And Efficient Preemption Placement For Cache Overhead Minimization In Hard Real-Time Systems

    Schedulability analysis for real-time systems has been the subject of prominent research over the past several decades. One of the key foundations of schedulability analysis is an accurate worst-case execution time (WCET) for each task. In preemption-based real-time systems, the cache-related preemption delay (CRPD) can represent a significant component (up to 44%, as documented in the research literature) of the variability in overall task WCET. Several methods have been employed to calculate CRPD, but with significant levels of pessimism that may result in a task set being erroneously declared non-schedulable. Furthermore, they do not take into account that the CRPD cost is inherently a function of where preemptions actually occur. Our approach of computing CRPD via loaded cache blocks (LCBs) is more accurate in the sense that the modeled cache state reflects which cache blocks are reloaded and the specific program locations at which they are reloaded. Limited-preemption models attempt to minimize preemption overhead (CRPD) by reducing the number of allowed preemptions and/or allowing preemption only at program locations where the CRPD effect is minimized. These algorithms rely heavily on accurate CRPD measurements or estimation models in order to identify an optimal set of preemption points. Our approach improves the effectiveness of limited optimal preemption-point placement algorithms by calculating the LCBs for each pair of adjacent preemptions, thereby modeling task WCET more accurately and maximizing schedulability compared to existing preemption-point placement approaches. We use a dynamic-programming technique to develop an optimal preemption-point placement algorithm, and we demonstrate, using a case study, improved task-set schedulability and optimal preemption-point placement via the new LCB characterization. In summary, we propose a new CRPD metric, called loaded cache blocks (LCB), which accurately characterizes the CRPD a real-time task may be subjected to due to the preemptive execution of higher-priority tasks. We show how to integrate the LCB metric into newly developed algorithms that automatically place preemption points in linear control flow graphs (CFGs) for limited-preemption scheduling applications, and we extend the derivation of LCBs, originally proposed for linear CFGs, to conditional CFGs together with corresponding placement algorithms. As future work, we will verify the correctness of our framework against other measurable physical and hardware constraints, and we plan to complete a generalized framework that can be seamlessly integrated into real-time schedulability analysis.
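
    To illustrate the dynamic-programming flavor of preemption-point placement, the sketch below selects preemption points along a linear sequence of basic blocks so that every non-preemptive region fits within a latency budget q_max while the total cost charged at the chosen boundaries is minimized. The per-boundary costs stand in for LCB-based CRPD values, the block WCETs and q_max are invented, and the real algorithms also handle conditional CFGs.

# Sketch: optimal preemption-point placement on a linear CFG via dynamic
# programming. Boundary costs are stand-ins for LCB-based CRPD values.

import math

def place_preemption_points(block_wcet, boundary_cost, q_max):
    """Choose boundaries (index i means "preempt after block i") so that every
    non-preemptive region's WCET is at most q_max and the summed preemption
    cost of the chosen boundaries is minimal."""
    n = len(block_wcet)
    # best[i]: (min cost, chosen boundaries) for blocks 0..i-1, assuming a
    # preemption point sits right after block i-1 (or i == n: end of task).
    best = {0: (0.0, [])}
    for i in range(1, n + 1):
        region, cand = 0.0, (math.inf, [])
        for j in range(i - 1, -1, -1):      # previous point sits after block j-1
            region += block_wcet[j]
            if region > q_max:              # region j..i-1 no longer fits
                break
            prev_cost, prev_pts = best[j]
            point_cost = boundary_cost[i - 1] if i < n else 0.0
            if prev_cost + point_cost < cand[0]:
                pts = prev_pts + [i - 1] if i < n else prev_pts
                cand = (prev_cost + point_cost, pts)
        best[i] = cand
    return best[n]

if __name__ == "__main__":
    wcet = [3.0, 2.0, 4.0, 1.0, 5.0]     # basic-block WCETs (made up)
    crpd = [0.8, 0.2, 0.9, 0.3]          # stand-in CRPD at each boundary
    print(place_preemption_points(wcet, crpd, q_max=7.0))
    # -> (0.5, [1, 3]): preempt after blocks 1 and 3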

    A Response-Time Analysis for Non-Preemptive Job Sets under Global Scheduling

    An effective way to increase the timing predictability of multicore platforms is to use non-preemptive scheduling. It reduces preemption and job-migration overheads, avoids intra-core cache interference, and improves the accuracy of worst-case execution time (WCET) estimates. However, existing schedulability tests for global non-preemptive multiprocessor scheduling are pessimistic, especially when applied to periodic workloads. This paper reduces this pessimism by introducing a new type of sufficient schedulability analysis that is based on an exploration of the space of possible schedules using concise abstractions and state-pruning techniques. Specifically, we analyze the schedulability of non-preemptive job sets (with bounded release jitter and execution-time variation) scheduled by a global job-level fixed-priority (JLFP) scheduling algorithm upon an identical multicore platform. The analysis yields a lower bound on the best-case response time (BCRT) and an upper bound on the worst-case response time (WCRT) of the jobs. In an empirical evaluation with randomly generated workloads, we show that the method scales to 30 tasks, a hundred thousand jobs (per hyperperiod), and up to 9 cores.
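
    As a point of reference for the policy being analyzed, the sketch below simulates one concrete trace of global non-preemptive job-level fixed-priority scheduling on m identical cores and reports each job's response time. A single simulated trace samples only one of the many possible schedules; the paper's analysis soundly bounds best- and worst-case response times over all release-jitter and execution-time scenarios, which a simulation cannot do. The job set here is invented.

# One concrete trace of global non-preemptive JLFP scheduling on m cores.
# Illustrates the policy only; it does not bound best/worst-case response times.

from dataclasses import dataclass

@dataclass
class Job:
    name: str
    release: float
    exec_time: float
    priority: int        # lower number = higher priority
    finish: float = -1.0

def simulate(jobs: list[Job], m: int) -> None:
    core_free = [0.0] * m                   # time at which each core becomes idle
    pending = sorted(jobs, key=lambda j: (j.priority, j.release))
    while any(j.finish < 0 for j in pending):
        now = min(core_free)                # next time a core is available
        core = core_free.index(now)
        ready = [j for j in pending if j.finish < 0 and j.release <= now]
        if not ready:                       # idle until the next release
            core_free[core] = min(j.release for j in pending if j.finish < 0)
            continue
        job = min(ready, key=lambda j: j.priority)   # JLFP pick, runs to completion
        start = max(now, job.release)
        job.finish = start + job.exec_time
        core_free[core] = job.finish

if __name__ == "__main__":
    js = [Job("J1", 0.0, 4.0, 1), Job("J2", 0.0, 6.0, 2),
          Job("J3", 1.0, 3.0, 3), Job("J4", 2.0, 2.0, 4)]
    simulate(js, m=2)
    for j in js:
        print(j.name, "response time:", j.finish - j.release)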