393 research outputs found

    MARACAS: a real-time multicore VCPU scheduling framework

    Full text link
    This paper describes a multicore scheduling and load-balancing framework called MARACAS, to address shared cache and memory bus contention. It builds upon prior work centered around the concept of virtual CPU (VCPU) scheduling. Threads are associated with VCPUs that have periodically replenished time budgets. VCPUs are guaranteed to receive their periodic budgets even if they are migrated between cores. A load balancing algorithm ensures VCPUs are mapped to cores to fairly distribute surplus CPU cycles, after ensuring VCPU timing guarantees. MARACAS uses surplus cycles to throttle the execution of threads running on specific cores when memory contention exceeds a certain threshold. This enables threads on other cores to make better progress without interference from co-runners. Our scheduling framework features a novel memory-aware scheduling approach that uses performance counters to derive an average memory request latency. We show that latency-based memory throttling is more effective than rate-based memory access control in reducing bus contention. MARACAS also supports cache-aware scheduling and migration using page recoloring to improve performance isolation amongst VCPUs. Experiments show how MARACAS reduces multicore resource contention, leading to improved task progress.http://www.cs.bu.edu/fac/richwest/papers/rtss_2016.pdfAccepted manuscrip

    Semi-Partitioned Scheduling of Dynamic Real-Time Workload: A Practical Approach Based on Analysis-Driven Load Balancing

    Get PDF
    Recent work showed that semi-partitioned scheduling can achieve near-optimal schedulability performance, is simpler to implement compared to global scheduling, and less heavier in terms of runtime overhead, thus resulting in an excellent choice for implementing real-world systems. However, semi-partitioned scheduling typically leverages an off-line design to allocate tasks across the available processors, which requires a-priori knowledge of the workload. Conversely, several simple global schedulers, as global earliest-deadline first (G-EDF), can transparently support dynamic workload without requiring a task-allocation phase. Nonetheless, such schedulers exhibit poor worst-case performance. This work proposes a semi-partitioned approach to efficiently schedule dynamic real-time workload on a multiprocessor system. A linear-time approximation for the C=D splitting scheme under partitioned EDF scheduling is first presented to reduce the complexity of online scheduling decisions. Then, a load-balancing algorithm is proposed for admitting new real-time workload in the system with limited workload re-allocation. A large-scale experimental study shows that the linear-time approximation has a very limited utilization loss compared to the exact technique and the proposed approach achieves very high schedulability performance, with a consistent improvement on G-EDF and pure partitioned EDF scheduling

    Semi-Partitioned Scheduling of Dynamic Real-Time Workload: A Practical Approach Based on Analysis-Driven Load Balancing

    Get PDF
    Recent work showed that semi-partitioned scheduling can achieve near-optimal schedulability performance, is simpler to implement compared to global scheduling, and less heavier in terms of runtime overhead, thus resulting in an excellent choice for implementing real-world systems. However, semi-partitioned scheduling typically leverages an off-line design to allocate tasks across the available processors, which requires a-priori knowledge of the workload. Conversely, several simple global schedulers, as global earliest-deadline first (G-EDF), can transparently support dynamic workload without requiring a task-allocation phase. Nonetheless, such schedulers exhibit poor worst-case performance. This work proposes a semi-partitioned approach to efficiently schedule dynamic real-time workload on a multiprocessor system. A linear-time approximation for the C=D splitting scheme under partitioned EDF scheduling is first presented to reduce the complexity of online scheduling decisions. Then, a load-balancing algorithm is proposed for admitting new real-time workload in the system with limited workload re-allocation. A large-scale experimental study shows that the linear-time approximation has a very limited utilization loss compared to the exact technique and the proposed approach achieves very high schedulability performance, with a consistent improvement on G-EDF and pure partitioned EDF scheduling

    A framework for hierarchical scheduling on multiprocessors: from application requirements to run-time allocation

    Get PDF
    Hierarchical scheduling is a promising methodology for designing and deploying real-time applications, since it enables component-based design and analysis, and supports temporal isolation among competing applications. In hierarchical scheduling an application is described by means of a temporal interface. The designer faces the problem of how to derive the interface parameters so to make the application schedulable, at the same time minimizing the waste of computational resources. The problem is particularly relevant in multiprocessor systems, where it is not clear yet how the interface parameters influence the schedulability of the application and allocation on the physical platform. In this paper we present three novel contributions to hierarchical scheduling for multiprocessor systems. First, we propose the Bounded-Delay Multipartition (BDM), a new interface specification model that allows the designer to balance resource usage versus flexibility in selecting the virtual platform parameters. Second, we explore the schedulability region of a real-time application on top of a generic virtual platform, and derive the interface parameter. Finally, we propose Fluid Best-Fit, an algorithm that takes advantage of the extra degree of flexibility provided by the BDM to compute the virtual platform parameters and allocate it on the physical platform. The performance of the algorithm is evaluated by simulations

    A Framework for Hierarchical Scheduling on Multiprocessors: From Application Requirements to Run-Time Allocation

    Get PDF
    Hierarchical scheduling is a promising methodology for designing and deploying real-time applications, since it enables component-based design and analysis, and supports temporal isolation among competing applications. In hierarchical scheduling an application is described by means of a temporal interface. The designer faces the problem of how to derive the interface parameters so to make the application schedulable, at the same time minimizing the waste of computational resources. The problem is particularly relevant in multiprocessor systems, where it is not clear yet how the interface parameters influence the schedulability of the application and allocation on the physical platform. In this paper we present three novel contributions to hierarchical scheduling for multiprocessor systems. First, we propose the Bounded-Delay Multipartition (BDM), a new interface specification model that allows the designer to balance resource usage versus flexibility in selecting the virtual platform parameters. Second, we explore the schedulability region of a real-time application on top of a generic virtual platform, and derive the interface parameter. Finally, we propose Fluid Best-Fit, an algorithm that takes advantage of the extra degree of flexibility provided by the BDM to compute the virtual platform parameters and allocate it on the physical platform. The performance of the algorithm is evaluated by simulations

    A C-DAG task model for scheduling complex real-time tasks on heterogeneous platforms: preemption matters

    Full text link
    Recent commercial hardware platforms for embedded real-time systems feature heterogeneous processing units and computing accelerators on the same System-on-Chip. When designing complex real-time application for such architectures, the designer needs to make a number of difficult choices: on which processor should a certain task be implemented? Should a component be implemented in parallel or sequentially? These choices may have a great impact on feasibility, as the difference in the processor internal architectures impact on the tasks' execution time and preemption cost. To help the designer explore the wide space of design choices and tune the scheduling parameters, in this paper we propose a novel real-time application model, called C-DAG, specifically conceived for heterogeneous platforms. A C-DAG allows to specify alternative implementations of the same component of an application for different processing engines to be selected off-line, as well as conditional branches to model if-then-else statements to be selected at run-time. We also propose a schedulability analysis for the C-DAG model and a heuristic allocation algorithm so that all deadlines are respected. Our analysis takes into account the cost of preempting a task, which can be non-negligible on certain processors. We demonstrate the effectiveness of our approach on a large set of synthetic experiments by comparing with state of the art algorithms in the literature

    A Survey of Research into Mixed Criticality Systems

    Get PDF
    This survey covers research into mixed criticality systems that has been published since Vestal’s seminal paper in 2007, up until the end of 2016. The survey is organised along the lines of the major research areas within this topic. These include single processor analysis (including fixed priority and EDF scheduling, shared resources and static and synchronous scheduling), multiprocessor analysis, realistic models, and systems issues. The survey also explores the relationship between research into mixed criticality systems and other topics such as hard and soft time constraints, fault tolerant scheduling, hierarchical scheduling, cyber physical systems, probabilistic real-time systems, and industrial safety standards

    Analysis and implementation of the multiprocessor bandwidth inheritance protocol

    Get PDF
    The Multiprocessor Bandwidth Inheritance (M-BWI) protocol is an extension of the Bandwidth Inheritance (BWI) protocol for symmetric multiprocessor systems. Similar to Priority Inheritance, M-BWI lets a task that has locked a resource execute in the resource reservations of the blocked tasks, thus reducing their blocking time. The protocol is particularly suitable for open systems where different kinds of tasks dynamically arrive and leave, because it guarantees temporal isolation among independent subsets of tasks without requiring any information on their temporal parameters. Additionally, if the temporal parameters of the interacting tasks are known, it is possible to compute an upper bound to the interference suffered by a task due to other interacting tasks. Thus, it is possible to provide timing guarantees for a subset of interacting hard real-time tasks. Finally, the M-BWI protocol is neutral to the underlying scheduling policy: it can be implemented in global, clustered and semi-partitioned scheduling. After introducing the M-BWI protocol, in this paper we formally prove its isolation properties, and propose an algorithm to compute an upper bound to the interference suffered by a task. Then, we describe our implementation of the protocol for the LITMUS RT real-time testbed, and measure its overhead. Finally, we compare M-BWI against FMLP and OMLP, two other protocols for resource sharing in multiprocessor systems

    Power-aware scheduling with effective task migration for real-time multicore embedded systems

    Full text link
    A major design issue in embedded systems is reducing the power consumption because batteries have a limited energy budget. For this purpose, several techniques such as dynamic voltage and frequency scaling (DVFS) or task migration are being used. DVFS allows reducing power by selecting the optimal voltage supply, whereas task migration achieves this effect by balancing the workload among cores. This paper focuses on power-aware scheduling allowing task migration to reduce energy consumption in multicore embedded systems implementing DVFS capabilities. To address energy savings, the devised schedulers follow two main rules: migrations are allowed at specific points of time and only one task is allowed to migrate each time. Two algorithms have been proposed working under real-time constraints. The simpler algorithm, namely, single option migration (SOM) only checks just one target core before performing a migration. In contrast, the multiple option migration (MOM) searches the optimal target core. In general, the MOM algorithm achieves better energy savings than the SOM algorithm, although differences are wider for a reduced number of cores and frequency/voltage levels. Moreover, the MOM algorithm reduces energy consumption as much as 40% over the worst fit algorithm.This work was supported by the Spanish MICINN, Consolider Programme and Plan E funds, as well as European Commission FEDER funds, under Grants CSD2006-00046 and TIN2009-14475-C04-01.March Cabrelles, JL.; Sahuquillo Borrás, J.; Petit Martí, SV.; Hassan Mohamed, H.; Duato Marín, JF. (2013). Power-aware scheduling with effective task migration for real-time multicore embedded systems. Concurrency and Computation: Practice and Experience. 25(14):1987-2001. doi:10.1002/cpe.2899S198720012514Euiseong Seo, Jinkyu Jeong, Seonyeong Park, & Joonwon Lee. (2008). Energy Efficient Scheduling of Real-Time Tasks on Multicore Processors. IEEE Transactions on Parallel and Distributed Systems, 19(11), 1540-1552. doi:10.1109/tpds.2008.104March, J. L., Sahuquillo, J., Hassan, H., Petit, S., & Duato, J. (2011). A New Energy-Aware Dynamic Task Set Partitioning Algorithm for Soft and Hard Embedded Real-Time Systems. The Computer Journal, 54(8), 1282-1294. doi:10.1093/comjnl/bxr008AlEnawy, T. A., & Aydin, H. (s. f.). Energy-Aware Task Allocation for Rate Monotonic Scheduling. 11th IEEE Real Time and Embedded Technology and Applications Symposium. doi:10.1109/rtas.2005.20Intel atom processor microarchitecture www.intel.com/Marvell ARMADA TM 628 Marvell Semiconductor, Inc. Santa Clara, CA, USA http://www.marvell.com/company/press_kit/assets/Marvell_ARMADA_628_Release_FINAL3.pdfMcNairy, C., & Bhatia, R. (2005). Montecito: A Dual-Core, Dual-Thread Itanium Processor. IEEE Micro, 25(2), 10-20. doi:10.1109/mm.2005.34Kalla, R., Sinharoy, B., & Tendler, J. M. (2004). IBM power5 chip: a dual-core multithreaded processor. IEEE Micro, 24(2), 40-47. doi:10.1109/mm.2004.1289290Shah A Arm plans to add multithreading to chip design 2010 http://www.itworld.com/hardware/122383/arm-plans-add-multithreading-chip-designSchranzhofer, A., Chen, J.-J., & Thiele, L. (2010). Dynamic Power-Aware Mapping of Applications onto Heterogeneous MPSoC Platforms. IEEE Transactions on Industrial Informatics, 6(4), 692-707. doi:10.1109/tii.2010.2062192Cazorla, F. J., Knijnenburg, P. M. W., Sakellariou, R., Fernandez, E., Ramirez, A., & Valero, M. (2006). Predictable performance in SMT processors: synergy between the OS and SMTs. IEEE Transactions on Computers, 55(7), 785-799. doi:10.1109/tc.2006.108Fisher, N., & Baruah, S. (2008). The feasibility of general task systems with precedence constraints on multiprocessor platforms. Real-Time Systems, 41(1), 1-26. doi:10.1007/s11241-008-9054-5Buttazzo, G., Bini, E., & Yifan Wu. (2011). Partitioning Real-Time Applications Over Multicore Reservations. IEEE Transactions on Industrial Informatics, 7(2), 302-315. doi:10.1109/tii.2011.2123902Intel Pentium M processor datasheet INTEL Corp. Santa Clara, CA, USA 2004 http://download.intel.com/support/processors/mobile/pm/sb/25261203.pdfChaparro, P., Gonzáles, J., Magklis, G., Cai, Q., & González, A. (2007). Understanding the Thermal Implications of Multi-Core Architectures. IEEE Transactions on Parallel and Distributed Systems, 18(8), 1055-1065. doi:10.1109/tpds.2007.1092WCET analysis project. WCET benchmark programs 2006 http://www.mrtc.mdh.se/projects/wcet
    • …
    corecore