    MARACAS: a real-time multicore VCPU scheduling framework

    This paper describes MARACAS, a multicore scheduling and load-balancing framework that addresses shared cache and memory bus contention. It builds upon prior work centered around the concept of virtual CPU (VCPU) scheduling. Threads are associated with VCPUs that have periodically replenished time budgets. VCPUs are guaranteed to receive their periodic budgets even if they are migrated between cores. A load-balancing algorithm maps VCPUs to cores so that surplus CPU cycles are distributed fairly once VCPU timing guarantees have been met. MARACAS uses surplus cycles to throttle the execution of threads running on specific cores when memory contention exceeds a certain threshold. This enables threads on other cores to make better progress without interference from co-runners. Our scheduling framework features a novel memory-aware scheduling approach that uses performance counters to derive an average memory request latency. We show that latency-based memory throttling is more effective than rate-based memory access control in reducing bus contention. MARACAS also supports cache-aware scheduling and migration using page recoloring to improve performance isolation amongst VCPUs. Experiments show how MARACAS reduces multicore resource contention, leading to improved task progress. http://www.cs.bu.edu/fac/richwest/papers/rtss_2016.pdf Accepted manuscript
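The budget and throttling ideas in this abstract can be illustrated with a minimal sketch. It is not the MARACAS implementation: the integer tick model, the names `VCPU`, `simulate`, `average_latency`, and `should_throttle`, and the counter inputs `occupancy_cycles` and `requests` are all assumptions made for illustration. The sketch only shows a budget that is refilled every period and a throttle decision driven by an average latency derived from two counter readings.

```python
from dataclasses import dataclass


@dataclass
class VCPU:
    budget: int         # guaranteed ticks per period (assumed integer tick model)
    period: int         # replenishment period in ticks
    remaining: int = 0  # budget left in the current period


def simulate(vcpu: VCPU, ticks: int) -> list[int]:
    """Return the number of ticks the VCPU ran in each period."""
    served = []
    for now in range(ticks):
        if now % vcpu.period == 0:
            # Start of a new period: refill the budget.
            vcpu.remaining = vcpu.budget
            served.append(0)
        if vcpu.remaining > 0:
            vcpu.remaining -= 1
            served[-1] += 1
    return served


def average_latency(occupancy_cycles: int, requests: int) -> float:
    """Hypothetical estimate: cycles that requests were outstanding, per request."""
    return occupancy_cycles / requests if requests else 0.0


def should_throttle(occupancy_cycles: int, requests: int, threshold: float) -> bool:
    """Throttle when the estimated average latency exceeds the threshold."""
    return average_latency(occupancy_cycles, requests) > threshold
```

For example, `simulate(VCPU(budget=2, period=5), 15)` gives the VCPU two ticks in each of three periods, and `should_throttle(1000, 10, 50.0)` is true because the estimated latency is 100 cycles per request.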

    On Asymptotic Optimality of Dual Scheduling Algorithm In A Generalized Switch

    The generalized switch is a model of a queueing system in which parallel servers are interdependent and have time-varying service capabilities. This paper considers the dual scheduling algorithm, which uses rate control and queue-length-based scheduling to allocate resources for a generalized switch. We consider a saturated system in which each user has an infinite amount of data to be served. We prove the asymptotic optimality of the dual scheduling algorithm for such a system: the vector of average service rates achieved by the algorithm maximizes an aggregate concave utility function. As fairness objectives can be achieved by choosing the utility functions appropriately, asymptotic optimality establishes the fairness properties of the dual scheduling algorithm. The dual scheduling algorithm motivates a new architecture for scheduling, in which an additional queue is introduced to interface the user data queue and the time-varying server and to modulate the scheduling process, so as to achieve different performance objectives. Further research would include scheduling with Quality of Service guarantees using the dual scheduler, and its application and implementation in various versions of the generalized switch model.
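The combination of rate control and queue-length-based scheduling described in this abstract can be sketched as follows. This is a generic dual-style formulation under stated assumptions, not the paper's exact algorithm: it assumes a logarithmic utility, a hypothetical step size `step` and rate cap `x_max`, and a finite list `service_options` of feasible service-rate vectors for the current server state. The list `queues` plays the role of the additional interface queue, which acts as a price signal for the rate controller.

```python
def rate_control(queues, step=0.1, x_max=10.0):
    """Admission rate per user under log utility: x_i = min(x_max, 1 / (step * q_i))."""
    return [x_max if q <= 0 else min(x_max, 1.0 / (step * q)) for q in queues]


def max_weight(queues, service_options):
    """Index of the feasible service vector that maximises sum_i q_i * mu_i."""
    return max(
        range(len(service_options)),
        key=lambda k: sum(q * mu for q, mu in zip(queues, service_options[k])),
    )


def step_queues(queues, arrivals, service):
    """One slot of queue dynamics: add admitted data, remove served data, floor at 0."""
    return [max(0.0, q + a - s) for q, a, s in zip(queues, arrivals, service)]
```

A long queue lowers that user's admitted rate and raises its scheduling weight, which is the feedback loop that pushes the average service rates toward the utility-maximising point.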

    Scheduling for next generation WLANs: filling the gap between offered and observed data rates

    In wireless networks, opportunistic scheduling is used to increase system throughput by exploiting multi-user diversity. Although recent advances have increased the physical layer data rates supported in wireless local area networks (WLANs), the throughput actually realized is significantly lower due to overhead. Accordingly, frame aggregation is used in next generation WLANs to improve efficiency. However, with frame aggregation, traditional opportunistic schemes are no longer optimal. In this paper, we propose schedulers that take queue and channel conditions into account jointly, to maximize the throughput observed at the users in next generation WLANs. We also extend this work to design two schedulers that perform block scheduling to maximize network throughput over multiple transmission sequences. For these schedulers, which make decisions over long time durations, we model the system using queueing theory and determine users' temporal access proportions according to this model. Through detailed simulations, we show that all our proposed algorithms offer significant throughput improvement, better fairness, and much lower delay compared with traditional opportunistic schedulers, facilitating the practical use of the evolving standard for next generation wireless networks.

    Multi-core job submission and grid resource scheduling for ATLAS AthenaMP

    AthenaMP is the multi-core implementation of the ATLAS software framework and allows the efficient sharing of memory pages between multiple threads of execution. It has now been validated for production and delivers a significant reduction in the overall application memory footprint with negligible CPU overhead. Before AthenaMP can be routinely run on the LHC Computing Grid, it must be determined how the computing resources available to ATLAS can best exploit the notable improvements delivered by switching to this multi-process model. A study into the effectiveness and scalability of AthenaMP in a production environment will be presented. Best practices for configuring the main LRMS implementations currently used by grid sites will be identified in the context of multi-core scheduling optimisation.