1,840 research outputs found

    TimeTrader: Exploiting Latency Tail to Save Datacenter Energy for On-line Data-Intensive Applications

    Get PDF
    Datacenters running on-line, data-intensive applications (OLDIs) consume significant amounts of energy. However, reducing their energy is challenging due to their tight response time requirements. A key aspect of OLDIs is that each user query goes to all or many of the nodes in the cluster, so that the overall time budget is dictated by the tail of the replies' latency distribution; replies see latency variations both in the network and compute. Previous work proposes to achieve load-proportional energy by slowing down the computation at lower datacenter loads based directly on response times (i.e., at lower loads, the proposal exploits the average slack in the time budget provisioned for the peak load). In contrast, we propose TimeTrader to reduce energy by exploiting the latency slack in the sub- critical replies which arrive before the deadline (e.g., 80% of replies are 3-4x faster than the tail). This slack is present at all loads and subsumes the previous work's load-related slack. While the previous work shifts the leaves' response time distribution to consume the slack at lower loads, TimeTrader reshapes the distribution at all loads by slowing down individual sub-critical nodes without increasing missed deadlines. TimeTrader exploits slack in both the network and compute budgets. Further, TimeTrader leverages Earliest Deadline First scheduling to largely decouple critical requests from the queuing delays of sub- critical requests which can then be slowed down without hurting critical requests. A combination of real-system measurements and at-scale simulations shows that without adding to missed deadlines, TimeTrader saves 15-19% and 41-49% energy at 90% and 30% loading, respectively, in a datacenter with 512 nodes, whereas previous work saves 0% and 31-37%.Comment: 13 page

    A project to investigate mechanisms and methodologies for the design and construction of communicating concurrent processes in real-time environments

    Get PDF
    Research undertaken in 1979 into effective and appropriate mechanisms to aid in the design and construction of software for use in the flight research programs undertaken by NASA is presented

    Capacity sharing and stealing in serverbased real-time systems

    Get PDF
    A dynamic scheduler that supports the coexistence of guaranteed and non-guaranteed bandwidth servers is proposed. Overloads are handled by an efficient reclaiming of residual capacities originated by early completions as well as by allowing reserved capacity stealing of non-guaranteed bandwidth servers. The proposed dynamic budget accounting mechanism ensures that at a particular time the currently executing server is using a residual capacity, its own capacity or is stealing some reserved capacity, eliminating the need of additional server states or unbounded queues. The server to which the budget accounting is going to be performed is dynamically determined at the time instant when a capacity is needed. This paper describes and evaluates the proposed scheduling algorithm, showing that it can efficiently reduce the mean tardiness of periodic jobs. The achieved results become even more significant when tasks’ computation times have a large variance

    NUMFabric: Fast and Flexible Bandwidth Allocation in Datacenters

    Get PDF
    We present xFabric, a novel datacenter transport design that provides flexible and fast bandwidth allocation control. xFabric is flexible: it enables operators to specify how bandwidth is allocated amongst contending flows to optimize for different service-level objectives such as minimizing flow completion times, weighted allocations, different notions of fairness, etc. xFabric is also very fast, it converges to the specified allocation one-to-two order of magnitudes faster than prior schemes. Underlying xFabric, is a novel distributed algorithm that uses in-network packet scheduling to rapidly solve general network utility maximization problems for bandwidth allocation. We evaluate xFabric using realistic datacenter topologies and highly dynamic workloads and show that it is able to provide flexibility and fast convergence in such stressful environments.Google Faculty Research Awar

    Flexible Scheduling in Middleware for Distributed rate-based real-time applications - Doctoral Dissertation, May 2002

    Get PDF
    Distributed rate-based real-time systems, such as process control and avionics mission computing systems, have traditionally been scheduled statically. Static scheduling provides assurance of schedulability prior to run-time overhead. However, static scheduling is brittle in the face of unanticipated overload, and treats invocation-to-invocation variations in resource requirements inflexibly. As a consequence, processing resources are often under-utilized in the average case, and the resulting systems are hard to adapt to meet new real-time processing requirements. Dynamic scheduling offers relief from the limitations of static scheduling. However, dynamic scheduling offers relief from the limitations of static scheduling. However, dynamic scheduling often has a high run-time cost because certain decisions are enforced on-line. Furthermore, under conditions of overload tasks can be scheduled dynamically that may never be dispatched, or that upon dispatch would miss their deadlines. We review the implications of these factors on rate-based distributed systems, and posits the necessity to combine static and dynamic approaches to exploit the strengths and compensate for the weakness of either approach in isolation. We present a general hybrid approach to real-time scheduling and dispatching in middleware, that can employ both static and dynamic components. This approach provides (1) feasibility assurance for the most critical tasks, (2) the ability to extend this assurance incrementally to operations in successively lower criticality equivalence classes, (3) the ability to trade off bounds on feasible utilization and dispatching over-head in cases where, for example, execution jitter is a factor or rates are not harmonically related, and (4) overall flexibility to make more optimal use of scarce computing resources and to enforce a wider range of application-specified execution requirements. This approach also meets additional constraints of an increasingly important class of rate-based systems, those with requirements for robust management of real-time performance in the face of rapidly and widely changing operating conditions. To support these requirements, we present a middleware framework that implements the hybrid scheduling and dispatching approach described above, and also provides support for (1) adaptive re-scheduling of operations at run-time and (2) reflective alternation among several scheduling strategies to improve real-time performance in the face of changing operating conditions. Adaptive re-scheduling must be performed whenever operating conditions exceed the ability of the scheduling and dispatching infrastructure to meet the critical real-time requirements of the system under the currently specified rates and execution times of operations. Adaptive re-scheduling relies on the ability to change the rates of execution of at least some operations, and may occur under the control of a higher-level middleware resource manager. Different rates of execution may be specified under different operating conditions, and the number of such possible combinations may be arbitrarily large. Furthermore, adaptive rescheduling may in turn require notification of rate-sensitive application components. It is therefore desirable to handle variations in operating conditions entirely within the scheduling and dispatching infrastructure when possible. A rate-based distributed real-time application, or a higher-level resource manager, could thus fall back on adaptive re-scheduling only when it cannot achieve acceptable real-time performance through self-adaptation. Reflective alternation among scheduling heuristics offers a way to tune real-time performance internally, and we offer foundational support for this approach. In particular, run-time observable information such as that provided by our metrics-feedback framework makes it possible to detect that a given current scheduling heuristic is underperforming the level of service another could provide. Furthermore we present empirical results for our framework in a realistic avionics mission computing environment. This forms the basis for guided adaption. This dissertation makes five contributions in support of flexible and adaptive scheduling and dispatching in middleware. First, we provide a middle scheduling framework that supports arbitrary and fine-grained composition of static/dynamic scheduling, to assure critical timeliness constraints while improving noncritical performance under a range of conditions. Second, we provide a flexible dispatching infrastructure framework composed of fine-grained primitives, and describe how appropriate configurations can be generated automatically based on the output of the scheduling framework. Third, we describe algorithms to reduce the overhead and duration of adaptive rescheduling, based on sorting for rate selection and priority assignment. Fourth, we provide timely and efficient performance information through an optimized metrics-feedback framework, to support higher-level reflection and adaptation decisions. Fifth, we present the results of empirical studies to quantify and evaluate the performance of alternative canonical scheduling heuristics, across a range of load and load jitter conditions. These studies were conducted within an avionics mission computing applications framework running on realistic middleware and embedded hardware. The results obtained from these studies (1) demonstrate the potential benefits of reflective alternation among distinct scheduling heuristics at run-time, and (2) suggest performance factors of interest for future work on adaptive control policies and mechanisms using this framework

    On the schedulability of deadline-constrained traffic in TDMA Wireless Mesh Networks

    Get PDF
    In this paper, we evaluate the schedulability of traffic with arbitrary end-to-end deadline constraints in Wireless Mesh Networks (WMNs). We formulate the problem as a mixed integer linear optimization problem, and show that, depending on the flow aggregation policy used in the network, the problem can be either convex or non-convex. We optimally solve the problem in both cases, and prove that the schedulability does depend on the aggregation policy. This allows us to derive rules of thumb to identify which policy improves the schedulability with a given traffic. Furthermore, we propose a heuristic solution strategy that allows good suboptimal solutions to the scheduling problem to be computed in relatively small times, comparable to those required for online admission control in relatively large WMNs
    • …
    corecore