Search CORE

4,320 research outputs found

Scheduling for the tail: Robustness versus optimality

Author: de Haan Jakob
Li Suxiao
Scholtens Bert
Yang Haizhen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2010
Field of study

When scheduling to minimize the sojourn time tail, the goals of optimality and robustness are seemingly at odds. Over the last decade, results have emerged which show that scheduling disciplines that are near-optimal under light (exponential) tailed workload distributions do not perform well under heavy (power) tailed workload distributions, and vice-versa. Very recently, it has been shown that this conflict between optimality and robustness is fundamental, i.e., no policy that does not learn information about the workload can be optimal across both light-tailed and heavy-tailed workloads. In this paper we show that one can exploit very limited workload information (the system load) in order to design a scheduler that provides robust performance across heavy-tailed and light-tailed workloads

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Caltech Authors

University of St. Andrews - Pure

Dissertations of the University of Groningen

Separation of timescales in a two-layered network

Author: Vlasiou Maria
Zhang Jiheng
Zwart Bert
Publication venue
Publication date: 01/01/2014
Field of study

We investigate a computer network consisting of two layers occurring in, for example, application servers. The first layer incorporates the arrival of jobs at a network of multi-server nodes, which we model as a many-server Jackson network. At the second layer, active servers at these nodes act now as customers who are served by a common CPU. Our main result shows a separation of time scales in heavy traffic: the main source of randomness occurs at the (aggregate) CPU layer; the interactions between different types of nodes at the other layer is shown to converge to a fixed point at a faster time scale; this also yields a state-space collapse property. Apart from these fundamental insights, we also obtain an explicit approximation for the joint law of the number of jobs in the system, which is provably accurate for heavily loaded systems and performs numerically well for moderately loaded systems. The obtained results for the model under consideration can be applied to thread-pool dimensioning in application servers, while the technique seems applicable to other layered systems too.Comment: 8 pages, 2 figures, 1 table, ITC 24 (2012

arXiv.org e-Print Archive

Pure OAI Repository

Revisiting Actor Programming in C++

Author: Charousset Dominik
Hiesgen Raphael
Schmidt Thomas C.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015
Field of study

The actor model of computation has gained significant popularity over the last decade. Its high level of abstraction makes it appealing for concurrent applications in parallel and distributed systems. However, designing a real-world actor framework that subsumes full scalability, strong reliability, and high resource efficiency requires many conceptual and algorithmic additives to the original model. In this paper, we report on designing and building CAF, the "C++ Actor Framework". CAF targets at providing a concurrent and distributed native environment for scaling up to very large, high-performance applications, and equally well down to small constrained systems. We present the key specifications and design concepts---in particular a message-transparent architecture, type-safe message interfaces, and pattern matching facilities---that make native actors a viable approach for many robust, elastic, and highly distributed developments. We demonstrate the feasibility of CAF in three scenarios: first for elastic, upscaling environments, second for including heterogeneous hardware like GPGPUs, and third for distributed runtime systems. Extensive performance evaluations indicate ideal runtime behaviour for up to 64 cores at very low memory footprint, or in the presence of GPUs. In these tests, CAF continuously outperforms the competing actor environments Erlang, Charm++, SalsaLite, Scala, ActorFoundry, and even the OpenMPI.Comment: 33 page

arXiv.org e-Print Archive

REPOSIT

Datacenter Traffic Control: Understanding Techniques and Trade-offs

Author: Noormohammadpour Mohammad
Raghavendra Cauligi S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/12/2017
Field of study

Datacenters provide cost-effective and flexible access to scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected with a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and maximize performance, it deems necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements. This includes user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters and not to survey all existing solutions (as it is virtually impossible due to massive body of existing research). We hope to provide readers with a wide range of options and factors while considering a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks that connect geographically dispersed datacenters which have been receiving increasing attention recently and pose interesting and novel research problems.Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial

arXiv.org e-Print Archive

ZENODO

FigShare

The Parallel Persistent Memory Model

Author: Berryhill R.
Blelloch G. E.
Buettner M.
Chauhan H.
Herlihy M.
JaJa J.
Lee S. K.
Meena J. S.
Nawab F.
Pelley S.
Woude J. Van Der
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 13/06/2018
Field of study

We consider a parallel computational model that consists of

P

processors, each with a fast local ephemeral memory of limited size, and sharing a large persistent memory. The model allows for each processor to fault with bounded probability, and possibly restart. On faulting all processor state and local ephemeral memory are lost, but the persistent memory remains. This model is motivated by upcoming non-volatile memories that are as fast as existing random access memory, are accessible at the granularity of cache lines, and have the capability of surviving power outages. It is further motivated by the observation that in large parallel systems, failure of processors and their caches is not unusual. Within the model we develop a framework for developing locality efficient parallel algorithms that are resilient to failures. There are several challenges, including the need to recover from failures, the desire to do this in an asynchronous setting (i.e., not blocking other processors when one fails), and the need for synchronization primitives that are robust to failures. We describe approaches to solve these challenges based on breaking computations into what we call capsules, which have certain properties, and developing a work-stealing scheduler that functions properly within the context of failures. The scheduler guarantees a time bound of

O(W/P_A + D(P/P_A) \lceil\log_{1/f} W\rceil)

in expectation, where

W

and

D

are the work and depth of the computation (in the absence of failures),

P_A

is the average number of processors available during the computation, and

f \le 1/2

is the probability that a capsule fails. Within the model and using the proposed methods, we develop efficient algorithms for parallel sorting and other primitives.Comment: This paper is the full version of a paper at SPAA 2018 with the same nam

arXiv.org e-Print Archive

Crossref

DSpace@MIT

Is Tail-Optimal Scheduling Possible?

Author: Wierman Adam
Zwart Bert
Publication venue: Institute for Operations Research and Management Sciences
Publication date: 01/01/2012
Field of study

This paper focuses on the competitive analysis of scheduling disciplines in a large deviations setting. Although there are policies that are known to optimize the sojourn time tail under a large class of heavy-tailed job sizes (e.g., processor sharing and shortest remaining processing time) and there are policies known to optimize the sojourn time tail in the case of light-tailed job sizes (e.g., first come first served), no policies are known that can optimize the sojourn time tail across both light- and heavy-tailed job size distributions. We prove that no such work-conserving, nonanticipatory, nonlearning policy exists, and thus that a policy must learn (or know) the job size distribution in order to optimize the sojourn time tail

CiteSeerX

Repository TU/e

VU Research Portal

CWI's Institutional Repository

Pure OAI Repository

Caltech Authors

Characterizing Policies with Optimal Response Time Tails under Heavy-Tailed Job Sizes

Author: Boxma Onno
Dorsman Jan-Pieter
Scully Ziv
van Kreveld Lucas
Wierman Adam
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2020
Field of study

We consider the tail behavior of the response time distribution in an M/G/1 queue with heavy-tailed job sizes, specifically those with intermediately regularly varying tails. In this setting, the response time tail of many individual policies has been characterized, and it is known that policies such as Shortest Remaining Processing Time (SRPT) and Foreground-Background (FB) have response time tails of the same order as the job size tail, and thus such policies are tail-optimal. Our goal in this work is to move beyond individual policies and characterize the set of policies that are tail-optimal. Toward that end, we use the recently introduced SOAP framework to derive sufficient conditions on the form of prioritization used by a scheduling policy that ensure the policy is tail-optimal. These conditions are general and lead to new results for important policies that have previously resisted analysis, including the Gittins policy, which minimizes mean response time among policies that do not have access to job size information. As a by-product of our analysis, we derive a general upper bound for fractional moments of M/G/1 busy periods, which is of independent interest

Crossref

Caltech Authors

International Migration, Integration and Social Cohesion online publications

UvA-DARE