9 research outputs found

    Matroid Coflow Scheduling

    We consider the matroid coflow scheduling problem, where each job comprises a set of flows and the family of sets that can be scheduled at any time forms a matroid. Our main result is a polynomial-time algorithm that yields a 2-approximation for the objective of minimizing the weighted completion time. This result is tight assuming P != NP. As a by-product, we also obtain the first (2+epsilon)-approximation algorithm for the preemptive concurrent open shop scheduling problem.

    Integrality Gap of Time-Indexed Linear Programming Relaxation for Coflow Scheduling

    A coflow is a set of related parallel data flows in a network. The goal of coflow scheduling is to process all the demands of the given coflows while minimizing the weighted completion time. It is known that the coflow scheduling problem admits several polynomial-time 5-approximation algorithms that compute solutions by rounding linear programming (LP) relaxations of the problem. In this paper, we investigate the time-indexed LP relaxation for coflow scheduling. We show that the integrality gap of the time-indexed LP relaxation is at most 4. We also show that yet another polynomial-time 5-approximation algorithm can be obtained by rounding the solutions to the time-indexed LP relaxation.

    On Scheduling and Communication Issues in Data Centers

    The proliferation of datacenters to handle the rapidly growing amount of data being managed in the cloud necessitates the design, management, and effective utilization of the thousands of machines that constitute a datacenter. Many modern big data applications require access to a large number of machines and datasets for training neural nets or for other big data processing. In this thesis, we present research challenges and progress along two fronts. The first challenge addresses the need to schedule communication between machines much more effectively, as several running applications compete for network bandwidth. We address a basic question known as coflow scheduling: optimizing the weighted average completion time of tasks that run across different machines in a datacenter while effectively handling their communication needs. Sometimes we are forced to distribute a task among multiple datacenters for cost or legal reasons. For this case, we also study a related model that addresses the communication needs of tasks that process data in multiple datacenters, handling their communication requirements across a wide area network whose bandwidth and network structure may vary widely across different pairs of machines. The second challenge is from a cloud user's perspective: since access to resources such as those provided by Amazon AWS can be expensive at scale, cloud computing providers often sell underutilized resources at a significant discount via a spot instance market. However, these instances are not dedicated, and while they offer a cheaper alternative, there is a chance that the user's job will be interrupted to make room for higher-priority tasks. Certain non-critical applications are not significantly impacted by delays due to interruptions, and we develop an initial framework to study some basic scheduling questions in this setting.
    In all of these topics, the problems we study are NP-hard, and our focus is on developing good approximation algorithms. In addition, while we attack these problems from a theoretical perspective, all the algorithms developed in this thesis are practical and efficient and can be easily deployed in practice; some are already deployed.

    Subject index volumes 1–92

    Subject Index Volumes 1–200
