Real-Time MapReduce Scheduling
In this paper, we explore the feasibility of scheduling mixed hard and soft real-time MapReduce applications. We first present an experimental evaluation of the popular Hadoop MapReduce middleware on the Amazon EC2 cloud. Our evaluation reveals tradeoffs between overall system throughput and execution time predictability, and highlights a number of factors affecting real-time scheduling, such as data placement, concurrent users, and master scheduling overhead. Based on our evaluation study, we present a formal model for capturing real-time MapReduce applications and the Hadoop platform. Using this model, we formulate the offline scheduling of real-time MapReduce jobs on a heterogeneous distributed Hadoop architecture as a constraint satisfaction problem (CSP) and introduce various search strategies for the formulation. We propose an enhancement of MapReduce’s execution model and a range of heuristic techniques for the online scheduling. We further outline some of our future directions that apply state-of-the-art techniques in the real-time scheduling literature.
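To make the CSP framing concrete, here is a minimal sketch of offline deadline scheduling as a backtracking search. It is not the paper's formulation: the job model (work units and a hard deadline), the node speeds, and the function name `schedule` are all assumptions introduced for illustration, and all jobs are assumed to be released at time zero and to run without preemption.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical job model (not from the paper): (name, work units, hard deadline).
Job = Tuple[str, int, int]


def schedule(
    jobs: List[Job], speeds: Dict[str, int]
) -> Optional[Dict[str, Tuple[str, float]]]:
    """Assign every job to a node so that each finishes by its deadline.

    Jobs placed on the same node run back to back. Returns a mapping
    job name -> (node, finish time), or None if no assignment satisfies
    all deadline constraints.
    """
    # Variable ordering: earliest deadline first.
    order = sorted(jobs, key=lambda job: job[2])
    busy_until = {node: 0.0 for node in speeds}
    assignment: Dict[str, Tuple[str, float]] = {}

    def backtrack(i: int) -> bool:
        if i == len(order):
            return True  # every job has a feasible placement
        name, work, deadline = order[i]
        for node, speed in speeds.items():
            start = busy_until[node]
            finish = start + work / speed
            if finish <= deadline:  # deadline constraint holds on this node
                busy_until[node] = finish
                assignment[name] = (node, finish)
                if backtrack(i + 1):
                    return True
                # Undo the tentative placement and try the next node.
                busy_until[node] = start
                del assignment[name]
        return False

    return dict(assignment) if backtrack(0) else None
```

Calling `schedule([("a", 4, 2), ("b", 4, 4), ("c", 2, 2)], {"fast": 2, "slow": 1})` searches over node assignments and returns one that meets all three deadlines. A job whose work cannot finish on any node before its deadline makes the function return `None`.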
Fair non-monetary scheduling in federated clouds
In a hybrid cloud, individual cloud service providers (CSPs) often have
incentive to use each other's resources to off-load peak loads or place load
closer to the end user. However, CSPs have to keep track of contributions and
gains in order to disincentivize long-term free-riding. We show CloudShare, a
distributed version of a load balancing algorithm DirectCloud based on the
Shapley value---a powerful fairness concept from game theory. CloudShare
coordinates CSPs by a ZooKeeper-based coordination layer; each CSP runs a
broker that interacts with local resources (such as Kubernetes-managed
clusters). We quantitatively evaluate our implementation by simulation. The
results confirm that CloudShare generates, on average, fairer schedules
than the popular FairShare algorithm. We believe our results show a viable
alternative to monetary methods based on, e.g., spot markets.
Comment: Accepted to CrossCloud'18: 5th Workshop on CrossCloud Infrastructures
& Platform
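For readers unfamiliar with the Shapley value, the following sketch computes it exactly by averaging each participant's marginal contribution over all join orders. It does not reproduce CloudShare or DirectCloud: the function name `shapley_values`, the provider names, and the toy value function (capacity served, capped by demand) are assumptions made for illustration. Exact enumeration is only practical for a handful of participants.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, List


def shapley_values(
    players: List[str], value: Callable[[FrozenSet[str]], float]
) -> Dict[str, float]:
    """Exact Shapley values: mean marginal contribution over all join orders."""
    totals = {player: 0.0 for player in players}
    orderings = list(permutations(players))
    for ordering in orderings:
        coalition: FrozenSet[str] = frozenset()
        for player in ordering:
            joined = coalition | {player}
            # Marginal contribution of this player to the coalition so far.
            totals[player] += value(joined) - value(coalition)
            coalition = joined
    return {player: total / len(orderings) for player, total in totals.items()}


# Hypothetical example: three providers with fixed capacities serve a
# total demand of 6 units; a coalition's value is the demand it can serve.
CAPACITY = {"A": 4, "B": 2, "C": 2}
DEMAND = 6


def served(coalition: FrozenSet[str]) -> float:
    return float(min(sum(CAPACITY[p] for p in coalition), DEMAND))


shares = shapley_values(list(CAPACITY), served)
```

In this toy setting the shares sum to the value of the full coalition, and the two providers with equal capacity receive equal shares, which are the efficiency and symmetry properties that make the Shapley value attractive as a fairness measure.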
Solving the Batch Stochastic Bin Packing Problem in Cloud: A Chance-constrained Optimization Approach
This paper investigates a critical resource allocation problem in the
first-party cloud: scheduling containers to machines. There are tens of services and
each service runs a set of homogeneous containers with dynamic resource usage;
containers of a service are scheduled daily in a batch fashion. This problem
can be naturally formulated as a Stochastic Bin Packing Problem (SBPP). However,
traditional SBPP research often focuses on cases of empty machines, whose
objective, i.e., to minimize the number of used machines, is not well-defined
for the more common reality with nonempty machines. This paper aims to close
this gap. First, we define a new objective metric, Used Capacity at Confidence
(UCaC), which measures the maximum used resources at a probability and is
proved to be consistent for both empty and nonempty machines, and reformulate
the SBPP under chance constraints. Second, by modeling the container resource
usage distribution in a generative approach, we reveal that UCaC can be
approximated with a Gaussian distribution, which is verified by trace data of real-world
applications. Third, we propose an exact solver by solving the equivalent
cutting stock variant as well as two heuristics-based solvers: UCaC best fit and
bi-level heuristics. We experimentally evaluate these solvers on both synthetic
datasets and real application traces, demonstrating our methodology's advantage
over a traditional optimal SBPP solver that minimizes the number of used machines,
with a low rate of resource violations.
Comment: To appear in SIGKDD 2022 as Research Track paper
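The Gaussian approximation mentioned in the abstract can be illustrated with a short sketch. The abstract does not give the exact definition of UCaC, so this is an assumption-laden reading: each container's usage is summarized by a mean and a standard deviation, usages are treated as independent, and the used capacity at a confidence level is taken as the corresponding quantile of the summed Gaussian. The function names `ucac_gaussian` and `fits` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import Iterable, Tuple


def ucac_gaussian(
    containers: Iterable[Tuple[float, float]], confidence: float
) -> float:
    """Quantile of total usage on one machine under a Gaussian approximation.

    containers: (mean, standard deviation) of each container's usage.
    Assumes independent usages, so means and variances add.
    """
    items = list(containers)
    mean = sum(m for m, _ in items)
    std = sqrt(sum(s * s for _, s in items))
    # z-score of the requested confidence level, e.g. about 2.33 for 0.99.
    z = NormalDist().inv_cdf(confidence)
    return mean + z * std


def fits(
    containers: Iterable[Tuple[float, float]], capacity: float, confidence: float
) -> bool:
    """Chance constraint: usage stays within capacity with the given probability."""
    return ucac_gaussian(containers, confidence) <= capacity
```

Under these assumptions, a machine hosting containers with usages `(1.0, 0.3)` and `(2.0, 0.4)` has a mean load of 3.0 and a standard deviation of 0.5, so raising the confidence level raises the capacity that must be reserved for it.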
Broker-mediated Multiple-Cloud Orchestration Mechanisms for Cloud Computing
Ph.D (Doctor of Philosophy)