Spark on Entropy: A Reliable & Efficient Scheduler for Low-latency Parallel Jobs in Heterogeneous Cloud
In a heterogeneous cloud, providing quality of
service (QoS) guarantees for on-line parallel analysis jobs is much
more challenging than for off-line ones, mainly because of the many
parameters involved, unstable resource performance, varied job
patterns, and dynamic query workloads. In this paper we propose
an entropy-based scheduling strategy that makes running on-line
parallel analysis as a service more reliable and efficient, and
we implement the proposed idea in Spark.
Entropy, as a measure of the degree of disorder in a system,
is an indicator of a system’s tendency to progress out of order
and into a chaotic condition, and it can thus serve to measure a
cloud resource’s reliability for job scheduling. The key idea of
our Entropy Scheduler is to construct a new resource entropy
metric and to schedule tasks according to a ranking of resources
by that metric, so as to provide QoS guarantees for
on-line Spark analysis jobs. Experiments demonstrate that our
approach reduces the average query response time
by 15% - 20% and its standard deviation by 30% - 45% compared
with the native Fair Scheduler in Spark.
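
The ranking idea can be illustrated with a minimal sketch. The abstract does not define the resource entropy metric, so the code below assumes, purely for illustration, the Shannon entropy of each resource's recent task durations after grouping them into fixed-width bins; the function names, the bin width, and the sample data are all hypothetical and are not taken from the paper or from Spark's API. A resource whose durations cluster in few bins gets low entropy and is ranked first.

```python
import math
from collections import Counter


def shannon_entropy(samples, bin_width=0.5):
    """Shannon entropy (in bits) of samples grouped into fixed-width bins.

    Assumption: this stands in for the paper's resource entropy metric,
    which the abstract does not specify.
    """
    if not samples:
        raise ValueError("need at least one sample")
    bins = Counter(math.floor(s / bin_width) for s in samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in bins.values())


def rank_resources(history, bin_width=0.5):
    """Order resource names from lowest entropy (most stable) to highest."""
    return sorted(history, key=lambda name: shannon_entropy(history[name], bin_width))


# Hypothetical task durations in seconds observed on three workers.
history = {
    "node-a": [1.0, 1.1, 1.0, 1.2],  # all in one bin: entropy 0 bits
    "node-b": [0.4, 1.3, 2.6, 3.9],  # four distinct bins: entropy 2 bits
    "node-c": [1.0, 1.0, 2.1, 2.2],  # two equally used bins: entropy 1 bit
}

print(rank_resources(history))  # ['node-a', 'node-c', 'node-b']
```

A scheduler following this sketch would offer tasks to resources in the printed order. The choice of bin width changes the ranking, so in practice it would need to be tuned to the latency scale of the workload.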