
    Spark on Entropy: A Reliable & Efficient Scheduler for Low-latency Parallel Jobs in Heterogeneous Cloud

    In heterogeneous clouds, providing quality of service (QoS) guarantees for on-line parallel analysis jobs is much more challenging than for off-line ones, mainly because of the many parameters involved, unstable resource performance, varied job patterns, and dynamic query workloads. In this paper we propose an entropy-based scheduling strategy that makes running on-line parallel analysis as a service more reliable and efficient, and we implement the proposed idea in Spark. Entropy, as a measure of the degree of disorder in a system, indicates a system’s tendency to move out of order and into a chaotic condition, and it can thus serve to measure a cloud resource’s reliability for job scheduling. The key idea of our Entropy Scheduler is to construct a new resource entropy metric and to schedule tasks according to the resource ranking derived from that metric, so as to provide QoS guarantees for on-line Spark analysis jobs. Experiments demonstrate that our approach significantly reduces the average query response time by 15% to 20% and its standard deviation by 30% to 45% compared with the native Fair Scheduler in Spark.
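The abstract does not define the resource entropy metric, so the following is only a minimal sketch of the general idea of ranking resources by how disordered their observed behaviour is. As an assumed stand-in, it uses Shannon entropy over a histogram of recent task durations per worker. The names `duration_entropy` and `rank_workers`, the worker labels, the sample durations, and the 0.5-second bin width are all hypothetical. None of this is the paper's actual metric or Spark's scheduler API.

```python
import math
from collections import Counter


def duration_entropy(durations, bin_width=0.5):
    """Shannon entropy (in bits) of task durations grouped into fixed-width bins.

    Assumption: a worker whose task durations fall into few bins is treated
    as more predictable, and therefore more reliable, than one whose
    durations are spread across many bins.
    """
    if not durations:
        raise ValueError("need at least one observed duration")
    bins = Counter(int(d // bin_width) for d in durations)
    total = len(durations)
    return sum((c / total) * math.log2(total / c) for c in bins.values())


def rank_workers(history, bin_width=0.5):
    """Order worker names from lowest entropy (most stable) to highest."""
    return sorted(history, key=lambda w: duration_entropy(history[w], bin_width))


# Hypothetical per-worker task durations in seconds.
history = {
    "worker-a": [1.0, 1.1, 1.0, 1.2],  # all samples land in one bin
    "worker-b": [0.4, 2.3, 1.1, 3.9],  # samples spread over four bins
    "worker-c": [1.0, 1.1, 1.6, 1.7],  # samples split evenly over two bins
}

ranking = rank_workers(history)
print(ranking)
```

A scheduler built on such a ranking would offer latency-sensitive tasks to the workers at the front of the list first. How the real Entropy Scheduler combines its metric with Spark's task placement is not stated in the abstract.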