981 research outputs found

    Resource Management for Data Intensive Tasks on Grids

    Get PDF

    Optimization and Job Scheduling in Heterogeneous Networks

    Get PDF
    The final publication is available at www.springerlink.comA heterogeneous network is a connected network of different platforms and operating systems. Job scheduling is a problem of selecting a free resource for unexecuted task from a pool of submitted tasks. Furthermore, it is required to find for every resource the best order of the tasks assigned to it. The purpose of this paper is to develop an efficient algorithm for job scheduling in heterogeneous networks. The algorithm should include parameters such as properties of resources and properties of jobs. The algorithm includes a cost function that is required to be optimized which includes parameters such as the total processing time, average waiting time. Our results demonstrate that the proposed algoritghm can be efficiently used to determine the performance of different job scheduling algorithms under different sets of loads.http://link.springer.com/chapter/10.1007%2F978-90-481-3662-9_4

    Low latency fast data computation scheme for map reduce based clusters

    Get PDF
    MapReduce based clusters is an emerging paradigm for big data analytics to scale up and speed up the big data classification, investigation, and processing of the huge volumes, massive and complex data sets. One of the fundamental issues of processing the data in MapReduce clusters is to deal with resource heterogeneity, especially when there is data inter-dependency among the tasks. Secondly, MapReduce runs a job in many phases; the intermediate data traffic and its migration time become a major bottleneck for the computation of jobs which produces a huge intermediate data in the shuffle phase. Further, encountering factors to monitor the critical issue of straggling is necessary because it produces unnecessary delays and poses a serious constraint on the overall performance of the system. Thus, this research aims to provide a low latency fast data computation scheme which introduces three algorithms to handle interdependent task computation among heterogeneous resources, reducing intermediate data traffic with its migration time and monitoring and modelling job straggling factors. This research has developed a Low Latency and Computational Cost based Tasks Scheduling (LLCC-TS) algorithm of interdependent tasks on heterogeneous resources by encountering priority to provide cost-effective resource utilization and reduced makespan. Furthermore, an Aggregation and Partition based Accelerated Intermediate Data Migration (APAIDM) algorithm has been presented to reduce the intermediate data traffic and data migration time in the shuffle phase by using aggregators and custom partitioner. Moreover, MapReduce Total Execution Time Prediction (MTETP) scheme for MapReduce job computation with inclusion of the factors which affect the job computation time has been produced using machine learning technique (linear regression) in order to monitor the job straggling and minimize the latency. LLCCTS algorithm has 66.13%, 22.23%, 43.53%, and 44.74% performance improvement rate over FIFO, improved max-min, SJF and MOS algorithms respectively for makespan time of scheduling of interdependent tasks. The AP-AIDM algorithm scored 66.62% and 48.4% performance improvements in reducing the data migration time over hash basic and conventional aggregation algorithms, respectively. Moreover, an MTETP technique shows the performance improvement in predicting the total job execution time with 20.42% accuracy than the improved HP technique. Thus, the combination of the three algorithms mentioned above provides a low latency fast data computation scheme for MapReduce based clusters

    Programming Heterogeneous Parallel Machines Using Refactoring and Monte-Carlo Tree Search

    Get PDF
    Funding: This work was supported by the EU Horizon 2020 project, TeamPlay, Grant Number 779882, and UK EPSRC Discovery, Grant Number EP/P020631/1.This paper presents a new technique for introducing and tuning parallelism for heterogeneous shared-memory systems (comprising a mixture of CPUs and GPUs), using a combination of algorithmic skeletons (such as farms and pipelines), Monte–Carlo tree search for deriving mappings of tasks to available hardware resources, and refactoring tool support for applying the patterns and mappings in an easy and effective way. Using our approach, we demonstrate easily obtainable, significant and scalable speedups on a number of case studies showing speedups of up to 41 over the sequential code on a 24-core machine with one GPU. We also demonstrate that the speedups obtained by mappings derived by the MCTS algorithm are within 5–15% of the best-obtained manual parallelisation.Publisher PDFPeer reviewe

    Workload Schedulers - Genesis, Algorithms and Comparisons

    Get PDF
    In this article we provide brief descriptions of three classes of schedulers: Operating Systems Process Schedulers, Cluster Systems, Jobs Schedulers and Big Data Schedulers. We describe their evolution from early adoptions to modern implementations, considering both the use and features of algorithms. In summary, we discuss differences between all presented classes of schedulers and discuss their chronological development. In conclusion, we highlight similarities in the focus of scheduling strategies design, applicable to both local and distributed systems
    corecore