Bulk Scheduling with the DIANA Scheduler

Author: Ali Arshad
Anjum Ashiq
McClatchey Richard
Willers Ian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/08/2006

Results from the research and development of a Data Intensive and Network Aware (DIANA) scheduling engine, to be used primarily for data intensive sciences such as physics analysis, are described. In Grid analyses, tasks can involve thousands of computing, data handling, and network resources. The central problem in the scheduling of these resources is the coordinated management of computation and data at multiple locations and not just data replication or movement. However, this can prove to be a rather costly operation and efficient scheduling can be a challenge if compute and data resources are mapped without considering network costs. We have implemented an adaptive algorithm within the so-called DIANA Scheduler which takes into account data location and size, network performance and computation capability in order to enable efficient global scheduling. DIANA is a performance-aware and economy-guided Meta Scheduler. It iteratively allocates each job to the site that is most likely to produce the best performance as well as optimizing the global queue for any remaining jobs. Therefore it is equally suitable whether a single job is being submitted or bulk scheduling is being performed. Results indicate that considerable performance improvements can be gained by adopting the DIANA scheduling approach. Comment: 12 pages, 11 figures. To be published in the IEEE Transactions in Nuclear Science, IEEE Press. 200

arXiv.org e-Print Archive

Crossref

UWE Bristol Research Repository
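
The abstract of this record describes allocating each job to the site expected to perform best, given data size, network performance and compute capability. The following is a minimal sketch of that general idea only, not the DIANA algorithm itself: the `Site` and `Job` classes, their fields, and the additive cost model (queue wait + transfer time + compute time) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    cpu_rate: float              # work units per second (hypothetical unit)
    bandwidth: float             # MB/s from the data source to this site (hypothetical)
    queued_seconds: float = 0.0  # estimated time of work already queued here


@dataclass
class Job:
    name: str
    work: float     # work units to compute
    data_mb: float  # input data that must be moved to the site


def service_time(job: Job, site: Site) -> float:
    """Transfer time plus compute time for one job at one site."""
    return job.data_mb / site.bandwidth + job.work / site.cpu_rate


def estimated_cost(job: Job, site: Site) -> float:
    """Assumed cost model: wait for the queue, then transfer and compute."""
    return site.queued_seconds + service_time(job, site)


def schedule(jobs: list[Job], sites: list[Site]) -> dict[str, str]:
    """Greedily place each job on the currently cheapest site."""
    placement = {}
    for job in jobs:
        best = min(sites, key=lambda s: estimated_cost(job, s))
        placement[job.name] = best.name
        # Later jobs see the extra load this placement adds to the site.
        best.queued_seconds += service_time(job, best)
    return placement
```

In this sketch a data-heavy job gravitates to the well-connected site and a compute-heavy job to the faster site, because the queue estimate is updated after every placement.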

DIANA Scheduling Hierarchies for Optimizing Bulk Job Scheduling

Author: Ali A.
Alvi O.
Anjum A.
Hasham K.
McClatchey R.
Sagheer M.
Stockinger H.
Thomas M.
Willers I.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2006

The use of meta-schedulers for resource management in large-scale distributed systems often leads to a hierarchy of schedulers. In this paper, we discuss why existing meta-scheduling hierarchies are sometimes not sufficient for Grid systems due to their inability to re-organise jobs already scheduled locally. Such a job re-organisation is required to adapt to evolving loads which are common in heavily used Grid infrastructures. We propose a peer-to-peer scheduling model and evaluate it using case studies and mathematical modelling. We detail the DIANA (Data Intensive and Network Aware) scheduling algorithm and its queue management system for coping with the load distribution and for supporting bulk job scheduling. We demonstrate that such a system is beneficial for dynamic, distributed and self-organizing resource management and can assist in optimizing load or job distribution in complex Grid infrastructures. Comment: 8 pages, 9 figures. Presented at the 2nd IEEE Int Conference on eScience & Grid Computing. Amsterdam Netherlands, December 200

arXiv.org e-Print Archive

Crossref

Caltech Authors

DIANA Scheduling Hierarchies for Optimizing Bulk Job Scheduling

Author: Ali Arshad
Alvi Omer
Anjum Ashiq
Hasham Khawar
McClatchey Richard
Sagheer Muhammad
Stockinger Heinz
Thomas Michael
Willers Ian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2006

The use of meta-schedulers for resource management in large-scale distributed systems often leads to a hierarchy of schedulers. In this paper, we discuss why existing meta-scheduling hierarchies are sometimes not sufficient for Grid systems due to their inability to re-organise jobs already scheduled locally. Such a job re-organisation is required to adapt to evolving loads which are common in heavily used Grid infrastructures. We propose a peer-to-peer scheduling model and evaluate it using case studies and mathematical modelling. We detail the DIANA (Data Intensive and Network Aware) scheduling algorithm and its queue management system for coping with the load distribution and for supporting bulk job scheduling. We demonstrate that such a system is beneficial for dynamic, distributed and self-organizing resource management and can assist in optimizing load or job distribution in complex Grid infrastructures.

HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges

Author: Buyya Rajkumar
Calheiros Rodrigo N.
Cunha Renato L. F.
Netto Marco A. S.
Rodrigues Eduardo R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

High Performance Computing (HPC) clouds are becoming an alternative to on-premise clusters for executing scientific applications and business analytics services. Most research efforts in HPC cloud aim to understand the cost-benefit of moving resource-intensive applications from on-premise environments to public cloud platforms. Industry trends show hybrid environments are the natural path to get the best of the on-premise and cloud resources: steady (and sensitive) workloads can run on on-premise resources and peak demand can leverage remote resources in a pay-as-you-go manner. Nevertheless, there are plenty of questions to be answered in HPC cloud, which range from how to extract the best performance of an unknown underlying platform to what services are essential to make its usage easier. Moreover, the discussion on the right pricing and contractual models to fit small and large users is relevant for the sustainability of HPC clouds. This paper presents a survey and taxonomy of efforts in HPC cloud and a vision of what we believe is ahead of us, including a set of research challenges that, once tackled, can help advance businesses and scientific discoveries. This becomes particularly relevant due to the fast-increasing wave of new HPC applications coming from big data and artificial intelligence. Comment: 29 pages, 5 figures, Published in ACM Computing Surveys (CSUR)

arXiv.org e-Print Archive

Western Sydney ResearchDirect

Towards In-Transit Analytics for Industry 4.0

Author: Ali Muhammad
Anjum Ashiq
Devitt James
Hill Richard
Publication date: 01/06/2017

Industry 4.0, or Digital Manufacturing, is a vision of inter-connected services to facilitate innovation in the manufacturing sector. A fundamental requirement of innovation is the ability to visualise manufacturing data, in order to discover new insights for increased competitive advantage. This article describes the enabling technologies that facilitate In-Transit Analytics, which is a necessary precursor for Industrial Internet of Things (IIoT) visualisation. Comment: 8 pages, 10th IEEE International Conference on Internet of Things (iThings-2017), Exeter, UK, 201

arXiv.org e-Print Archive

Crossref

Huddersfield Research Portal

Passive network awareness as a means for improved grid scheduling

Author: Edwards Chris
Elkhatib Yehia
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/04/2015

Grids enable sharing resources of heterogeneous nature and administration. In such distributed systems, the network is usually taken for granted, which is potentially problematic due to the complexity and unpredictability of public networks that typically underlie grids. This article introduces GridMAP, a mechanism for considering the network state for enhancing grid scheduling. Network measurements are collected in a passive manner from a user-centric vantage point. This mechanism has been evaluated on a production e-science grid infrastructure, with results showing the ability of GridMAP to improve grid scheduling with minimal network, computational and deployment overheads.

Crossref

Lancaster E-Prints

A queueing theory approach to Pareto-optimal bags-of-tasks scheduling on clouds

Author: Mei R.D. (Rob) van der
Publication date: 01/08/2014

Cloud hosting services offer computing resources which can scale along with the needs of users. When access to data is limited by the network capacity, this scalability also becomes limited. To investigate the impact of this limitation we focus on bags-of-tasks where task data is stored outside the cloud and has to be transferred across the network before task execution can commence. The existing bags-of-tasks estimation tools are not able to provide accurate estimates in such a case. We introduce a queuing-network-inspired model which successfully models the limited network resources. Based on the Mean-Value Analysis of this model we derive an efficient procedure that results in an estimate of the makespan and the execution costs for a given configuration of cloud virtual machines. We compare the calculated Pareto set with measurements performed in a number of experiments for real-world bags-of-tasks and validate the proposed model and the accuracy of the estimated configurations.

CWI's Institutional Repository
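
The abstract of this record refers to a Pareto set over makespan and execution cost for candidate virtual-machine configurations. As a rough illustration of what such a set is (not the paper's estimation procedure, which relies on Mean-Value Analysis), the sketch below filters a list of already-estimated configurations down to the non-dominated ones. The function name `pareto_front`, the tuple layout, and the sample numbers in the usage note are hypothetical.

```python
def pareto_front(configs):
    """Return the non-dominated (label, makespan, cost) tuples.

    Lower is better for both makespan and cost. A configuration is kept
    only if no other configuration is at least as good on both criteria
    and strictly better on one. Exact duplicates are collapsed.
    """
    front = []
    # Sort by makespan, breaking ties by cost, so a single sweep suffices.
    for label, makespan, cost in sorted(configs, key=lambda c: (c[1], c[2])):
        # Everything already kept is at least as fast, so a new entry
        # is non-dominated only if it is strictly cheaper than the last.
        if not front or cost < front[-1][2]:
            front.append((label, makespan, cost))
    return front
```

For example, given the made-up estimates `[("1vm", 100, 1.0), ("2vm", 60, 1.2), ("3vm", 60, 1.5), ("4vm", 40, 2.4)]`, the configuration `"3vm"` is dropped because `"2vm"` reaches the same makespan at lower cost.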