11,411 research outputs found
Managing Uncertainty: A Case for Probabilistic Grid Scheduling
The Grid technology is evolving into a global, service-orientated
architecture, a universal platform for delivering future high demand
computational services. Strong adoption of the Grid and the utility computing
concept is leading to an increasing number of Grid installations running a wide
range of applications of different size and complexity. In this paper we
address the problem of elivering deadline/economy based scheduling in a
heterogeneous application environment using statistical properties of job
historical executions and its associated meta-data. This approach is motivated
by a study of six-month computational load generated by Grid applications in a
multi-purpose Grid cluster serving a community of twenty e-Science projects.
The observed job statistics, resource utilisation and user behaviour is
discussed in the context of management approaches and models most suitable for
supporting a probabilistic and autonomous scheduling architecture
Distributed data mining in grid computing environments
The official published version of this article can be found at the link below.The computing-intensive data mining for inherently Internet-wide distributed data, referred to as Distributed Data Mining (DDM), calls for the support of a powerful Grid with an effective scheduling framework. DDM often shares the computing paradigm of local processing and global synthesizing. It involves every phase of Data Mining (DM) processes, which makes the workflow of DDM very complex and can be modelled only by a Directed Acyclic Graph (DAG) with multiple data entries. Motivated by the need for a practical solution of the Grid scheduling problem for the DDM workflow, this paper proposes a novel two-phase scheduling framework, including External Scheduling and Internal Scheduling, on a two-level Grid architecture (InterGrid, IntraGrid). Currently a DM IntraGrid, named DMGCE (Data Mining Grid Computing Environment), has been developed with a dynamic scheduling framework for competitive DAGs in a heterogeneous computing environment. This system is implemented in an established Multi-Agent System (MAS) environment, in which the reuse of existing DM algorithms is achieved by encapsulating them into agents. Practical classification problems from oil well logging analysis are used to measure the system performance. The detailed experiment procedure and result analysis are also discussed in this paper
A WOA-based optimization approach for task scheduling in cloud Computing systems
Task scheduling in cloud computing can directly
affect the resource usage and operational cost of a system. To
improve the efficiency of task executions in a cloud, various
metaheuristic algorithms, as well as their variations, have been
proposed to optimize the scheduling. In this work, for the
first time, we apply the latest metaheuristics WOA (the whale
optimization algorithm) for cloud task scheduling with a multiobjective optimization model, aiming at improving the performance of a cloud system with given computing resources. On that
basis, we propose an advanced approach called IWC (Improved
WOA for Cloud task scheduling) to further improve the optimal
solution search capability of the WOA-based method. We present
the detailed implementation of IWC and our simulation-based
experiments show that the proposed IWC has better convergence
speed and accuracy in searching for the optimal task scheduling
plans, compared to the current metaheuristic algorithms. Moreover, it can also achieve better performance on system resource
utilization, in the presence of both small and large-scale tasks
Workflow Partitioning and Deployment on the Cloud using Orchestra
Orchestrating service-oriented workflows is typically based on a design model
that routes both data and control through a single point - the centralised
workflow engine. This causes scalability problems that include the unnecessary
consumption of the network bandwidth, high latency in transmitting data between
the services, and performance bottlenecks. These problems are highly prominent
when orchestrating workflows that are composed from services dispersed across
distant geographical locations. This paper presents a novel workflow
partitioning approach, which attempts to improve the scalability of
orchestrating large-scale workflows. It permits the workflow computation to be
moved towards the services providing the data in order to garner optimal
performance results. This is achieved by decomposing the workflow into smaller
sub workflows for parallel execution, and determining the most appropriate
network locations to which these sub workflows are transmitted and subsequently
executed. This paper demonstrates the efficiency of our approach using a set of
experimental workflows that are orchestrated over Amazon EC2 and across several
geographic network regions.Comment: To appear in Proceedings of the IEEE/ACM 7th International Conference
on Utility and Cloud Computing (UCC 2014
- …