Search CORE

2,977 research outputs found

Distributed data mining in grid computing environments

Author: Cannataro
Cannataro
Fernandez-Baca
Kevin Lü
Kwok
Ping Luo
Qing He
Zhongzhi Shi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007
Field of study

The official published version of this article can be found at the link below.The computing-intensive data mining for inherently Internet-wide distributed data, referred to as Distributed Data Mining (DDM), calls for the support of a powerful Grid with an effective scheduling framework. DDM often shares the computing paradigm of local processing and global synthesizing. It involves every phase of Data Mining (DM) processes, which makes the workflow of DDM very complex and can be modelled only by a Directed Acyclic Graph (DAG) with multiple data entries. Motivated by the need for a practical solution of the Grid scheduling problem for the DDM workflow, this paper proposes a novel two-phase scheduling framework, including External Scheduling and Internal Scheduling, on a two-level Grid architecture (InterGrid, IntraGrid). Currently a DM IntraGrid, named DMGCE (Data Mining Grid Computing Environment), has been developed with a dynamic scheduling framework for competitive DAGs in a heterogeneous computing environment. This system is implemented in an established Multi-Agent System (MAS) environment, in which the reuse of existing DM algorithms is achieved by encapsulating them into agents. Practical classification problems from oil well logging analysis are used to measure the system performance. The detailed experiment procedure and result analysis are also discussed in this paper

Crossref

Brunel University Research Archive

Cost-Efficient Scheduling for Deadline Constrained Grid Workflows

Author: Dehlaghi-Ghadim Alireza
Entezari-Maleki Reza
Movaghar Ali
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 07/11/2018
Field of study

Cost optimization for workflow scheduling while meeting deadline is one of the fundamental problems in utility computing. In this paper, a two-phase cost-efficient scheduling algorithm called critical chain is presented. The proposed algorithm uses the concept of slack time in both phases. The first phase is deadline distribution over all tasks existing in the workflow which is done considering critical path properties of workflow graphs. Critical chain uses slack time to iteratively select most critical sequence of tasks and then assigns sub-deadlines to those tasks. In the second phase named mapping step, it tries to allocate a server to each task considering task's sub-deadline. In the mapping step, slack time priority in selecting ready task is used to reduce deadline violation. Furthermore, the algorithm tries to locally optimize the computation and communication costs of sequential tasks exploiting dynamic programming. After proposing the scheduling algorithm, three measures for the superiority of a scheduling algorithm are introduced, and the proposed algorithm is compared with other existing algorithms considering the measures. Results obtained from simulating various systems show that the proposed algorithm outperforms four well-known existing workflow scheduling algorithms

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Workflow Scheduling Techniques and Algorithms in IaaS Cloud: A Survey

Author: Chakravarthi K. Kalyana
Vijayakumar Vaidhehi
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/04/2018
Field of study

In the modern era, workflows are adopted as a powerful and attractive paradigm for expressing/solving a variety of applications like scientific, data intensive computing, and big data applications such as MapReduce and Hadoop. These complex applications are described using high-level representations in workflow methods. With the emerging model of cloud computing technology, scheduling in the cloud becomes the important research topic. Consequently, workflow scheduling problem has been studied extensively over the past few years, from homogeneous clusters, grids to the most recent paradigm, cloud computing. The challenges that need to be addressed lies in task-resource mapping, QoS requirements, resource provisioning, performance fluctuation, failure handling, resource scheduling, and data storage. This work focuses on the complete study of the resource provisioning and scheduling algorithms in cloud environment focusing on Infrastructure as a service (IaaS). We provided a comprehensive understanding of existing scheduling techniques and provided an insight into research challenges that will be a possible future direction to the researchers

IAES journal

Crossref

Institute of Advanced Engineering and Science

The Contemporary Affirmation of Taxonomy and Recent Literature on Workflow Scheduling and Management in Cloud Computing

Author: V. Murali Mohan
Publication venue: Global Journals Inc. (US)
Publication date: 29/02/2016
Field of study

The Cloud computing systemspreferred over the traditional forms of computing such as grid computing, utility computing, autonomic computing is attributed forits ease of access to computing, for its QoS preferences, SLA2019;s conformity, security and performance offered with minimal supervision. A cloud workflow schedule when designed efficiently achieves optimalre source sage, balance of workloads, deadline specific execution, cost control according to budget specifications, efficient consumption of energy etc. to meet the performance requirements of today2019; svast scientific and business requirements. The businesses requirements under recent technologies like pervasive computing are motivating the technology of cloud computing for further advancements. In this paper we discuss some of the important literature published on cloud workflow scheduling

Global Journal of Computer Science and Technology (GJCST)

Dual-phase just-in-time workflow scheduling in P2P grid systems

Author: Di S
Wang CL
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010
Field of study

This paper presents a fully decentralized justin-time workflow scheduling method in a P2P Grid system. The proposed solution allows each peer node to autonomously dispatch inter-dependent tasks of workflows to run on geographically distributed computers. To reduce the workflow completion time and enhance the overall execution efficiency, not only does each node perform as a scheduler to distribute its tasks to execution nodes (or resource nodes), but the resource nodes will also set the execution priorities for the received tasks. By taking into account the unpredictability of tasks' finish time, we devise an efficient task scheduling heuristic, namely dynamic shortest makespan first (DSMF), which could be applied at both scheduling phases for determining the priority of the workflow tasks. We compare the performance of the proposed algorithm against seven other heuristics by simulation. Our algorithm achieves 20%~60% reduction on the average completion time and 37.5%~90% improvement on the average workflow execution efficiency over other decentralized algorithms. © 2010 IEEE.published_or_final_versionProcessing (ICPP 2010), San Diego, CA., 13-16 September 2010. In Proceedings of the 39th ICCP, 2010, p. 238-24

HKU Scholars Hub