5 research outputs found

    A Pretreatment Workflow Scheduling Approach for Big Data Applications in Multicloud Environments

    The rapid development of the latest distributed computing paradigm, cloud computing, has generated a highly fragmented cloud market composed of numerous cloud providers and offers tremendous parallel computing ability for handling big data problems. One of the biggest challenges in multiclouds is efficient workflow scheduling. Although the workflow scheduling problem has been studied extensively, few prior works are tailored for multicloud environments. Moreover, the existing works either fail to satisfy quality of service (QoS) requirements or ignore fundamental features of cloud computing such as the heterogeneity and elasticity of computing resources. In this paper, a scheduling algorithm for big data workflows in multiclouds, called multiclouds partial critical paths with pretreatment (MCPCPP), is presented. This algorithm incorporates the concept of partial critical paths and aims to minimize the execution cost of a workflow while satisfying a defined deadline constraint. Our approach takes into consideration the essential characteristics of multiclouds, such as charging per time interval, various instance types from different cloud providers, and homogeneous intra-cloud vs. heterogeneous inter-cloud bandwidth. Various types of workflows are used for evaluation purposes, and our experimental results show that MCPCPP is promising.
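    The core move in partial-critical-path scheduling is to walk back from a deadline-constrained task through its most critical unassigned parents, then schedule that whole path onto the cheapest service that meets its sub-deadline. Below is a minimal sketch of the extraction step only, with hypothetical task names and estimated finish times; it is not the authors' MCPCPP code.

```python
# Hypothetical sketch of partial critical path (PCP) extraction on a
# workflow DAG; task names and finish-time estimates are illustrative.

def extract_pcp(task, parents, est_finish, assigned):
    """Walk back from `task` through its most critical unassigned parent."""
    path = []
    current = task
    while True:
        # Pick the unassigned parent whose estimated finish time is latest.
        candidates = [p for p in parents.get(current, []) if p not in assigned]
        if not candidates:
            break
        critical = max(candidates, key=lambda p: est_finish[p])
        path.insert(0, critical)
        assigned.add(critical)
        current = critical
    return path

# Toy workflow: t3 depends on t1 and t2; t1 depends on t0.
parents = {"t3": ["t1", "t2"], "t1": ["t0"]}
est_finish = {"t0": 4.0, "t1": 9.0, "t2": 6.0, "t3": 12.0}
print(extract_pcp("t3", parents, est_finish, assigned={"t3"}))  # ['t0', 't1']
```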

    Budget-aware scheduling algorithm for scientific workflow applications across multiple clouds: A Mathematical Optimization-Based Approach

    Scientific workflows have become a prevailing means of achieving significant scientific advances at an ever-increasing rate. Scheduling mechanisms and approaches are vital to automating these large-scale scientific workflows efficiently. Furthermore, with the advent of cloud computing and its easier availability and lower cost of use, more attention has been paid to the execution and scheduling of scientific workflows in this new paradigm. For scheduling large-scale workflows, a multi-cloud environment typically offers a greater variety of computing resources than a single cloud provider, and the scheduling makespan and cost can be reduced if those resources are used optimally. Accordingly, this thesis addresses the problem of scientific workflow scheduling in the multi-cloud environment under budget constraints, with the goal of minimizing the associated makespan. This study also seeks to minimize costs, including the fees for running VMs and for data transfer, to minimize data transfer time, and to satisfy budget and resource constraints in the multi-cloud scenario. To this end, we propose Mixed-Integer Linear Programming (MILP) models that can be solved in reasonable time by available solvers. We divide the workflow tasks into small segments, distribute them among VMs with multiple vCPUs, and formulate the problem as a mathematical program in which the objective and the real, physical constraints are expressed as exact mathematical functions. We analyze the behavior of the optimal makespan under variations in budget, workflow size, and segment size. The evaluation results show that the proposed approach meets the set objectives.
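    To make the MILP framing concrete, here is a toy budget-constrained assignment model written with the PuLP solver interface. The VM types, prices, work estimates, and budget are all illustrative assumptions, and the model is far simpler than the thesis's formulation (no precedence, segmentation, or data-transfer terms).

```python
# Toy budget-constrained workflow-scheduling MILP in the spirit of the
# abstract, using PuLP (pip install pulp). Data and model are illustrative.
import pulp

tasks = ["t1", "t2", "t3"]
vms = {"small": {"speed": 1.0, "price": 1.0},
       "large": {"speed": 2.0, "price": 3.0}}
work = {"t1": 4.0, "t2": 6.0, "t3": 2.0}   # abstract work units per task
budget = 14.0                               # hypothetical budget cap

prob = pulp.LpProblem("workflow_scheduling", pulp.LpMinimize)
x = pulp.LpVariable.dicts("assign", (tasks, list(vms)), cat="Binary")
makespan = pulp.LpVariable("makespan", lowBound=0)

prob += makespan  # objective: minimize makespan
for t in tasks:   # each task runs on exactly one VM type
    prob += pulp.lpSum(x[t][v] for v in vms) == 1
for v in vms:     # a VM's (sequential) load bounds the makespan
    prob += pulp.lpSum(x[t][v] * work[t] / vms[v]["speed"]
                       for t in tasks) <= makespan
# cost: price per unit of busy time, capped by the budget
prob += pulp.lpSum(x[t][v] * work[t] / vms[v]["speed"] * vms[v]["price"]
                   for t in tasks for v in vms) <= budget

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for t in tasks:
    chosen = next(v for v in vms if pulp.value(x[t][v]) > 0.5)
    print(t, "->", chosen)
print("makespan:", pulp.value(makespan))
```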

    Scalable discovery of hybrid process models in a cloud computing environment

    Process descriptions are used to create products and deliver services. To enable better processes and services, the first step is to learn a process model. Process discovery is a technique that can automatically extract process models from event logs. Although various discovery techniques have been proposed, they focus on either constructing formal models, which are very powerful but complex, or creating informal models, which are intuitive but lack semantics. In this work, we introduce a novel method that returns hybrid process models to bridge this gap. Moreover, to cope with today’s big event logs, we propose an efficient method, called f-HMD, that aims at scalable hybrid model discovery in a cloud computing environment. We present a detailed implementation of our approach over the Spark framework, and our experimental results demonstrate that the proposed method is efficient and scalable.
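    One building block of discovery that parallelizes naturally on Spark is counting directly-follows pairs in an event log and splitting them into strong (formal) and weak (informal) edges by frequency. The PySpark sketch below shows that idea; the log, names, and threshold rule are assumptions for illustration, not the f-HMD algorithm itself.

```python
# Minimal Spark sketch: count directly-follows pairs in an event log and
# split them by a frequency threshold. Illustrative only, not f-HMD.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dfg-sketch").getOrCreate()
sc = spark.sparkContext

# Each record: (case_id, ordered list of activities in that trace).
log = sc.parallelize([
    ("c1", ["register", "check", "approve"]),
    ("c2", ["register", "check", "reject"]),
    ("c3", ["register", "check", "approve"]),
])

# Emit (a, b) for every consecutive activity pair, then count per pair.
pairs = log.flatMap(lambda rec: zip(rec[1], rec[1][1:]))
counts = pairs.map(lambda ab: (ab, 1)).reduceByKey(lambda x, y: x + y)

THRESHOLD = 2  # hypothetical cut between formal and informal edges
strong = counts.filter(lambda kv: kv[1] >= THRESHOLD).collect()
weak = counts.filter(lambda kv: kv[1] < THRESHOLD).collect()
print("formal edges:", strong)
print("informal edges:", weak)
spark.stop()
```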

    A Branch and Bound Algorithm for Workflow Scheduling

    Nowadays, people are connected to the Internet and use different cloud solutions to store, process, and deliver data. The cloud consists of a collection of virtual servers that promise to provision on-demand computational and storage resources when needed. Workflow data is becoming a ubiquitous term in both science and technology, and there is a strong need for new tools and techniques to process and analyze large-scale, complex datasets that are growing exponentially. A scientific workflow is a sequence of connected tasks with large data transfers from parent tasks to child tasks. Workflow scheduling is the activity of assigning tasks for execution on servers while satisfying resource constraints, and it is an NP-hard problem. In this paper, we propose a scheduling algorithm for workflow data that is derived from the branch and bound algorithm.
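    The branch and bound pattern here is: branch on the next task's server assignment, and prune any partial schedule whose makespan already matches or exceeds the best complete schedule found so far. A compact sketch for independent tasks follows; the precedence and data-transfer constraints of a real workflow are omitted, and the names are illustrative.

```python
# Compact branch-and-bound sketch: assign tasks to servers to minimize
# makespan. Precedence/data-transfer constraints are omitted for brevity.

def branch_and_bound(durations, n_servers):
    best = {"makespan": float("inf"), "plan": None}

    def explore(i, loads, plan):
        if max(loads) >= best["makespan"]:
            return  # bound: partial makespan already no better than best
        if i == len(durations):
            best["makespan"], best["plan"] = max(loads), plan[:]
            return
        for s in range(n_servers):  # branch: try each server for task i
            loads[s] += durations[i]
            plan.append(s)
            explore(i + 1, loads, plan)
            plan.pop()
            loads[s] -= durations[i]

    explore(0, [0.0] * n_servers, [])
    return best["makespan"], best["plan"]

# Four tasks on two servers: optimal makespan is 5.0 (4+1 vs. 2+3).
print(branch_and_bound([4.0, 2.0, 3.0, 1.0], n_servers=2))
```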

    Big Data Analysis-Based Secure Cluster Management for Optimized Control Plane in Software-Defined Networks

    In software-defined networks (SDNs), the abstracted control plane is the defining characteristic, and its core component is the software-based controller. The control plane is logically centralized, but the controllers can be physically distributed across multiple nodes. To meet the service management requirements of large-scale network scenarios, the control plane is usually implemented as a distributed controller cluster. Cluster management technology monitors all types of events and must maintain a consistent global network status, which usually leads to big data in SDNs. At the same time, cluster security is an open issue because of the programmable and dynamic features of SDNs. To address these challenges, this paper proposes a big data analysis-based secure cluster management architecture for an optimized control plane. A security authentication scheme is proposed for cluster management. Moreover, we propose an ant colony optimization-based big data analysis scheme, together with an implementation system, that optimizes the control plane. Simulations and comparisons show the feasibility and efficiency of the proposed scheme, which is significant in improving the security and efficiency of the SDN control plane.
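    As a flavor of the optimization ingredient, the sketch below runs a generic ant colony loop that maps switches to controllers, biasing each choice by pheromone over latency and reinforcing the edges used by good assignments. The topology, latencies, and update rule are assumptions for illustration, not the paper's scheme.

```python
# Generic ant-colony sketch for a switch-to-controller mapping; the data
# and pheromone update rule are illustrative assumptions.
import random

switches = ["s1", "s2", "s3", "s4"]
controllers = ["c1", "c2"]
latency = {("s1", "c1"): 1, ("s1", "c2"): 4,
           ("s2", "c1"): 3, ("s2", "c2"): 2,
           ("s3", "c1"): 2, ("s3", "c2"): 2,
           ("s4", "c1"): 5, ("s4", "c2"): 1}
pheromone = {k: 1.0 for k in latency}  # uniform initial pheromone

def build_assignment():
    """One ant: pick a controller per switch, biased by pheromone/latency."""
    plan = {}
    for s in switches:
        weights = [pheromone[(s, c)] / latency[(s, c)] for c in controllers]
        plan[s] = random.choices(controllers, weights=weights)[0]
    return plan

def cost(plan):
    """Total switch-to-controller latency of an assignment."""
    return sum(latency[(s, c)] for s, c in plan.items())

best = None
for _ in range(50):              # 50 ants, one per iteration
    plan = build_assignment()
    if best is None or cost(plan) < cost(best):
        best = plan
    for k in pheromone:          # evaporate pheromone everywhere
        pheromone[k] *= 0.9
    for s, c in plan.items():    # deposit on the edges this ant used
        pheromone[(s, c)] += 1.0 / cost(plan)

print("best assignment:", best, "cost:", cost(best))
```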