12 research outputs found

    Workflow Scheduling Techniques and Algorithms in IaaS Cloud: A Survey

    In the modern era, workflows are adopted as a powerful and attractive paradigm for expressing and solving a variety of applications, such as scientific, data-intensive computing, and big data applications like MapReduce and Hadoop. These complex applications are described using high-level representations in workflow methods. With the emerging model of cloud computing technology, scheduling in the cloud has become an important research topic. Consequently, the workflow scheduling problem has been studied extensively over the past few years, from homogeneous clusters and grids to the most recent paradigm, cloud computing. The challenges that need to be addressed lie in task-resource mapping, QoS requirements, resource provisioning, performance fluctuation, failure handling, resource scheduling, and data storage. This work presents a complete study of resource provisioning and scheduling algorithms in the cloud environment, focusing on Infrastructure as a Service (IaaS). We provide a comprehensive understanding of existing scheduling techniques and offer insight into research challenges that suggest possible future directions for researchers.

    An efficient resource sharing technique for multi-tenant databases

    Multi-tenancy is one of the key components of the cloud computing environment. Multi-tenant database systems in SaaS (Software as a Service) have gained a lot of attention in academia, research, and business. These database systems provide scalability and economic benefits for both cloud service providers and customers (organizations or companies, referred to as tenants) by sharing the same resources and infrastructure while isolating shared databases, network, and computing resources in compliance with service level agreements (SLAs). In a multi-tenant scenario, active tenants compete for resources in order to access the database. If one tenant blocks the resources, the performance of all the other tenants may be restricted and fair sharing of the resources may be compromised. The performance of tenants must not be affected by the resource-intensive activities and volatile workloads of other tenants. Moreover, the prime goal of providers is to achieve a low cost of operation while satisfying the specific schemas/SLAs of each tenant. Consequently, there is a need to design and develop effective and dynamic resource sharing algorithms that can handle the above-mentioned issues. This work presents a model that uses a query classification and worker sorting technique to share I/O, CPU, and memory efficiently, thus enhancing dynamic resource sharing and improving the utilization of idle instances. The model is referred to as the Multi-Tenant Dynamic Resource Scheduling Model (MTDRSM). MTDRSM supports workload execution of different benchmarks, such as TPC-C (Transaction Processing Performance Council) and YCSB (the Yahoo! Cloud Serving Benchmark), on different databases, such as MySQL, Oracle, and H2. Experiments are conducted for different benchmarks with and without SLA compliance to evaluate the performance of MTDRSM in terms of latency and throughput. The experiments show significant performance improvement over the existing Mute Bench model in terms of latency and throughput.

    A latency-aware max-min algorithm for resource allocation in cloud

    Cloud computing is an emerging distributed computing paradigm. However, it requires certain initiatives tailored to the cloud environment, such as an on-the-fly mechanism for providing resource availability based on the rapidly changing demands of customers. Although resource allocation is an important problem and has been widely studied, there are certain criteria that need to be considered. These criteria include meeting the user's quality of service (QoS) requirements. High QoS can be guaranteed only if resources are allocated in an optimal manner. This paper proposes a latency-aware max-min algorithm (LAM) for the allocation of resources in cloud infrastructures. The proposed algorithm was designed to address challenges associated with resource allocation, such as variations in user demands and on-demand access to unlimited resources. It is capable of allocating resources in a cloud-based environment with the aim of enhancing infrastructure-level performance and maximizing profits through optimal allocation of resources. A priority value, calculated by the analytic hierarchy process (AHP), is also associated with each user. The results show that LAM outperforms other state-of-the-art algorithms and allocates resources flexibly under fluctuating resource demand patterns.
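
    The abstract does not spell out how LAM works, so the following is only a minimal sketch of generic weighted max-min fair allocation (progressive filling), not the paper's algorithm. The function name `weighted_max_min` and the `weights` argument are hypothetical; the weights stand in for per-user priorities such as those the paper derives with AHP.

```python
def weighted_max_min(capacity, demands, weights):
    """Weighted max-min fair shares via progressive filling.

    Illustrative only: `weights` is a hypothetical stand-in for
    per-user priority values; this is not the LAM algorithm itself.
    """
    alloc = {user: 0.0 for user in demands}
    active = {user for user, demand in demands.items() if demand > 0}
    remaining = float(capacity)
    while active and remaining > 1e-12:
        # Capacity offered per unit of weight in this round.
        unit = remaining / sum(weights[u] for u in active)
        # Users whose unmet demand fits inside their offered share.
        satisfied = {u for u in active
                     if demands[u] - alloc[u] <= unit * weights[u]}
        if not satisfied:
            # Nobody can be fully served: split what is left by weight.
            for u in active:
                alloc[u] += unit * weights[u]
            break
        for u in satisfied:
            remaining -= demands[u] - alloc[u]
            alloc[u] = float(demands[u])
        active -= satisfied
    return alloc


shares = weighted_max_min(10, {"a": 2, "b": 4, "c": 10},
                          {"a": 1, "b": 1, "c": 1})
print(shares)
```

    With equal weights, small demands are met in full and the leftover capacity goes to the largest demand, which is the defining property of max-min fairness.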

    Performance optimization and energy efficiency of big-data computing workflows

    Next-generation e-science is producing colossal amounts of data, now frequently termed Big Data, on the order of terabytes at present and petabytes or even exabytes in the foreseeable future. These scientific applications typically feature data-intensive workflows composed of moldable parallel computing jobs, such as MapReduce, with intricate inter-job dependencies. The granularity of task partitioning in each moldable job of such big data workflows has a significant impact on workflow completion time, energy consumption, and financial cost if executed in clouds, which remains largely unexplored. This dissertation conducts an in-depth investigation into the properties of moldable jobs and provides an experiment-based validation of the performance model where the total workload of a moldable job increases along with the degree of parallelism. Furthermore, this dissertation conducts rigorous research on workflow execution dynamics in resource sharing environments and explores the interactions between workflow mapping and task scheduling on various computing platforms. A workflow optimization architecture is developed to seamlessly integrate three interrelated technical components, i.e., resource allocation, job mapping, and task scheduling. Cloud computing provides a cost-effective computing platform for big data workflows where moldable parallel computing models are widely applied to meet stringent performance requirements. Based on the moldable parallel computing performance model, a big-data workflow mapping model is constructed and a workflow mapping problem is formulated to minimize workflow makespan under a budget constraint in public clouds.
    This dissertation shows this problem to be strongly NP-complete and designs i) a fully polynomial-time approximation scheme for a special case with a pipeline-structured workflow executed on virtual machines of a single class, and ii) a heuristic for a generalized problem with an arbitrary directed acyclic graph-structured workflow executed on virtual machines of multiple classes. The performance superiority of the proposed solution is illustrated by extensive simulation-based results in Hadoop/YARN in comparison with existing workflow mapping models and algorithms. Considering that large-scale workflows for big data analytics have become a main consumer of energy in data centers, this dissertation also delves into the problem of static workflow mapping to minimize the dynamic energy consumption of a workflow request under a deadline constraint in Hadoop clusters, which is shown to be strongly NP-hard. A fully polynomial-time approximation scheme is designed for a special case with a pipeline-structured workflow on a homogeneous cluster and a heuristic is designed for the generalized problem with an arbitrary directed acyclic graph-structured workflow on a heterogeneous cluster. This problem is further extended to a dynamic version with deadline-constrained MapReduce workflows to minimize dynamic energy consumption in Hadoop clusters. This dissertation proposes a semi-dynamic online scheduling algorithm based on adaptive task partitioning to reduce dynamic energy consumption while meeting performance requirements from a global perspective, and also develops corresponding system modules for algorithm implementation in the Hadoop ecosystem.
    The performance superiority of the proposed solutions in terms of dynamic energy saving and deadline missing rate is illustrated by extensive simulation results in comparison with existing algorithms, and further validated through real-life workflow implementation and experiments using the Oozie workflow engine in Hadoop/YARN systems.
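
    The trade-off behind the budget-constrained mapping problem can be illustrated with a toy example. The sketch below assumes a hypothetical linear overhead model (total work = base work + overhead × degree of parallelism) on identical virtual machines billed per unit of VM time, and it searches all plans by brute force. Neither the model nor the search is taken from the dissertation, whose approximation scheme and heuristic are not described in the abstract.

```python
from itertools import product


def job_time_and_cost(base_work, overhead, p, price=1.0):
    """Time and cost of one moldable job run with p parallel tasks.

    Assumed toy model: total work grows linearly with p, the work is
    split evenly over p identical VMs, and billing is per VM-time unit.
    """
    total_work = base_work + overhead * p
    time = total_work / p
    cost = total_work * price  # p VMs, each busy for `time`
    return time, cost


def best_pipeline_plan(jobs, max_p, budget):
    """Exhaustively pick a parallelism degree per pipeline stage.

    Returns (makespan, cost, plan), or None if no plan fits the budget.
    """
    best = None
    for plan in product(range(1, max_p + 1), repeat=len(jobs)):
        pairs = [job_time_and_cost(w, o, p) for (w, o), p in zip(jobs, plan)]
        makespan = sum(t for t, _ in pairs)  # pipeline stages run in sequence
        cost = sum(c for _, c in pairs)
        if cost <= budget and (best is None or makespan < best[0]):
            best = (makespan, cost, plan)
    return best


# Two stages given as (base_work, overhead) pairs; values are made up.
print(best_pipeline_plan([(12, 1), (6, 2)], max_p=3, budget=25))
```

    Under this model, raising the parallelism of a stage shortens it but increases the total billed work, so the budget caps how far each stage can be split.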

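    Ordonnancement multi-objectifs de workflows dans le cloud : un modèle plus réaliste avec tâches de durée stochastique (Multi-objective workflow scheduling in the cloud: a more realistic model with tasks of stochastic duration)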

    The flexibility in resource availability offered by the cloud makes it possible to adapt the scheduling of the tasks executed there to the unpredictability of certain parameters. However, existing scheduling methods rely on oversimplified models: workflows are fully deterministic, or their structure is not considered, or the cloud model ignores certain key aspects of the platform. This article proposes a model that accounts for the fact that the number of instructions making up a task may be non-deterministic, without entirely sacrificing the complexity of the platform or the structure of the workflow. We also propose some directions for developing a multi-objective scheduling method based on this model.
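
    One way to picture a workflow whose tasks have a non-deterministic number of instructions is to sample instruction counts and recompute the makespan of the task graph each time. The sketch below is an illustration under stated assumptions, not the article's model: the Gaussian distribution, the single machine speed, unlimited parallel resources, and the names `makespan` and `sample_makespans` are all hypothetical.

```python
import random
from graphlib import TopologicalSorter


def makespan(durations, preds):
    """Finish time of a DAG when every task starts as soon as its
    predecessors are done (assumes unlimited parallel resources)."""
    graph = {task: preds.get(task, ()) for task in durations}
    finish = {}
    for task in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[task]), default=0.0)
        finish[task] = start + durations[task]
    return max(finish.values())


def sample_makespans(mean_instr, rel_std, speed, preds, runs=1000, seed=0):
    """Monte Carlo samples of the makespan.

    Assumption: instruction counts are Gaussian around their mean with
    relative standard deviation `rel_std`, truncated at zero.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(runs):
        durations = {task: max(0.0, rng.gauss(m, rel_std * m)) / speed
                     for task, m in mean_instr.items()}
        samples.append(makespan(durations, preds))
    return samples


# Diamond-shaped workflow: a -> (b, c) -> d; all numbers are made up.
preds = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
mean_instr = {"a": 200, "b": 300, "c": 100, "d": 400}
samples = sample_makespans(mean_instr, rel_std=0.2, speed=100, preds=preds)
print(sum(samples) / len(samples))
```

    The spread of the sampled makespans gives a rough idea of how much a deterministic schedule could misjudge the completion time.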

    Time critical requirements and technical considerations for advanced support environments for data-intensive research

    Data-centric approaches play an increasing role in many scientific domains, but in turn rely increasingly heavily on advanced research support environments for coordinating research activities, providing access to research data, and choreographing complex experiments. Critical time constraints can be seen in several application scenarios, e.g., event detection for disaster early warning, runtime execution steering, and failure recovery. However, providing support for executing such time-critical research applications remains challenging in many current research support environments. In this paper, we analyse time-critical requirements in three key kinds of research support environment (Virtual Research Environments, Research Infrastructures, and e-Infrastructures) and review the current state of the art. An approach for dynamic infrastructure planning is discussed that may help to address some of these requirements. The work is based on requirements collection recently performed in three EU H2020 projects: SWITCH, ENVRIPLUS and VRE4EIC.