
    Scheduling with processing set restrictions : a survey


    Scheduling with unexpected machine breakdowns

    We investigate an online version of a basic scheduling problem where a set of jobs has to be scheduled on a number of identical machines so as to minimize the makespan. The job processing times are known in advance and preemption of jobs is allowed. Machines are non-continuously available, i.e., they can break down and recover at arbitrary time instances not known in advance. New machines may be added as well. Thus machine availabilities change online. We first show that no online algorithm can construct optimal schedules. We also show that no online algorithm can achieve a bounded competitive ratio if there may be time intervals where no machine is available. Then we present an online algorithm that constructs schedules with an optimal makespan of C_max^OPT if a lookahead of one is given, i.e., the algorithm always knows the next point in time when the set of available machines changes. Finally, we give an online algorithm without lookahead that constructs schedules with a nearly optimal makespan of C_max^OPT + ε, for any ε > 0, if at any time at least one machine is available. Our results demonstrate that not knowing machine availabilities in advance is of little harm.

    Performance optimization and energy efficiency of big-data computing workflows

    Next-generation e-science is producing colossal amounts of data, now frequently termed Big Data, on the order of terabytes at present and petabytes or even exabytes in the foreseeable future. These scientific applications typically feature data-intensive workflows composed of moldable parallel computing jobs, such as MapReduce, with intricate inter-job dependencies. The granularity of task partitioning in each moldable job of such big data workflows has a significant impact on workflow completion time, energy consumption, and, if executed in clouds, financial cost; this impact remains largely unexplored. This dissertation conducts an in-depth investigation into the properties of moldable jobs and provides an experiment-based validation of the performance model where the total workload of a moldable job increases along with the degree of parallelism. Furthermore, this dissertation conducts rigorous research on workflow execution dynamics in resource sharing environments and explores the interactions between workflow mapping and task scheduling on various computing platforms. A workflow optimization architecture is developed to seamlessly integrate three interrelated technical components, i.e., resource allocation, job mapping, and task scheduling. Cloud computing provides a cost-effective computing platform for big data workflows where moldable parallel computing models are widely applied to meet stringent performance requirements. Based on the moldable parallel computing performance model, a big-data workflow mapping model is constructed and a workflow mapping problem is formulated to minimize workflow makespan under a budget constraint in public clouds. 
This dissertation shows this problem to be strongly NP-complete and designs i) a fully polynomial-time approximation scheme for a special case with a pipeline-structured workflow executed on virtual machines of a single class, and ii) a heuristic for a generalized problem with an arbitrary directed acyclic graph-structured workflow executed on virtual machines of multiple classes. The performance superiority of the proposed solution is illustrated by extensive simulation-based results in Hadoop/YARN in comparison with existing workflow mapping models and algorithms. Considering that large-scale workflows for big data analytics have become a main consumer of energy in data centers, this dissertation also delves into the problem of static workflow mapping to minimize the dynamic energy consumption of a workflow request under a deadline constraint in Hadoop clusters, which is shown to be strongly NP-hard. A fully polynomial-time approximation scheme is designed for a special case with a pipeline-structured workflow on a homogeneous cluster and a heuristic is designed for the generalized problem with an arbitrary directed acyclic graph-structured workflow on a heterogeneous cluster. This problem is further extended to a dynamic version with deadline-constrained MapReduce workflows to minimize dynamic energy consumption in Hadoop clusters. This dissertation proposes a semi-dynamic online scheduling algorithm based on adaptive task partitioning to reduce dynamic energy consumption while meeting performance requirements from a global perspective, and also develops corresponding system modules for algorithm implementation in the Hadoop ecosystem. 
The performance superiority of the proposed solutions in terms of dynamic energy saving and deadline missing rate is illustrated by extensive simulation results in comparison with existing algorithms, and further validated through real-life workflow implementation and experiments using the Oozie workflow engine in Hadoop/YARN systems.

    Single machine scheduling with periodic machine availability

    In this paper we address the problem of scheduling jobs on a single machine with cyclical machine availability periods. In this problem, the scheduling horizon is composed of periods where the machine is available, each followed by a period where no operation can be performed. In the literature, the problem is denoted as scheduling with periodic maintenance, as it is usually assumed that these unavailability periods are employed to perform maintenance activities. Another situation is the one inspiring our research, i.e., the need to complete manufacturing operations within a shift. More specifically, we focus on the single machine scheduling problem with makespan objective subject to periodic machine availability. Because the problem has been shown to be NP-hard, several contributions propose approximate procedures. However, we are not aware of a computational evaluation comparing these procedures. Furthermore, the problem is similar to the classical bin packing problem, so it is of interest to explore the relation between both problems. In this paper, we address these two issues and propose new approximate solution procedures for the problem. Ministerio de Ciencia e Innovación DPI2016-80750-
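    The bin packing connection mentioned in this abstract can be illustrated with a minimal sketch. It is not the procedure proposed in the paper: it swaps in the classical first-fit decreasing heuristic, and it assumes that all availability periods have the same length, that every unavailability period has the same fixed length, and that a job must finish inside the period where it starts. The names `first_fit_decreasing`, `makespan`, `period`, and `downtime` are hypothetical.

```python
from typing import List


def first_fit_decreasing(durations: List[int], period: int) -> List[List[int]]:
    """Pack jobs into availability periods (the "bins") of length `period`.

    Jobs are taken longest first and placed in the first period with room.
    Assumption: a job cannot be split across two periods.
    """
    bins: List[List[int]] = []
    loads: List[int] = []
    for d in sorted(durations, reverse=True):
        if d > period:
            raise ValueError("job longer than one availability period")
        for i, load in enumerate(loads):
            if load + d <= period:
                bins[i].append(d)
                loads[i] += d
                break
        else:
            # No existing period has room, so open a new one.
            bins.append([d])
            loads.append(d)
    return bins


def makespan(bins: List[List[int]], period: int, downtime: int) -> int:
    """Makespan when the lightest-loaded period is scheduled last.

    Every period except the last costs its full length plus the downtime
    that follows it; only the actual load of the last period counts.
    """
    if not bins:
        return 0
    last_load = min(sum(b) for b in bins)
    return (len(bins) - 1) * (period + downtime) + last_load


if __name__ == "__main__":
    packed = first_fit_decreasing([4, 3, 3, 2, 2], period=7)
    print(packed, makespan(packed, period=7, downtime=1))
```

    Under these assumptions, using fewer periods corresponds to using fewer bins, which is why bin packing heuristics are natural candidates for this problem; the two objectives differ only in how the load of the last period is counted.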