74 research outputs found

    Steady-State for Batches of Identical Task Graphs

    In this paper, we focus on the problem of scheduling batches of identical task graphs on a heterogeneous platform, when the task graph is a tree. We rely on steady-state scheduling and aim to reach the optimal throughput of the system. Contrary to previous studies, we concentrate on the scheduling of batches of limited size. We try to reduce the processing time of each instance, thus making steady-state scheduling applicable to smaller batches. The problem is proven NP-complete, and a mixed integer program is presented to solve it. Then, different solutions, using steady-state scheduling or not, are evaluated through comprehensive simulations.

    Static Scheduling Strategies for Heterogeneous Systems

    In this paper, we consider static scheduling techniques for heterogeneous systems, such as clusters and grids. We deal in turn with minimum makespan scheduling, divisible load scheduling, and steady-state scheduling. Finally, we discuss the limitations of static scheduling approaches.

    Comparison of Batch Scheduling for Identical Multi-Tasks Jobs on Heterogeneous Platforms

    In this paper, we consider the scheduling of a batch of instances of the same job on a heterogeneous execution platform. A job is represented by a directed acyclic graph without forks (intree) but with typed tasks. The execution resources are distributed, and each resource can carry out a set of task types. The objective is to minimize the makespan of the batch execution. Three algorithms are studied in this context: an on-line algorithm, a genetic algorithm, and a steady-state algorithm. The contribution of this paper lies in the experimental analysis of these algorithms and in their adaptation to the context. We show that their performance depends on the size of the batch and on the characteristics of the execution platform.

    Processing Identical Workflows on SOA Grids: Comparison of Three Approaches

    In this paper, we consider the scheduling of a batch of workflows on a service-oriented grid. A job is represented by a directed acyclic graph without forks (intree) but with typed tasks. The processors are distributed, and each processor has a set of services that carry out equivalent task types. The objective is to minimize the makespan of the batch execution. Three algorithms are studied in this context: an on-line algorithm, a genetic algorithm, and a steady-state algorithm. The contribution of this paper lies in the experimental analysis of these algorithms and in their adaptation to the context. We show that their performance depends on the size and complexity of the batch and on the characteristics of the execution platform.

    Fair scheduling of bag-of-tasks applications using distributed Lagrangian optimization

    Large-scale distributed systems typically comprise hundreds to millions of entities (applications, users, companies, universities) that have only a partial view of resources (computers, communication links). How to share such resources fairly and efficiently between entities in a distributed way has thus become a critical question. Although not all applications are suitable for execution on large-scale distributed computing platforms, Bag-of-Tasks (BoT) applications are ideal candidates. Hence a large fraction of the jobs in workloads imposed on Grids consists of sequential applications submitted in the form of BoTs. Up until now, mainly simple mechanisms have been used to ensure a fair sharing of resources among these applications. Although these mechanisms have proved efficient for CPU-bound applications, they are known to be ineffective in the presence of network-bound applications. A possible answer relies on Lagrangian optimization and distributed gradient descent. Under certain conditions, the resource sharing problem can be formulated as a global optimization problem, which can be solved by a distributed self-stabilizing supply and demand algorithm. In the last decade, this technique has been applied to design various network protocols (variants of TCP, multi-path network protocols, wireless network protocols) and even distributed algorithms for smart grids. In this article, we explain how to use this technique for fairly scheduling concurrent BoT applications with arbitrary communication-to-computation ratios on a Grid. Yet, application heterogeneity raises severe convergence and stability issues that did not appear in the previous contexts and need to be addressed by non-trivial modifications. The effectiveness of our proposal is assessed through an extensive set of complex and realistic simulations.

    Multi-criteria scheduling of pipeline workflows

    Mapping workflow applications onto parallel platforms is a challenging problem, even for simple application patterns such as pipeline graphs. Several antagonistic criteria should be optimized, such as throughput and latency (or a combination). In this paper, we study the complexity of the bi-criteria mapping problem for pipeline graphs on communication-homogeneous platforms. In particular, we assess the complexity of the well-known chains-to-chains problem for different-speed processors, which turns out to be NP-hard. We provide several efficient polynomial bi-criteria heuristics, and their relative performance is evaluated through extensive simulations.

    Modeling heterogeneous processor scheduling for real time systems

    A new model is presented to describe dataflow algorithms implemented in a multiprocessing system. Called the resource/data flow graph (RDFG), the model explicitly represents cyclo-static processor schedules as circuits of processor arcs, which reflect the order in which processors execute graph nodes. The model also makes it possible to guarantee that hard real-time deadlines are met. When unfolded, the model statically identifies the processor schedule. The model is therefore useful for determining the throughput and latency of systems with heterogeneous processors. The applicability of the model is demonstrated using a space surveillance algorithm.

    DeepSoCS: A Neural Scheduler for Heterogeneous System-on-Chip (SoC) Resource Scheduling

    In this paper, we present a novel scheduling solution for a class of System-on-Chip (SoC) systems where heterogeneous chip resources (DSP, FPGA, GPU, etc.) must be efficiently scheduled for continuously arriving hierarchical jobs whose tasks are represented by a directed acyclic graph. Traditionally, heuristic algorithms have been widely used in many resource scheduling domains, and Heterogeneous Earliest Finish Time (HEFT) has been a dominant state-of-the-art technique across a broad range of heterogeneous resource scheduling domains for many years. Despite their long-standing popularity, HEFT-like algorithms are known to be vulnerable to a small amount of noise added to the environment. Our Deep Reinforcement Learning (DRL)-based SoC Scheduler (DeepSoCS), capable of learning the "best" task ordering under dynamic environment changes, overcomes the brittleness of rule-based schedulers such as HEFT, with significantly higher performance across different types of jobs. We describe a DeepSoCS design process using a real-time heterogeneous SoC scheduling emulator, discuss major challenges, and present two novel neural network design features that lead to outperforming HEFT: (i) hierarchical job- and task-graph embedding; and (ii) efficient use of real-time task information in the state space. Furthermore, we introduce effective techniques to address two fundamental challenges present in our environment: delayed consequences and joint actions. Through an extensive simulation study, we show that DeepSoCS achieves significantly better job execution time than HEFT, with a higher level of robustness under realistic noise conditions. We conclude with a discussion of potential improvements to the DeepSoCS neural scheduler. Comment: 18 pages, Accepted by Electronics 202

    Semantic-Preserving Transformations for Stream Program Orchestration on Multicore Architectures

    Because the demand for high performance in big data processing and distributed computing is increasing, the stream programming paradigm has been revisited for its abundance of parallelism by virtue of independent actors that communicate via data channels. The synchronous data-flow (SDF) programming model is frequently adopted in stream programming languages because it conveniently expresses stream programs as a set of nodes connected by data channels. The static data rates of the SDF programming model enable program transformations that greatly improve the performance of SDF programs on multicore architectures. The major application domains for SDF programs are digital signal processing, audio, video, graphics kernels, networking, and security. This thesis makes the following three contributions that improve the performance of SDF programs. First, a new intermediate representation (IR) called LaminarIR is introduced. LaminarIR replaces FIFO queues with direct memory accesses to reduce the data communication overhead and makes the data dependencies between producer and consumer nodes explicit. We provide transformations and their formal semantics to convert conventional, FIFO-queue-based program representations to LaminarIR. Second, we present a compiler framework that performs sound and semantics-preserving program transformations from FIFO semantics to LaminarIR. We employ static program analysis to resolve token positions in FIFO queues and replace them with direct memory accesses. Third, we propose a communication-cost-aware program orchestration method that establishes a foundation for LaminarIR parallelization on multicore architectures. The LaminarIR framework, which consists of the aforementioned contributions together with the benchmarks used in the experimental evaluation, has been open-sourced to encourage further research on improving the performance of stream programming languages.

    Optimizing the Cost of an Heterogeneous Distributed Platform

    Distributed platforms are becoming heterogeneous in more and more domains, such as heterogeneous computing (HC) on grids or reconfigurable factories in industry. For production grids and factories, it is mandatory to control and optimize the economic cost of such platforms with respect to performance objectives. In this paper, we present a study whose purpose is to optimize the size of such environments depending on the workflow to execute or the product to manufacture. The target platforms are either micro-factories, sized to manufacture products at the micrometric scale, or the heterogeneous computing domain, where the key point is to reserve processors of an execution platform on a grid to compute workflows such as medical imaging applications. Once the platform has been sized, optimally or not, scheduling a workflow in an HC environment or a production run in the micro-factory is easy, because the size of the platform already takes the performance constraints into account. We present general results on platform size optimization. Numerical results are also presented to illustrate three cases of our study.