
    On the Importance of Bandwidth Control Mechanisms for Scheduling on Large Scale Heterogeneous Platforms

    We study three scheduling problems (file redistribution, independent task scheduling, and broadcasting) on large-scale heterogeneous platforms under the Bounded Multi-port Model. In this model, each node is associated with an incoming and an outgoing bandwidth, and it can be involved in an arbitrary number of communications, provided that neither its incoming nor its outgoing bandwidth is exceeded. This model corresponds well to modern networking technologies: it can be used when programming at the TCP level and is also implemented in modern message-passing libraries such as MPICH2. We prove, using the three scheduling problems mentioned above, that this model is tractable and that even very simple distributed algorithms can achieve optimal performance, provided that bandwidth-sharing policies can be enforced. Our goal is to assert the necessity of such QoS mechanisms, which are now available in the kernels of modern operating systems, to achieve optimal performance. We prove that implementations of optimal algorithms that do not enforce the prescribed bandwidth sharing can fail by a large amount if only TCP contention mechanisms are used. More precisely, for each scheduling problem considered, we establish upper bounds on the performance loss that can be induced by TCP bandwidth-sharing mechanisms, we prove that these upper bounds are tight by exhibiting instances that achieve them, and we provide a set of simulations using SimGRID to analyze the practical impact of bandwidth control mechanisms.
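
    The feasibility condition of the Bounded Multi-port Model described in this abstract can be illustrated with a minimal sketch. The function name, the dictionary-based representation of rates and capacities, and the node labels are assumptions made for illustration only; they do not come from the paper.

```python
from collections import defaultdict

def is_feasible(rates, out_cap, in_cap, eps=1e-9):
    """Check a set of simultaneous transfers against per-node bandwidth caps.

    rates:   dict mapping (source, destination) -> transfer rate
    out_cap: dict mapping node -> outgoing bandwidth
    in_cap:  dict mapping node -> incoming bandwidth
    All names and the data layout are hypothetical, chosen for this sketch.
    """
    out_load = defaultdict(float)
    in_load = defaultdict(float)
    for (src, dst), rate in rates.items():
        if rate < 0:
            raise ValueError("rates must be non-negative")
        out_load[src] += rate   # every transfer consumes the sender's outgoing bandwidth
        in_load[dst] += rate    # and the receiver's incoming bandwidth
    # A node may take part in any number of transfers, as long as
    # neither of its two aggregate loads exceeds the matching cap.
    senders_ok = all(load <= out_cap.get(n, 0.0) + eps for n, load in out_load.items())
    receivers_ok = all(load <= in_cap.get(n, 0.0) + eps for n, load in in_load.items())
    return senders_ok and receivers_ok

# Hypothetical three-node example.
out_cap = {"A": 10.0, "B": 5.0}
in_cap = {"B": 6.0, "C": 8.0}
ok = is_feasible({("A", "B"): 4.0, ("A", "C"): 6.0, ("B", "C"): 2.0}, out_cap, in_cap)
too_much = is_feasible({("A", "B"): 4.0, ("A", "C"): 7.0}, out_cap, in_cap)
```

    In the first call node A sends at an aggregate rate of 10 and node C receives at 8, both exactly at their caps, so the allocation is feasible; in the second call A would need to send at 11, which exceeds its outgoing bandwidth.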

    Communications collectives et ordonnancement en régime permanent sur plates-formes hétérogènes

    The results presented in this document concern scheduling for large-scale heterogeneous platforms. We mainly focus on collective communications taking place during the execution of distributed applications, for instance the broadcast, scatter, or reduction of data. We study the steady-state operation of these communications and aim to maximize the throughput of a series of similar communications. The goal is also to obtain a schedule that is asymptotically optimal for makespan minimization. We present a general framework for studying these communications, which we use to assess the complexity of the problem for each communication primitive. For a particular communication model, the bidirectional one-port model, we develop a practical method for solving the problem, based on a linear program in rational numbers. These results are illustrated by experiments on the Grid5000 platform. The study of steady-state operation is extended to scheduling multiple applications on computing grids.

    A Pipelined Broadcast for Multidimensional Meshes

    No full text
    We address the problem of performing a pipelined broadcast on a mesh architecture. Meshes require a different approach than other topologies, and their very nature puts a tighter bound on the performance that one can hope to achieve. By using the appropriate techniques, however, one can obtain excellent performance for sufficiently long messages. The resulting algorithm will work on meshes of any dimension with any number of nodes. Our model assumes that the mesh is a torus and/or that it has bidirectional links and uses wormhole routing. Performance data from the Cray T3D are included. Keywords: broadcast, pipelining, communication, mesh, torus 1. Introduction The broadcast is a fundamental routine in any communication library. As a result, its implementation should be as efficient as possible. To that end, no single algorithm will suffice: A routine which minimizes the number of message operations will perform well for short messages, when start-up latencies are the primary time c..
