4 research outputs found

    Scheduling Associative Reductions with Homogeneous Costs when Overlapping Communications and Computations

    Reduction is a core operation in parallel computing. Optimizing its cost can have a high impact on application execution time, particularly in MPI and MapReduce computations. In this paper, we propose an optimal algorithm for scheduling associative reductions. We focus on the case where communications and computations can be overlapped to fully exploit resources. Our algorithm greedily builds a spanning tree by starting from the sink and adding a parent at each iteration. Bounds on the completion time of optimal schedules are then characterized. To show the algorithm's extensibility, we adapt it to model variations in which either communication or computation resources are limited. Moreover, we study two specific spanning trees: while the binomial tree is optimal when there is either no transfer or no computation, the Fibonacci tree is optimal when the transfer cost is equal to the computation cost. Finally, we determine approximation ratios for strategies derived from those trees.

    Optimizing the steady-state throughput of scatter and reduce operations on heterogeneous platforms

    In this paper, we consider the communications involved in the execution of a complex application deployed on a heterogeneous large-scale distributed platform. Such applications make intensive use of collective macro-communication schemes, such as scatters, personalized all-to-alls, or gather/reduce operations. Rather than aiming at minimizing the execution time of a single macro-communication, we focus on the steady-state operation. We assume that there is a large number of macro-communications to perform in pipeline fashion, and we aim at maximizing the throughput, i.e., the (rational) number of macro-communications that can be initiated at every time-step. We target heterogeneous platforms, modeled by a graph where resources have different communication and computation speeds. The situation is simpler for series of scatters or personalized all-to-alls than for series of reduce operations, because reduces allow combining various partial reductions of the local values and interleaving computations with communications. In all cases, we show how to determine the optimal throughput and how to exhibit a concrete periodic schedule that achieves this throughput.