
    Genet: A Quickly Scalable Fat-Tree Overlay for Personal Volunteer Computing using WebRTC

    WebRTC enables browsers to exchange data directly, but the number of possible concurrent connections to a single source is limited. We overcome this limitation by organizing participants in a fat-tree overlay: when the maximum number of connections of a tree node is reached, new participants connect to the node's children. Our design scales quickly when a large number of participants join in a short amount of time, by relying on a novel scheme that requires only local information to route connection messages: the destination is derived from the hash value of the combined identifiers of the message's source and of the node holding the message. The scheme provides deterministic routing of a sequence of connection messages from a single source and probabilistic balancing of newer connections among the leaves. We show that this design puts at least 83% of nodes at the same depth as a deterministic algorithm, can connect a thousand browser windows in 21-55 seconds on a local network, and can be deployed for volunteer computing to tap into 320 cores in less than 30 seconds on a local network, increasing the total throughput of the Collatz application by two orders of magnitude compared to a single core.
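    The local routing rule described above can be illustrated with a minimal sketch. The `Node` class, the string identifiers, and the use of SHA-256 are hypothetical choices for illustration, not Genet's actual implementation: a node that still has capacity accepts the new participant as a child; a full node hashes the source and holder identifiers together to pick which child forwards the connection message.

    ```python
    import hashlib

    class Node:
        """One node of a fat-tree overlay (illustrative sketch only)."""
        def __init__(self, node_id, max_children=2):
            self.node_id = node_id
            self.max_children = max_children
            self.children = []

        def route(self, source_id):
            """Accept the connection here if capacity remains; otherwise
            forward it to a child chosen from the hash of the combined
            source and holder identifiers (local information only)."""
            if len(self.children) < self.max_children:
                child = Node(source_id, self.max_children)
                self.children.append(child)
                return child
            digest = hashlib.sha256(f"{source_id}:{self.node_id}".encode()).digest()
            index = digest[0] % len(self.children)
            return self.children[index].route(source_id)

    root = Node("root")
    for i in range(15):
        root.route(f"peer-{i}")
    ```

    Because the hash depends only on the two identifiers, replaying the same sequence of sources yields the same tree (deterministic routing), while distinct sources spread over the children roughly uniformly (probabilistic balancing).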

    ULC Minutes, May 9, 2022

    Notes from the ULC meeting of May 9, 2022.

    Cosmological Simulations on a Grid of Computers

    The work presented in this paper aims at restricting the input parameter values of the semi-analytical model used in GALICS and MOMAF, so as to determine which parameters most influence the results, e.g., star formation, feedback, and halo recycling efficiencies. Our approach is empirical: we run many simulations and derive the correct ranges of values. The computation time needed is so large that we need to run on a grid of computers. Hence, we model GALICS and MOMAF execution time and output file size, and run the simulations using a grid middleware: DIET. All the complexity of accessing resources, scheduling simulations, and managing data is harnessed by DIET and hidden behind a web portal accessible to the users. Comment: Accepted and published in AIP Conference Proceedings 1241, 2010, pages 816-82.

    Impact of Mixed-Parallelism on Parallel Implementations of Strassen and Winograd Matrix Multiplication Algorithms

    In this paper we study the impact of the simultaneous exploitation of data- and task-parallelism on the Strassen and Winograd matrix multiplication algorithms. We present two mixed-parallel implementations. The former follows the phases of the original algorithms, while the latter was designed as the result of a list-scheduling algorithm. We give a theoretical comparison, in terms of memory usage and execution time, between our algorithms and classical data-parallel implementations. This analysis is corroborated by experiments. Finally, we give some hints about a heterogeneous version of our algorithms.
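    The task-parallelism exploited here comes from Strassen's well-known seven-product recursion, which the mixed-parallel implementations distribute across processors. A sequential sketch of that recursion (for even-sized square matrices; the parallel distribution of the seven sub-products is not shown) looks as follows:

    ```python
    import numpy as np

    def strassen(A, B):
        """One level of Strassen recursion: seven sub-products M1..M7
        instead of the eight of the classical algorithm (sequential
        illustration; each Mi is an independent task in a mixed-parallel
        implementation)."""
        n = A.shape[0]
        if n <= 2:
            return A @ B
        h = n // 2
        A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
        M1 = strassen(A11 + A22, B11 + B22)
        M2 = strassen(A21 + A22, B11)
        M3 = strassen(A11, B12 - B22)
        M4 = strassen(A22, B21 - B11)
        M5 = strassen(A11 + A12, B22)
        M6 = strassen(A21 - A11, B11 + B12)
        M7 = strassen(A12 - A22, B21 + B22)
        C11 = M1 + M4 - M5 + M7
        C12 = M3 + M5
        C21 = M2 + M4
        C22 = M1 - M2 + M3 + M6
        return np.block([[C11, C12], [C21, C22]])
    ```

    The seven products M1..M7 are mutually independent, which is precisely what a list-scheduling algorithm can exploit when mapping them onto processor groups.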

    A Bi-Criteria Algorithm for Scheduling Parallel Task Graphs on Clusters

    Applications structured as parallel task graphs exhibit both data and task parallelism, and arise in many domains. Scheduling these applications on parallel platforms has been a long-standing challenge. In the case of a single homogeneous cluster, most of the existing algorithms focus on reducing the application completion time (makespan). But in the presence of resource managers such as batch schedulers, and due to growing energy concerns, the produced schedules also have to be efficient in terms of resource usage. In this paper we propose a novel bi-criteria algorithm, called biCPA, able to optimize these two performance metrics either simultaneously or separately. Using simulation over a wide range of experimental scenarios, we find that biCPA leads to better results than previously published algorithms.

    On the sustainability of large-scale computer science testbeds: the Grid'5000 case

    In this position paper, we look at the financial sustainability of Grid'5000. The duration of the project (over 12 years) owes more to successive investment decisions and continued support than to a well-understood and well-operated business model generating enough revenue to cover costs and investments. We give an overview of a typical cost structure for a large-scale testbed, summarize our views in a few statements, and develop the pros and cons of a few funding sources. The way Grid'5000 is funded is detailed, before giving some of the data used to compute the unit cost of Grid'5000 resources.

    Dynamic Scheduling of MapReduce Shuffle under Bandwidth Constraints

    Whether it is for e-science or business, the amount of data produced every year is growing at a high rate. Managing and processing those data raises new challenges. MapReduce is one answer to the need for scalable tools able to handle such volumes of data. It imposes a general structure of computation and lets the implementation perform its own optimizations. During the computation, there is a phase called Shuffle in which every node sends a possibly large amount of data to every other node. This report proposes and evaluates six algorithms to improve data transfers during the Shuffle phase under bandwidth constraints.
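    To illustrate the all-to-all communication pattern of the Shuffle phase (this is a generic contention-free baseline, not one of the report's six algorithms): with n nodes, the transfers can be organized in n-1 rounds where, in round r, node i sends its partition to node (i + r) mod n, so that every node receives from exactly one sender per round and no receiver's bandwidth is oversubscribed.

    ```python
    def shuffle_rounds(n):
        """Round-based all-to-all schedule for n nodes (baseline sketch).
        Returns a list of n-1 rounds; each round is a list of
        (sender, receiver) pairs forming a permutation, so each node
        sends to and receives from exactly one peer per round."""
        return [[(i, (i + r) % n) for i in range(n)] for r in range(1, n)]
    ```

    Such a rotation schedule covers every ordered sender/receiver pair exactly once; the harder problem the report addresses is doing this well when link bandwidths are heterogeneous and constrained.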

    Simultaneous Scheduling of Replication and Computation for Data-Intensive Applications on the Grid

    One of the first motivations for using grids comes from applications managing large data sets, for example in High Energy Physics or Life Sciences. To improve the global throughput of software environments, replicas are usually placed at wisely selected sites. Moreover, computation requests have to be scheduled among the available resources. To get the best performance, scheduling and data replication have to be tightly coupled, which is not always the case in existing approaches. This paper presents an algorithm that combines data management and scheduling using a steady-state approach. Our theoretical results are validated using simulation and logs from a large life-science application (ACI GRID GriPPS).

    SimGrid Cloud Broker: Simulating the Amazon AWS Cloud

    Validating a new application over a Cloud is not an easy task, and it can be costly over public Clouds. Simulation is a good solution if the simulator is accurate enough and if it provides all the features of the target Cloud. In this report, we propose an extension of the SimGrid simulation toolkit to simulate the Amazon IaaS Cloud. Based on an extensive study of the Amazon platform and previous evaluations, we integrate models into the SimGrid Cloud Broker and expose the same API as Amazon to the users. Our experimental results show that our simulator is able to simulate different parts of Amazon for different applications.