
    Bedibe: Datasets and Software Tools for Distributed Bandwidth Prediction

    Being able to predict available bandwidth is a crucial problem for a large number of distributed applications on the Internet. Several solutions have been proposed, but the lack of common implementations and of recognized datasets makes it difficult to compare and reproduce results. In this article, we present bedibe, the combination of bandwidth measurements performed on PlanetLab with a software tool that simplifies writing and studying bandwidth prediction algorithms. bedibe includes implementations of the best solutions from the literature, and aims to ease the comparison of results obtained by the different teams working on this topic.
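    The abstract does not show bedibe's actual API, so the following is only a hypothetical sketch of the kind of evaluation such a tool facilitates: scoring a bandwidth predictor against a PlanetLab-style measurement matrix. All names are illustrative assumptions, not bedibe's interface.

```python
# Hypothetical sketch (not bedibe's actual API): scoring a bandwidth
# predictor against a PlanetLab-style matrix of pairwise measurements.
import numpy as np

def relative_error(measured, predicted, mask):
    """Mean relative prediction error over the measured pairs in `mask`."""
    m, p = measured[mask], predicted[mask]
    return np.mean(np.abs(p - m) / m)

# Toy data: a 4x4 matrix of pairwise available-bandwidth values (Mbit/s).
rng = np.random.default_rng(0)
measured = rng.uniform(10, 100, size=(4, 4))
mask = ~np.eye(4, dtype=bool)          # ignore the diagonal (self-pairs)

# A trivial baseline predictor: every pair gets the global mean.
predicted = np.full_like(measured, measured[mask].mean())
print(f"baseline relative error: {relative_error(measured, predicted, mask):.2f}")
```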

    Generalizing Kronecker graphs in order to model searchable networks

    This paper describes an extension to stochastic Kronecker graphs that provides the special structure required for searchability, by defining a “distance”-dependent Kronecker operator. We show how this extension of Kronecker graphs can generate several existing social network models, such as the Watts-Strogatz small-world model and Kleinberg’s lattice-based model. We focus on a specific example of an expanding hypercube, reminiscent of recently proposed social network models based on a hidden hyperbolic metric space, and prove that a greedy forwarding algorithm can find very short paths of length O((log log n)^2) for graphs with n nodes.
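    As a minimal sketch of the routing primitive analyzed here (not of the distance-dependent Kronecker construction itself), greedy forwarding on a plain hypercube moves a message, at each step, to the neighbor closest in Hamming distance to the target:

```python
# Greedy forwarding on a hypercube: at each step the message moves to the
# neighbor whose Hamming distance to the target is smallest. On the plain
# hypercube this always makes progress, so the route terminates.

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

def greedy_route(source: int, target: int, dim: int) -> list[int]:
    path, node = [source], source
    while node != target:
        # Neighbors differ in exactly one of the `dim` bits.
        neighbors = [node ^ (1 << i) for i in range(dim)]
        node = min(neighbors, key=lambda v: hamming(v, target))
        path.append(node)
    return path

print(greedy_route(0b0000, 0b1011, dim=4))  # [0, 1, 3, 11]
```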

    Using the Last-mile Model as a Distributed Scheme for Available Bandwidth Prediction

    Several Network Coordinate Systems have been proposed to predict unknown network distances between a large number of Internet nodes using only a small number of measurements. These systems focus on predicting latency and are not adapted to the prediction of available bandwidth. Yet end-to-end available bandwidth is an important metric for performance optimisation in many high-throughput distributed applications, such as video streaming and file sharing networks. In this paper, we propose to perform available bandwidth prediction with the last-mile model, in which each node is characterised by its incoming and outgoing capacities. This model has been used in several theoretical works on distributed applications. We design decentralised heuristics to compute the capacities of each node so as to minimise the prediction error. We show that our algorithms achieve competitive accuracy even with asymmetric and erroneous end-to-end measurement datasets. A comparison with existing models (Vivaldi, Sequoia, PathGuru, DMF) is provided. Simulation results also show that our heuristics provide good-quality predictions even when using a very small number of measurements.
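    The core of the last-mile model is that the predicted bandwidth of a path i -> j is min(out[i], inc[j]). The sketch below fits these capacities with a simple update loop on the measured pairs; it is an illustrative assumption, not the paper's decentralised heuristics.

```python
# Last-mile idea: node i has outgoing capacity out[i] and incoming capacity
# inc[i]; the predicted bandwidth of path i -> j is min(out[i], inc[j]).
# The fitting loop is a simple illustrative heuristic.
import numpy as np

def fit_last_mile(measured, mask, iters=200, lr=0.05):
    n = measured.shape[0]
    out = np.full(n, measured[mask].mean())
    inc = np.full(n, measured[mask].mean())
    for _ in range(iters):
        for i, j in zip(*np.nonzero(mask)):
            err = min(out[i], inc[j]) - measured[i, j]
            # Only the binding capacity (the one achieving the min) moves.
            if out[i] <= inc[j]:
                out[i] -= lr * err
            else:
                inc[j] -= lr * err
    return out, inc

# Synthetic ground truth: capacities drawn at random, full measurements.
rng = np.random.default_rng(1)
true_out, true_inc = rng.uniform(20, 100, 6), rng.uniform(20, 100, 6)
measured = np.minimum.outer(true_out, true_inc)
mask = ~np.eye(6, dtype=bool)
out, inc = fit_last_mile(measured, mask)
pred = np.minimum.outer(out, inc)
print(f"mean abs error: {np.abs(pred - measured)[mask].mean():.2f}")
```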

    Point-to-point and congestion bandwidth estimation: experimental evaluation on PlanetLab

    In large-scale Internet platforms, measuring the available bandwidth between nodes of the platform is difficult and costly. However, having access to this information makes it possible to design clever algorithms that optimize resource usage for collective communications, such as broadcasting a message or organizing master/slave computations. In this paper, we analyze the feasibility of providing estimates, based on a limited number of measurements, both for point-to-point available bandwidth values and for the congestion that arises when several communications take place at the same time. We present a dataset obtained with both types of measurements performed on a set of nodes from the PlanetLab platform. We show that matrix factorization techniques are quite efficient at predicting point-to-point available bandwidth, but are not adapted to congestion analysis. However, a last-mile modeling of the platform makes it possible to perform congestion predictions with a reasonable level of accuracy, even with a small amount of information, despite the variability of the measured platform.
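    To make the matrix factorization approach concrete, here is a minimal SGD sketch: find low-rank factors U, V so that U[i] . V[j] matches the measured entries, then read predictions for unmeasured pairs off U @ V.T. Rank, learning rate, and regularization are illustrative assumptions, not the paper's settings.

```python
# Rank-r matrix factorization for completing a partially measured bandwidth
# matrix M: fit U, V on the observed entries by SGD, predict with U @ V.T.
import numpy as np

def factorize(M, mask, rank=3, iters=500, lr=0.01, reg=0.1):
    n = M.shape[0]
    rng = np.random.default_rng(0)
    U, V = rng.normal(0, 0.1, (n, rank)), rng.normal(0, 0.1, (n, rank))
    pairs = list(zip(*np.nonzero(mask)))
    for _ in range(iters):
        for i, j in pairs:
            err = U[i] @ V[j] - M[i, j]
            # Simultaneous regularized update of both factors.
            U[i], V[j] = (U[i] - lr * (err * V[j] + reg * U[i]),
                          V[j] - lr * (err * U[i] + reg * V[j]))
    return U @ V.T

# Toy low-rank "measurements" with roughly 30% of entries observed.
rng = np.random.default_rng(2)
A, B = rng.uniform(0, 2, (8, 3)), rng.uniform(0, 2, (8, 3))
M = A @ B.T
mask = rng.random((8, 8)) < 0.3
completed = factorize(M, mask)
print(f"mean abs error on unobserved: {np.abs(completed - M)[~mask].mean():.2f}")
```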

    ClouDiA: a deployment advisor for public clouds

    An increasing number of distributed data-driven applications are moving into shared public clouds. By sharing resources and operating at scale, public clouds promise higher utilization and lower costs than private clusters. To achieve high utilization, however, cloud providers inevitably allocate virtual machine instances non-contiguously, i.e., instances of a given application may end up in physically distant machines in the cloud. This allocation strategy can lead to large differences in average latency between instances. For a large class of applications, this difference can result in significant performance degradation, unless care is taken in how application components are mapped to instances. In this paper, we propose ClouDiA, a general deployment advisor that selects application node deployments minimizing either (i) the largest latency between application nodes, or (ii) the longest critical path among all application nodes. ClouDiA employs mixed-integer programming and constraint programming techniques to efficiently search the space of possible mappings of application nodes to instances. Through experiments with synthetic and real applications in Amazon EC2, we show that our techniques yield a 15% to 55% reduction in time-to-solution or service response time, without any need for modifying application code.
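    For intuition about objective (i), here is a brute-force sketch that picks the node-to-instance mapping minimizing the largest pairwise latency. ClouDiA solves this at scale with mixed-integer and constraint programming; exhaustive search as shown only works for toy sizes.

```python
# Brute-force version of ClouDiA's first objective: choose a one-to-one
# mapping of application nodes to candidate instances that minimizes the
# largest pairwise latency among the chosen instances.
from itertools import permutations

def best_deployment(latency, n_nodes):
    """latency[a][b]: measured latency between instances a and b."""
    instances = range(len(latency))
    best_map, best_cost = None, float("inf")
    for mapping in permutations(instances, n_nodes):
        cost = max(latency[a][b] for a in mapping for b in mapping if a != b)
        if cost < best_cost:
            best_map, best_cost = mapping, cost
    return best_map, best_cost

latency = [[0, 5, 9, 2],
           [5, 0, 4, 8],
           [9, 4, 0, 7],
           [2, 8, 7, 0]]  # toy symmetric latency matrix, ms
print(best_deployment(latency, n_nodes=3))  # ((0, 1, 3), 8)
```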

    Decentralized Prediction of End-to-End Network Performance Classes

    In large-scale networks, full-mesh active probing of end-to-end performance metrics is infeasible. Measuring a small set of pairs and predicting the others is more scalable. Under this framework, we formulate the prediction problem as matrix completion, whereby unknown entries of an incomplete matrix of pairwise measurements are to be predicted. This problem can be solved by matrix factorization because performance matrices have a low rank, thanks to the correlations among measurements. Moreover, its resolution can be fully decentralized without actually building matrices or relying on special landmarks or central servers. In this paper we demonstrate that this approach is also applicable when the performance values are not measured exactly, but are only known to belong to one among some predefined performance classes, such as "good" and "bad". Such a classification-based formulation not only fulfills the requirements of many Internet applications but also reduces the measurement cost and enables a unified treatment of various performance metrics. We propose a decentralized approach based on Stochastic Gradient Descent to solve this class-based matrix completion problem. Experiments on various datasets, involving two kinds of metrics, show the accuracy of the approach, its robustness against erroneous measurements, and its usability for peer selection.
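    A centralized illustration of the class-based formulation: encode the two classes as +1/-1, fit factors U, V by SGD on a logistic loss over U[i] . V[j], and predict a pair's class from the sign of the product. This sketches the idea only; the paper's contribution is the decentralized protocol, which is not reproduced here.

```python
# Class-based matrix completion: each measured pair carries only a label
# ("good" = +1, "bad" = -1). SGD on a logistic loss fits factors U, V;
# sign(U[i] . V[j]) predicts the class of unmeasured pairs.
import numpy as np

def sgd_class_completion(labels, mask, rank=2, iters=300, lr=0.05):
    n = labels.shape[0]
    rng = np.random.default_rng(0)
    U, V = rng.normal(0, 0.1, (n, rank)), rng.normal(0, 0.1, (n, rank))
    for _ in range(iters):
        for i, j in zip(*np.nonzero(mask)):
            y, s = labels[i, j], U[i] @ V[j]
            g = -y / (1.0 + np.exp(y * s))   # d/ds of log(1 + exp(-y*s))
            U[i], V[j] = U[i] - lr * g * V[j], V[j] - lr * g * U[i]
    return np.sign(U @ V.T)

# Toy data: classes derived from a low-rank score matrix, ~40% observed.
rng = np.random.default_rng(3)
A = rng.normal(size=(10, 2))
labels = np.sign(A @ A.T + 1e-9)
mask = rng.random((10, 10)) < 0.4
pred = sgd_class_completion(labels, mask)
print(f"accuracy on unobserved: {(pred[~mask] == labels[~mask]).mean():.2f}")
```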

Using Internet Embedding Tools for Heterogeneous Resource Aggregation

    In this article, we consider large-scale platforms such as BOINC, which consist of a set of heterogeneous resources using the Internet as their communication network. In this context, we study a resource aggregation problem in which the goal is to build groups of resources such that each group has a total capacity above a given threshold, and such that, within a group, no two resources are too far (in terms of latency) from each other. On such platforms, it is not realistic to assume that the latency is known for every pair of nodes. One must therefore resort to embedding tools such as Vivaldi or Sequoia. These tools make it possible to work in specific metric spaces in which the distance between two nodes can be obtained directly from a small amount of information available at each node. We study the Bin Covering under Distance Constraint (BCCD) problem and use dedicated algorithms in the metric spaces induced by several embedding tools, comparing these algorithms on real latency measurements. This comparison allows us to decide which (algorithm, embedding tool) pair is in practice the most effective for this resource aggregation problem.
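    To make the BCCD problem concrete, here is one simple greedy heuristic in an assumed 2-D embedding: grow groups whose total capacity reaches a target while every pair inside a group stays within a distance bound. This is an illustrative baseline, not one of the dedicated algorithms compared in the paper.

```python
# Greedy sketch of Bin Covering under Distance Constraint (BCCD): build
# groups with total capacity >= `target` such that every pair of members
# is within `max_dist` in the embedding space.
import math

def bccd_greedy(nodes, target, max_dist):
    """nodes: list of (capacity, (x, y)) pairs from a 2-D embedding."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    remaining = sorted(nodes, key=lambda n: -n[0])  # largest capacity first
    groups = []
    while remaining:
        seed = remaining.pop(0)
        group, cap = [seed], seed[0]
        for node in remaining[:]:
            if all(dist(node[1], m[1]) <= max_dist for m in group):
                group.append(node)
                remaining.remove(node)
                cap += node[0]
                if cap >= target:
                    break
        if cap >= target:
            groups.append(group)   # covered bin: keep the group
    return groups

nodes = [(5, (0, 0)), (4, (1, 0)), (3, (0, 1)), (6, (5, 5)), (4, (5, 6))]
print(len(bccd_greedy(nodes, target=8, max_dist=2.0)))  # 2 covered groups
```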