
    Measuring the dynamical state of the Internet: Large-scale network tomography via the ETOMIC infrastructure

    In this paper we show how to go beyond the study of the topological properties of the Internet by measuring its dynamical state using special active probing techniques and the methods of network tomography. We demonstrate this approach by measuring the key state parameters of Internet paths, the characteristics of queuing delay, in a part of the European Internet. The paper describes in detail the ETOMIC measurement platform used to conduct the experiments and the applied method of queuing delay tomography. The main results are maps showing the spatial structure of queuing delay characteristics across the resolved part of the European Internet. These maps reveal that the average queuing delay of network segments spans more than two orders of magnitude, and that the distribution of this quantity is very well fitted by a log-normal distribution.
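
    As an illustration of the kind of statistical analysis this abstract describes, the following is a minimal sketch of fitting a log-normal distribution to per-segment average queuing delays. The delay values and variable names are hypothetical placeholders, not data from the ETOMIC study.

```python
# Minimal sketch: fit a log-normal distribution to per-segment average
# queuing delays (values below are made up for illustration only).
import numpy as np
from scipy import stats

delays_ms = np.array([0.05, 0.3, 1.2, 4.7, 0.8, 12.0, 0.09, 2.5])  # hypothetical

# Fix the location parameter at 0 so the fit is a pure log-normal.
shape, loc, scale = stats.lognorm.fit(delays_ms, floc=0)
mu, sigma = np.log(scale), shape  # parameters of the underlying normal

# A Kolmogorov-Smirnov test gives a rough indication of goodness of fit.
ks_stat, p_value = stats.kstest(delays_ms, 'lognorm', args=(shape, loc, scale))
print(f"mu={mu:.2f}, sigma={sigma:.2f}, KS p-value={p_value:.3f}")
```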

    Implementation of multi-layer techniques using FEDERICA, PASITO and OneLab network infrastructures

    V. López, J. L. Añamuro, V. Moreno, J. E. L. De Vergara, J. Aracil, C. García, J. P. Fernández-Palacios, and M. Izal, "Implementation of multi-layer techniques using FEDERICA, PASITO and OneLab network infrastructures", in 17th IEEE International Conference on Networks, ICON 2011, pp. 89-94.
    This paper describes an implementation of multi-layer techniques using the network infrastructure provided by the FEDERICA, PASITO and OneLab projects. The FEDERICA project provides a network infrastructure, based on virtualization of both network and computing resources, which creates custom-made virtual environments. PASITO is a layer-2 network that connects universities and research centers in Spain. OneLab measurement tools allow carrying out high-accuracy active network measurements. Thanks to FEDERICA and PASITO, we have a multi-layer architecture where traffic is routed based on the measurements of OneLab equipment. To carry out this experiment, we have developed a Multi-layer Traffic Engineering manager and an implementation of the Path Computation Element Protocol to address the lack of a control plane in IP-oriented networks. This work shows the feasibility of multi-layer techniques as a convenient solution for network operators and validates our Path Computation Element implementation.
    This work has been partially funded by the Spanish Ministry of Education and Science under project ANFORA (TEC2009-13385), by the Spanish Ministry of Industry, Tourism and Trade under the PASITO project, and by the European Union under project OneLab2 (FP7-224263). The authors would like to thank Mauro Campanella (GARR, the project coordinator of FEDERICA) and Miguel Angel Sotos (RedIris) for their support in carrying out this work.
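
    The abstract does not give implementation details; purely to illustrate the kind of measurement-driven multi-layer decision it describes, here is a hypothetical sketch in which a traffic engineering manager requests a new path when measured available bandwidth drops below a threshold. All function names (get_available_bandwidth, request_path_computation) and parameters are assumptions, not the authors' implementation or the PCEP wire protocol.

```python
# Hypothetical sketch of a measurement-driven multi-layer TE decision loop.
# None of these functions correspond to the paper's actual code or to PCEP
# message formats; they only illustrate the overall control flow.
import time

THRESHOLD_MBPS = 100.0   # assumed minimum acceptable available bandwidth

def get_available_bandwidth(path_id: str) -> float:
    """Placeholder for an active measurement (e.g., via OneLab tooling)."""
    raise NotImplementedError

def request_path_computation(src: str, dst: str, min_bw_mbps: float) -> str:
    """Placeholder for asking a Path Computation Element for a new route."""
    raise NotImplementedError

def te_manager_loop(path_id: str, src: str, dst: str) -> None:
    while True:
        abw = get_available_bandwidth(path_id)
        if abw < THRESHOLD_MBPS:
            # Ask the PCE for an alternative (possibly lower-layer) path.
            path_id = request_path_computation(src, dst, THRESHOLD_MBPS)
        time.sleep(60)  # assumed measurement period
```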

    End-to-End Available Bandwidth Estimation Tools, An Experimental Comparison

    The available bandwidth of a network path impacts the performance of many applications, such as VoIP calls, video streaming and P2P content distribution systems. Several tools for bandwidth estimation have been proposed in recent years, but there is still uncertainty about their accuracy and efficiency under different network conditions. Although a number of experimental evaluations have been carried out to compare some of these methods, a comprehensive evaluation of all the existing active tools for available bandwidth estimation is still missing. This article introduces an empirical comparison of most of the active estimation tools currently implemented and freely available. Abing, ASSOLO, DietTopp, IGI, pathChirp, Pathload, PTR, Spruce and Yaz have been compared in a controlled environment and in the presence of different sources of cross-traffic. The performance of each tool has been investigated in terms of accuracy, time and traffic injected into the network to perform an estimation.
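
    As a rough illustration of the three comparison criteria named above (accuracy, measurement time, and injected traffic), a minimal evaluation harness might look like the sketch below. The run_tool function and the metric names are hypothetical placeholders, not the article's actual methodology.

```python
# Hypothetical harness for comparing available-bandwidth estimation tools
# on accuracy, measurement time, and probe traffic injected.
import time

def run_tool(tool_name: str):
    """Placeholder: run the tool and return (estimate_mbps, probe_bytes)."""
    raise NotImplementedError

def evaluate(tool_name: str, true_abw_mbps: float) -> dict:
    start = time.monotonic()
    estimate_mbps, probe_bytes = run_tool(tool_name)
    elapsed_s = time.monotonic() - start
    rel_error = abs(estimate_mbps - true_abw_mbps) / true_abw_mbps
    return {
        "tool": tool_name,
        "relative_error": rel_error,        # accuracy
        "time_s": elapsed_s,                # time to produce an estimate
        "probe_mbytes": probe_bytes / 1e6,  # intrusiveness
    }
```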

    Harnessing low-level tuning in modern architectures for high-performance network monitoring in physical and virtual platforms

    Unpublished doctoral thesis, defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Tecnología Electrónica y de las Comunicaciones. Date of defense: 02-07-201

    Proactive measurement techniques for network monitoring in heterogeneous environments

    Unpublished doctoral thesis. Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Tecnología Electrónica y de las Comunicaciones, 201

    Machine learning-based available bandwidth estimation

    Today’s Internet Protocol (IP), the Internet’s network-layer protocol, provides a best-effort service to all users without any guaranteed bandwidth. However, for applications with stringent bandwidth requirements, it is important to provide Quality of Service (QoS) guarantees in IP networks. The end-to-end available bandwidth of a network path, i.e., the residual capacity left over by other traffic, is determined by its tight link, that is, the link with the minimal available bandwidth. The tight link may differ from the bottleneck link, i.e., the link with the minimal capacity. Passive and active measurements are the two fundamental approaches used to estimate the available bandwidth in IP networks. Unlike passive measurement tools that are based on non-intrusive monitoring of traffic, active tools are based on the concept of self-induced congestion. The dispersion that arises when packets traverse a network carries information that can reveal relevant network characteristics. Using a fluid-flow probe gap model of a tight link with First-in, First-out (FIFO) multiplexing, accepted probing tools measure the packet dispersion to estimate the available bandwidth. Difficulties arise, however, if the dispersion is distorted compared to the model, e.g., by non-fluid traffic, multiple tight links, clustering of packets due to interrupt coalescing, and inaccurate time-stamping in general. It is recognized that modeling these effects is cumbersome if not intractable.
    To alleviate the variability of noise-afflicted packet gaps, state-of-the-art bandwidth estimation techniques use post-processing of the measurement results, e.g., averaging over several packet pairs or packet trains, linear regression, or a Kalman filter. These techniques, however, do not overcome the basic assumptions of the deterministic fluid model. While packet trains and statistical post-processing help to reduce the variability of available bandwidth estimates, they cannot resolve systematic deviations such as the underestimation bias in the case of random cross traffic and multiple tight links.
    The limitations of the state-of-the-art methods motivate us to explore the use of machine learning in end-to-end active and passive available bandwidth estimation. We investigate how to benefit from machine learning while using standard packet train probes for active available bandwidth estimation. To reduce the amount of required training data, we propose a regression-based scale-invariant method that is applicable without prior calibration to networks of arbitrary capacity. To reduce the amount of probe traffic further, we implement a neural network that acts as a recommender and can effectively select the probe rates that reduce the estimation error most quickly. We also evaluate our method against other regression-based supervised machine learning techniques.
    Furthermore, we propose two different multi-class classification-based methods for available bandwidth estimation. The first method employs reinforcement learning that learns through observations of the network path without a training phase. We formulate available bandwidth estimation as a single-state Markov Decision Process (MDP) multi-armed bandit problem and implement the ε-greedy algorithm to find the available bandwidth, where ε is a parameter that controls the exploration vs. exploitation trade-off.
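
    Purely as an illustrative sketch of the ε-greedy bandit formulation mentioned above (not the thesis's actual implementation), each arm could correspond to a candidate probe rate and the reward to how strongly a probe measurement at that rate indicates the available bandwidth; the probe_reward function and rate values are assumed placeholders.

```python
# Minimal epsilon-greedy sketch: arms are candidate probe rates (Mbps),
# and the reward is assumed to come from a probe measurement scoring that
# rate against the observed dispersion. Placeholder reward function only.
import random

def probe_reward(rate_mbps: float) -> float:
    """Placeholder: send a packet train at rate_mbps and score the outcome."""
    raise NotImplementedError

def epsilon_greedy(rates, rounds=100, epsilon=0.1):
    counts = [0] * len(rates)
    values = [0.0] * len(rates)          # running mean reward per arm
    for _ in range(rounds):
        if random.random() < epsilon:    # explore a random arm
            arm = random.randrange(len(rates))
        else:                            # exploit the best arm so far
            arm = max(range(len(rates)), key=lambda i: values[i])
        reward = probe_reward(rates[arm])
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    return rates[max(range(len(rates)), key=lambda i: values[i])]
```
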
    We propose another supervised learning-based classification method to obtain reliable available bandwidth estimates with a reduced amount of network overhead in networks where the available bandwidth changes very frequently. In such networks, the reinforcement learning-based method may take longer to converge, as it has no training phase and learns in an online manner. We also evaluate our method with different classification-based supervised machine learning techniques. Furthermore, considering the correlated changes in a network’s traffic over time, we apply filtering techniques to the estimation results in order to track changes in the available bandwidth.
    Active probing techniques provide flexibility in designing the input structure. In contrast, the vast majority of Internet traffic consists of Transmission Control Protocol (TCP) flows that exhibit a rather chaotic traffic pattern. We investigate how the theory of active probing can be used to extract relevant information from passive TCP measurements. We extend our method to perform the estimation using only sender-side measurements of TCP data and acknowledgment packets. However, non-fluid cross traffic, multiple tight links, and packet loss on the reverse path may alter the spacing of acknowledgments and hence increase the measurement noise. To obtain reliable available bandwidth estimates from noise-afflicted acknowledgment gaps, we propose a neural network-based method.
    We conduct a comprehensive measurement study in a controlled network testbed at Leibniz University Hannover. We evaluate our proposed methods under a variety of notoriously difficult network conditions that were not included in the training, such as randomly generated networks with multiple tight links, heavy cross-traffic burstiness, delays, and packet loss. Our testing results reveal that our proposed machine learning-based techniques are able to identify the available bandwidth with high precision from active and passive measurements. Furthermore, our reinforcement learning-based method, without any training phase, shows accurate and fast convergence to available bandwidth estimates.
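
    For context on the fluid-flow probe gap model referenced in the abstract, below is a minimal sketch of the standard single-tight-link FIFO relation between input and output packet gaps (the kind of relation such tools invert, or that can be used to generate synthetic training samples). The capacity and cross-traffic values are arbitrary examples, and this is not the thesis's code.

```python
# Fluid-flow probe gap model for a single FIFO tight link: if the probe
# input rate exceeds the available bandwidth (capacity minus cross traffic),
# the gap is expanded by the factor (r_in + cross) / capacity; otherwise it
# passes through unchanged. Values below are arbitrary examples.
def output_gap(g_in_s: float, packet_bits: float, capacity_bps: float,
               cross_rate_bps: float) -> float:
    r_in = packet_bits / g_in_s                   # probe input rate
    available = capacity_bps - cross_rate_bps     # available bandwidth
    if r_in <= available:
        return g_in_s                             # no self-induced congestion
    return g_in_s * (r_in + cross_rate_bps) / capacity_bps

# Example: 12000-bit packets sent every 100 us (120 Mbps) on a 100 Mbps link
# carrying 40 Mbps of cross traffic -> the output gap expands to 160 us.
g_out = output_gap(g_in_s=0.0001, packet_bits=12000.0,
                   capacity_bps=100e6, cross_rate_bps=40e6)
```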

    Trade-off among timeliness, messages and accuracy for large-scale information management

    The increasing amount of data and the number of nodes in large-scale environments require new techniques for information management. Examples of such environments are the decentralized infrastructures of Computational Grid and Computational Cloud applications. These large-scale applications need different kinds of aggregated information, such as resource monitoring, resource discovery or economic information. The challenge of providing timely and accurate information in large-scale environments arises from the distribution of the information. Reasons for delays in a distributed information system are long information transmission times due to distribution, churn and failures. A problem of large applications such as peer-to-peer (P2P) systems is the increasing retrieval time of the information due to the decentralization of the data and the proneness to failure. However, many applications need a timely provision of information. Another problem is the increasing network consumption when the application scales to millions of users and data items. Using approximation techniques allows reducing the retrieval time and the network consumption. However, the usage of approximation techniques decreases the accuracy of the results. Thus, the remaining problem is to offer a trade-off in order to solve the conflicting requirements of fast information retrieval, accurate results and low messaging cost.
    Our goal is to reach a self-adaptive decision mechanism that offers a trade-off among the retrieval time, the network consumption and the accuracy of the result. Self-adaptation enables distributed software to modify its behavior based on changes in the operating environment. In large-scale information systems that use hierarchical data aggregation, we apply self-adaptation to control the approximation used for the information retrieval, reducing the network consumption and the retrieval time. The hypothesis of the thesis is that approximation techniques can reduce the retrieval time and the network consumption while guaranteeing the accuracy of the results, taking into account user-defined priorities.
    First, the presented research addresses the problem of a trade-off among timely information retrieval, accurate results and low messaging cost by proposing a summarization algorithm for resource discovery in P2P content networks. After identifying how summarization can improve the discovery process, we propose an algorithm that uses a precision-recall metric to compare the accuracy and to offer a user-driven trade-off. Second, we propose an algorithm that applies self-adaptive decision making on each node. The decision concerns pruning the query and returning the result instead of continuing the query. The pruning reduces the retrieval time and the network consumption at the cost of lower accuracy compared to continuing the query. The algorithm uses an analytic hierarchy process to assess the user's priorities and to propose a trade-off that satisfies the accuracy requirements with a low message cost and a short delay.
    A quantitative analysis evaluates the presented algorithms with a simulator, which is fed with real data of a network topology and the nodes' attributes. The usage of a simulator instead of the prototype allows the evaluation at a large scale of several thousand nodes. The algorithm for content summarization is evaluated with half a million resources and with different query types.
    The self-adaptive algorithm is evaluated with a simulator of several thousand nodes that are created from real data. A qualitative analysis addresses the integration of the simulator's components into existing market frameworks for Computational Grid and Cloud applications.
    The proposed content summarization algorithm reduces the information retrieval time from a logarithmic increase to a constant factor. Furthermore, the message size is reduced significantly by applying the summarization technique. For the user, a precision-recall metric allows defining the relation between the retrieval time and the accuracy. The self-adaptive algorithm reduces the number of messages needed from an exponential increase to a constant factor. At the same time, the retrieval time is reduced to a constant factor under an increasing number of nodes. Finally, the algorithm delivers the data with the required accuracy, adjusting the depth of the query according to the network conditions.
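
    As a minimal illustration of the precision-recall metric mentioned above (not the thesis's actual code), a summarized resource-discovery result can be scored against the exact, non-summarized result set as follows; the resource identifiers shown are hypothetical.

```python
# Minimal sketch: precision and recall of a summarized resource-discovery
# result compared with the exact (non-summarized) result set.
def precision_recall(retrieved: set, relevant: set) -> tuple[float, float]:
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 1.0
    recall = hits / len(relevant) if relevant else 1.0
    return precision, recall

# Hypothetical example: summarization returned one false positive and
# missed one relevant resource.
retrieved = {"res-1", "res-2", "res-3"}
relevant = {"res-1", "res-2", "res-4"}
p, r = precision_recall(retrieved, relevant)   # p = 2/3, r = 2/3
```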