The Quest for Bandwidth Estimation Techniques for large-scale Distributed Systems
In recent years the research community has developed many techniques to estimate the end-to-end available bandwidth of an Internet path. This important metric has been proposed for use in several distributed systems and, more recently, has even been considered to improve the congestion control mechanism of TCP. Thus, it has been suggested that some existing estimation techniques could be used for this purpose. However, existing tools were not designed for large-scale deployments and were mostly validated in controlled settings, considering only one measurement running at a time. In this paper, we argue that current tools, while offering good estimates when used alone, might not work in large-scale systems where several estimations severely interfere with each other. We analyze the properties of the measurement paradigms employed today and discuss their functioning, study their overhead and analyze their interference. Our testbed results show that current techniques are insufficient as they are. Finally, we will discuss and propose some principles that should be taken into account for including available bandwidth measurements in large-scale distributed systems.
Overhead in Available Bandwidth Estimation Tools: Evaluation and Analysis
Current Available Bandwidth Estimation Tools (ABETs) perform an estimation by inserting probe packets into the network. These packets make ABETs intrusive: they consume part of the bandwidth being measured, a noise known as "Overhead Estimation Tools" (OET), which can negatively affect the measurements performed by the ABET. This paper presents a complete comparative analysis of the Available Bandwidth (av_bw) behavior of the most representative ABETs: Abing, Diettopp, Pathload, PathChirp, Traceband, IGI, PTR, Assolo and Wbest. The study, carried out with real Internet traffic, shows that the percentage of probe packets is a factor affecting two main aspects of the estimation. The first is accuracy: OET is directly proportional to the percentage of RE, which reaches up to 70% in the tool evaluated with 30% Cross-Traffic (CT). The second is the Estimation Time (ET), which is strongly influenced by the technique used to send probe packets: some tools that use slops spend up to 240 s to converge when there is 60% CT in the network, indicating that when this technique estimates av_bw on a highly congested channel, more OET is generated, resulting in inaccurate measurements.
Machine learning-based available bandwidth estimation
Today’s Internet Protocol (IP), the Internet’s network-layer protocol, provides a best-effort service to all users without any guaranteed bandwidth. However, for certain applications that have stringent network performance requirements in terms of bandwidth, it is particularly important to provide Quality of Service (QoS) guarantees in IP networks. The end-to-end available bandwidth of a network path, i.e., the residual capacity that is left over by other traffic, is determined by its tight link, that is, the link that has the minimal available bandwidth. The tight link may differ from the bottleneck link, i.e., the link with the minimal capacity.
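The distinction between the tight link and the bottleneck link can be illustrated with a minimal sketch. The `Link` class, the link names and the traffic figures are hypothetical values chosen for illustration; they do not come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    capacity_mbps: float        # raw capacity of the link
    cross_traffic_mbps: float   # traffic already carried by the link

    @property
    def available_mbps(self) -> float:
        # Residual capacity left over by other traffic.
        return max(self.capacity_mbps - self.cross_traffic_mbps, 0.0)


def tight_link(path):
    # Link with the minimal available bandwidth.
    return min(path, key=lambda link: link.available_mbps)


def bottleneck_link(path):
    # Link with the minimal capacity.
    return min(path, key=lambda link: link.capacity_mbps)


# Hypothetical three-hop path: B is fast but heavily loaded.
path = [
    Link("A", 100.0, 20.0),
    Link("B", 1000.0, 950.0),
    Link("C", 200.0, 50.0),
]

end_to_end_available = tight_link(path).available_mbps
```

In this made-up path the bottleneck link is A (lowest capacity), while the tight link is B, which therefore sets the end-to-end available bandwidth.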
Passive and active measurements are the two fundamental approaches used to estimate the available bandwidth in IP networks. Unlike passive measurement tools that are based on the non-intrusive monitoring of traffic, active tools are based on the concept of self-induced congestion. The dispersion, which arises when packets traverse a network, carries information that can reveal relevant network characteristics. Using a fluid-flow probe gap model of a tight link with First-in, First-out (FIFO) multiplexing, accepted probing tools measure the packet dispersion to estimate the available bandwidth. Difficulties arise, however, if the dispersion is distorted compared to the model, e.g., by non-fluid traffic, multiple tight links, clustering of packets due to interrupt coalescing and inaccurate time-stamping in general. It is recognized that modeling these effects is cumbersome if not intractable.
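As a rough illustration of the fluid-flow model of a single FIFO tight link, the sketch below uses the commonly cited rate response relation: the probe output rate equals the input rate up to the available bandwidth, and above it the probe shares the capacity with the cross traffic. The function names, the tolerance and the numbers are assumptions for illustration, not taken from the abstract.

```python
def fluid_output_rate(r_in, capacity, available):
    """Output rate of a probe stream through one FIFO link with fluid cross traffic."""
    cross = capacity - available
    if r_in <= available:
        return r_in  # no self-induced congestion: output matches input
    # Congested: probe and cross traffic share the capacity proportionally.
    return r_in * capacity / (r_in + cross)


def estimate_available(probe_rates, output_rates, tol=0.01):
    """Largest probe rate whose output rate still matches the input within tol."""
    ok = [r for r, o in zip(probe_rates, output_rates) if o >= (1.0 - tol) * r]
    return max(ok) if ok else None


# Hypothetical link: 100 Mbit/s capacity, 40 Mbit/s available.
rates = [10.0 * k for k in range(1, 11)]
outputs = [fluid_output_rate(r, 100.0, 40.0) for r in rates]
estimate = estimate_available(rates, outputs)
```

With noiseless fluid traffic the estimate recovers the available bandwidth exactly; the distortions listed in the abstract are what break this idealized relation in practice.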
To alleviate the variability of noise-afflicted packet gaps, the state-of-the-art bandwidth estimation techniques use post-processing of the measurement results, e.g., averaging over several packet pairs or packet trains, linear regression, or a Kalman filter. These techniques, however, do not overcome the basic assumptions of the deterministic fluid model. While packet trains and statistical post-processing help to reduce the variability of available bandwidth estimates, these cannot resolve systematic deviations such as the underestimation bias in case of random cross traffic and multiple tight links. The limitations of the state-of-the-art methods motivate us to explore the use of machine learning in end-to-end active and passive available bandwidth estimation.
We investigate how to benefit from machine learning while using standard packet train probes for active available bandwidth estimation. To reduce the amount of required training data, we propose a regression-based scale-invariant method that is applicable without prior calibration to networks of arbitrary capacity. To reduce the amount of probe traffic further, we implement a neural network that acts as a recommender and can effectively select the probe rates that reduce the estimation error most quickly. We also evaluate our method with other regression-based supervised machine learning techniques.
Furthermore, we propose two different multi-class classification-based methods for available bandwidth estimation. The first method employs reinforcement learning that learns through the network path’s observations without having a training phase. We formulate the available bandwidth estimation as a single-state Markov Decision Process (MDP) multi-armed bandit problem and implement the ε-greedy algorithm to find the available bandwidth, where ε is a parameter that controls the exploration vs. exploitation trade-off.
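A minimal sketch of an ε-greedy bandit over candidate probe rates may make the formulation more concrete. The abstract does not specify the reward, so the reward below (normalized probe rate if the probe passes uncongested, zero otherwise), the noiseless fluid link used as environment, and all names and numbers are assumptions for illustration rather than the authors' design.

```python
import random


def epsilon_greedy(arms, reward_fn, epsilon=0.1, steps=2000, seed=0):
    """Single-state bandit: each arm is a candidate probe rate."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    values = [0.0] * len(arms)  # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            i = rng.randrange(len(arms))                        # explore
        else:
            i = max(range(len(arms)), key=lambda k: values[k])  # exploit
        reward = reward_fn(arms[i])
        counts[i] += 1
        values[i] += (reward - values[i]) / counts[i]
    best = max(range(len(arms)), key=lambda k: values[k])
    return arms[best], values


# Hypothetical noiseless path: 100 Mbit/s capacity, 40 Mbit/s available.
CAPACITY = 100.0
AVAILABLE = 40.0


def probe_reward(rate):
    # Assumed reward: favor the highest rate that causes no congestion.
    cross = CAPACITY - AVAILABLE
    out = rate if rate <= AVAILABLE else rate * CAPACITY / (rate + cross)
    return rate / CAPACITY if out >= 0.99 * rate else 0.0


rates = [10.0 * k for k in range(1, 11)]
estimate, values = epsilon_greedy(rates, probe_reward)
```

A larger ε explores more candidate rates at the cost of more probes sent at rates that are not the current best guess; a smaller ε converges on the current best arm faster but may react more slowly when conditions change.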
We propose another supervised learning-based classification method to obtain reliable available bandwidth estimates with a reduced amount of network overhead in networks where available bandwidth changes very frequently. In such networks, a reinforcement learning-based method may take longer to converge as it has no training phase and learns in an online manner. We also evaluate our method with different classification-based supervised machine learning techniques. Furthermore, considering the correlated changes in a network’s traffic through time, we apply filtering techniques on the estimation results in order to track the available bandwidth changes.
Active probing techniques provide flexibility in designing the input structure. In contrast, the vast majority of Internet traffic is Transmission Control Protocol (TCP) flows that exhibit a rather chaotic traffic pattern. We investigate how the theory of active probing can be used to extract relevant information from passive TCP measurements. We extend our method to perform the estimation using only sender-side measurements of TCP data and acknowledgment packets. However, non-fluid cross traffic, multiple tight links, and packet loss in the reverse path may alter the spacing of acknowledgments and hence increase the measurement noise. To obtain reliable available bandwidth estimates from noise-afflicted acknowledgment gaps we propose a neural network-based method.
We conduct a comprehensive measurement study in a controlled network testbed at Leibniz University Hannover. We evaluate our proposed methods under a variety of notoriously difficult network conditions that have not been included in the training such as randomly generated networks with multiple tight links, heavy cross traffic burstiness, delays, and packet loss. Our testing results reveal that our proposed machine learning-based techniques are able to identify the available bandwidth with high precision from active and passive measurements. Furthermore, our reinforcement learning-based method without any training phase shows accurate and fast convergence to available bandwidth estimates.
Large scale probabilistic available bandwidth estimation
The common utilization-based definition of available bandwidth and many of
the existing tools to estimate it suffer from several important weaknesses: i)
most tools report a point estimate of average available bandwidth over a
measurement interval and do not provide a confidence interval; ii) the commonly
adopted models used to relate the available bandwidth metric to the measured
data are invalid in almost all practical scenarios; iii) existing tools do not
scale well and are not suited to the task of multi-path estimation in
large-scale networks; iv) almost all tools use ad-hoc techniques to address
measurement noise; and v) tools do not provide enough flexibility in terms of
accuracy, overhead, latency and reliability to adapt to the requirements of
various applications. In this paper we propose a new definition for available
bandwidth and a novel framework that addresses these issues. We define
probabilistic available bandwidth (PAB) as the largest input rate at which we
can send a traffic flow along a path while achieving, with specified
probability, an output rate that is almost as large as the input rate. PAB is
expressed directly in terms of the measurable output rate and includes
adjustable parameters that allow the user to adapt to different application
requirements. Our probabilistic framework to estimate network-wide
probabilistic available bandwidth is based on packet trains, Bayesian
inference, factor graphs and active sampling. We deploy our tool on the
PlanetLab network and our results show that we can obtain accurate estimates
with a much smaller measurement overhead compared to existing approaches.
Comment: Submitted to Computer Network
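The PAB definition can be read as a simple empirical rule, sketched below. This is only a frequency-count illustration of the definition, not the Bayesian factor-graph framework the abstract describes; the multiplicative `slack`, the threshold `prob`, the function name and the sample data are all assumptions.

```python
def empirical_pab(trials, slack=0.05, prob=0.9):
    """Largest input rate whose output rate is 'almost as large' often enough.

    trials maps an input rate to a list of measured output rates.
    A trial counts as a hit when output >= (1 - slack) * input.
    """
    feasible = []
    for r_in, outputs in trials.items():
        if not outputs:
            continue
        hits = sum(1 for out in outputs if out >= (1.0 - slack) * r_in)
        if hits / len(outputs) >= prob:
            feasible.append(r_in)
    return max(feasible) if feasible else None


# Hypothetical packet-train measurements (input rate -> output rates, Mbit/s).
trials = {
    10.0: [10.0, 10.0, 10.0, 10.0],
    20.0: [20.0, 19.5, 20.0, 19.8],
    30.0: [30.0, 25.0, 29.0, 24.0],
    40.0: [30.0, 31.0, 29.0, 30.0],
}

pab = empirical_pab(trials, slack=0.05, prob=0.75)
```

Tightening `slack` or raising `prob` yields a more conservative estimate, which is one way to picture the adjustable parameters mentioned in the abstract.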
Simulation technique for available bandwidth estimation
The paper proposes a method for measuring available bandwidth based on
probing the network with packets of various sizes (Variable Packet Size
method, VPS). The limits of applicability of the model, which depend on the
accuracy of packet delay measurements, have been determined, and a formula for
the upper limit of measurable bandwidth has been derived. A computer simulation
has been performed, and the relationship between the measurement error of
available bandwidth and the number of measurements has been found. Experimental
verification using the RIPE Test Box measuring system has shown that the
suggested method has advantages over existing measurement techniques. The
Pathload utility was chosen as the alternative measurement technique, and to
ensure reliable results, statistics were retrieved from the SNMP agent directly
on the router.
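The abstract does not state the underlying model, so the sketch below assumes the usual variable-packet-size relation, in which packet delay grows linearly with packet size and the slope is the inverse of the bandwidth. The function name and the synthetic delays are hypothetical.

```python
def fit_bandwidth(sizes_bytes, delays_s):
    """Least-squares fit of delay = intercept + size / bandwidth.

    Returns (bandwidth in bit/s, fixed delay in seconds).
    """
    n = len(sizes_bytes)
    mean_x = sum(sizes_bytes) / n
    mean_y = sum(delays_s) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes_bytes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes_bytes, delays_s))
    slope = sxy / sxx                    # seconds per byte
    intercept = mean_y - slope * mean_x  # size-independent delay
    return 8.0 / slope, intercept


# Synthetic delays for a 10 Mbit/s path with 1 ms of fixed delay.
sizes = [100, 500, 1000, 1500]
delays = [0.001 + s * 8 / 10e6 for s in sizes]
bandwidth_bps, fixed_delay_s = fit_bandwidth(sizes, delays)
```

Because the estimate is the inverse of a small slope, timing errors in the delay measurements are amplified, which is consistent with the abstract's point that the model's applicability is bounded by delay measurement accuracy.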
Multi-path Probabilistic Available Bandwidth Estimation through Bayesian Active Learning
Knowing the largest rate at which data can be sent on an end-to-end path such
that the egress rate is equal to the ingress rate with high probability can be
very practical when choosing transmission rates in video streaming or selecting
peers in peer-to-peer applications. We introduce probabilistic available
bandwidth, which is defined in terms of ingress rates and egress rates of
traffic on a path, rather than in terms of capacity and utilization of the
constituent links of the path like the standard available bandwidth metric. In
this paper, we describe a distributed algorithm, based on a probabilistic
graphical model and Bayesian active learning, for simultaneously estimating the
probabilistic available bandwidth of multiple paths through a network. Our
procedure exploits the fact that each packet train provides information not
only about the path it traverses, but also about any path that shares a link
with the monitored path. Simulations and PlanetLab experiments indicate that
this process can dramatically reduce the number of probes required to generate
accurate estimates.