Large scale probabilistic available bandwidth estimation
The common utilization-based definition of available bandwidth and many of
the existing tools to estimate it suffer from several important weaknesses: i)
most tools report a point estimate of average available bandwidth over a
measurement interval and do not provide a confidence interval; ii) the commonly
adopted models used to relate the available bandwidth metric to the measured
data are invalid in almost all practical scenarios; iii) existing tools do not
scale well and are not suited to the task of multi-path estimation in
large-scale networks; iv) almost all tools use ad-hoc techniques to address
measurement noise; and v) tools do not provide enough flexibility in terms of
accuracy, overhead, latency and reliability to adapt to the requirements of
various applications. In this paper we propose a new definition for available
bandwidth and a novel framework that addresses these issues. We define
probabilistic available bandwidth (PAB) as the largest input rate at which we
can send a traffic flow along a path while achieving, with specified
probability, an output rate that is almost as large as the input rate. PAB is
expressed directly in terms of the measurable output rate and includes
adjustable parameters that allow the user to adapt to different application
requirements. Our probabilistic framework to estimate network-wide
probabilistic available bandwidth is based on packet trains, Bayesian
inference, factor graphs and active sampling. We deploy our tool on the
PlanetLab network and our results show that we can obtain accurate estimates
with a much smaller measurement overhead compared to existing approaches.
Comment: Submitted to Computer Network
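The PAB definition above can be illustrated with a small sketch. This is not the paper's Bayesian framework: it simply takes repeated output-rate measurements per input rate and returns the largest input rate whose output rate is "almost as large" with at least the specified empirical frequency. The function name `empirical_pab` and the parameters `epsilon` (tolerated relative shortfall) and `prob` (required probability) are assumptions made for illustration.

```python
def empirical_pab(trials, epsilon=0.05, prob=0.9):
    """Return the largest input rate that meets the PAB condition empirically.

    trials  -- dict mapping an input rate to a list of measured output rates
    epsilon -- assumed tolerance: output counts as "almost as large" when
               output >= (1 - epsilon) * input
    prob    -- assumed required fraction of trials that meet the tolerance
    """
    best = None
    for rate in sorted(trials):
        outputs = trials[rate]
        if not outputs:
            continue  # no measurements at this rate, nothing to conclude
        ok = sum(1 for out in outputs if out >= (1.0 - epsilon) * rate)
        if ok / len(outputs) >= prob:
            best = rate  # rates are visited in increasing order
    return best


# Hypothetical measurements (Mb/s): input rate -> observed output rates.
trials = {
    10: [10, 10, 9.8, 10],
    20: [20, 19.5, 19.9, 20],
    30: [30, 25, 24, 29],
}
print(empirical_pab(trials, epsilon=0.05, prob=0.75))
```

Raising `prob` or lowering `epsilon` makes the estimate more conservative, which is the kind of application-specific adjustment the abstract refers to.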
Multi-path Probabilistic Available Bandwidth Estimation through Bayesian Active Learning
Knowing the largest rate at which data can be sent on an end-to-end path such
that the egress rate is equal to the ingress rate with high probability can be
very practical when choosing transmission rates in video streaming or selecting
peers in peer-to-peer applications. We introduce probabilistic available
bandwidth, which is defined in terms of ingress rates and egress rates of
traffic on a path, rather than in terms of capacity and utilization of the
constituent links of the path like the standard available bandwidth metric. In
this paper, we describe a distributed algorithm, based on a probabilistic
graphical model and Bayesian active learning, for simultaneously estimating the
probabilistic available bandwidth of multiple paths through a network. Our
procedure exploits the fact that each packet train provides information not
only about the path it traverses, but also about any path that shares a link
with the monitored path. Simulations and PlanetLab experiments indicate that
this process can dramatically reduce the number of probes required to generate
accurate estimates.
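The link-sharing observation above can be sketched in a few lines. This is only an illustration of which paths a single probe could inform, not the distributed algorithm or graphical model from the paper; the function name `paths_informed_by`, the path labels, and the link identifiers are all hypothetical.

```python
def paths_informed_by(probed, paths):
    """List the other paths that share at least one link with the probed path.

    probed -- key of the path a packet train was sent along
    paths  -- dict mapping a path name to the list of link ids it traverses
    """
    probed_links = set(paths[probed])
    return sorted(
        name
        for name, links in paths.items()
        if name != probed and probed_links & set(links)
    )


# Hypothetical topology: each path is a list of link identifiers.
paths = {
    "A->C": ["l1", "l2"],
    "B->C": ["l3", "l2"],
    "A->D": ["l1", "l4"],
    "B->E": ["l3", "l5"],
}
print(paths_informed_by("A->C", paths))  # ['A->D', 'B->C']
```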
A comparison of Poisson and uniform sampling for active measurements
Copyright © 2006 IEEE
Active probes of network performance represent samples of the underlying performance of a system. Some effort has gone into considering appropriate sampling patterns for such probes, i.e., there has been significant discussion of the importance of sampling using a Poisson process to avoid biases introduced by synchronization of system and measurements. However, there are unanswered questions about whether Poisson probing has costs in terms of sampling efficiency, and there is some misinformation about what types of inferences are possible with different probe patterns. This paper provides a quantitative comparison of two different sampling methods. This paper also shows that the irregularity in probing patterns is useful not just in avoiding synchronization, but also in determining frequency-domain properties of a system. This paper provides a firm basis for practitioners or researchers for making decisions about the type of sampling they should use in a particular application, along with methods for the analysis of their outputs.
Matthew Rougha
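The two probe patterns being compared can be sketched as follows. The sketch assumes that "uniform" means evenly spaced probes, which the abstract does not spell out, and that Poisson probing means independent exponentially distributed gaps with the same mean rate. The function names and parameters are illustrative.

```python
import random


def uniform_probe_times(rate, duration):
    """Evenly spaced probes: one every 1/rate seconds, starting at t = 0."""
    return [k / rate for k in range(int(duration * rate))]


def poisson_probe_times(rate, duration, seed=None):
    """Probe times of a Poisson process: i.i.d. exponential gaps, mean 1/rate."""
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(rate)
    while t < duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times


# Same mean rate (2 probes per second) over a 5 second window.
print(uniform_probe_times(2.0, 5.0))
print(poisson_probe_times(2.0, 5.0, seed=1))
```

The evenly spaced schedule always yields the same number of probes, while the Poisson schedule varies from run to run around the same average; that irregularity is what the abstract credits with avoiding synchronization.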
On the correlation of internet packet losses
Copyright © 2008 IEEE
In this paper we analyze more than 100 hours of packet traces from Planet-Lab measurements to study the correlation of Internet packet losses. We first apply statistical tests to identify the correlation timescale of the binary loss data. We find that in half of the traces packet losses are far from independent. More significantly, the correlation timescale of packet losses is correlated with the network load. We then examine the loss runs and the success runs of packets. The loss runs are typically short, regardless of the network load. We find that the success runs in the majority of our traces are also uncorrelated. Furthermore, their correlation timescale also does not depend on the network load. All of these results show that the impact of network load on the correlation of packet losses is nontrivial and that loss runs and success runs are better modeled as being independent than the binary losses themselves.
Hung X. Nguyen and Matthew Rougha
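The loss runs and success runs discussed above can be extracted from a binary loss trace as in the sketch below. It assumes a trace encoded as a sequence in which 1 marks a lost packet and 0 a delivered one; the encoding and the function name `run_lengths` are assumptions, and the statistical tests used in the paper are not reproduced here.

```python
from itertools import groupby


def run_lengths(losses):
    """Split a binary loss trace into loss-run and success-run lengths.

    losses -- iterable of 0/1 flags, where 1 (truthy) means the packet was lost
    Returns (loss_runs, success_runs), each a list of run lengths in trace order.
    """
    loss_runs, success_runs = [], []
    for lost, group in groupby(losses):
        length = sum(1 for _ in group)  # size of this run of equal flags
        (loss_runs if lost else success_runs).append(length)
    return loss_runs, success_runs


# Hypothetical trace: two delivered, three lost, one delivered, one lost, ...
trace = [0, 0, 1, 1, 1, 0, 1, 0, 0, 0]
print(run_lengths(trace))  # ([3, 1], [2, 1, 3])
```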
Learned Cardinalities: Estimating Correlated Joins with Deep Learning
We describe a new deep learning approach to cardinality estimation. MSCN is a
multi-set convolutional network, tailored to representing relational query
plans, that employs set semantics to capture query features and true
cardinalities. MSCN builds on sampling-based estimation, addressing its
weaknesses when no sampled tuples qualify a predicate, and in capturing
join-crossing correlations. Our evaluation of MSCN using a real-world dataset
shows that deep learning significantly enhances the quality of cardinality
estimation, which is the core problem in query optimization.
Comment: CIDR 2019. https://github.com/andreaskipf/learnedcardinalitie
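The weakness of sampling-based estimation mentioned above can be seen in a minimal sketch. This is the plain sampling baseline, not MSCN: it scales the fraction of sampled rows that satisfy a predicate up to the table size, so it returns zero whenever no sampled tuple qualifies. The function name, row layout, and table size are hypothetical.

```python
def sample_cardinality(sample, table_size, predicate):
    """Estimate how many rows of a table satisfy a predicate from a row sample."""
    if not sample:
        raise ValueError("sample must contain at least one row")
    hits = sum(1 for row in sample if predicate(row))
    return table_size * hits / len(sample)


# Hypothetical sample of 4 rows drawn from a table of 1000 rows.
sample = [{"year": 1999}, {"year": 2005}, {"year": 2010}, {"year": 2015}]

print(sample_cardinality(sample, 1000, lambda r: r["year"] > 2000))  # 750.0
# No sampled tuple qualifies, so the estimate collapses to zero even though
# the full table may contain matching rows.
print(sample_cardinality(sample, 1000, lambda r: r["year"] > 2020))  # 0.0
```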