Implementation and Deployment of a Distributed Network Topology Discovery Algorithm
In the past few years, the network measurement community has been interested
in the problem of internet topology discovery using a large number (hundreds or
thousands) of measurement monitors. The standard way to obtain information
about the internet topology is to use the traceroute tool from a small number
of monitors. Recent papers have made the case that increasing the number of
monitors will give a more accurate view of the topology. However, scaling up
the number of monitors is not a trivial process. Duplicated effort close to
the monitors wastes time by re-exploring well-known parts of the network, while
duplicated effort close to destinations can resemble a distributed
denial-of-service (DDoS) attack, as probes from many sources converge on a
given destination. In prior work, the authors of this report proposed
Doubletree, an algorithm for cooperative topology discovery that reduces the
load on the network, i.e., on router IP interfaces and end-hosts, while discovering almost as
many nodes and links as standard approaches based on traceroute. This report
presents our open-source and freely downloadable implementation of Doubletree
in a tool we call traceroute@home. We describe the deployment and validation of
traceroute@home on the PlanetLab testbed and we report on the lessons learned
from this experience. We also discuss how traceroute@home can be developed further
and outline ideas for future improvements.
Measured impact of crooked traceroute
Data collected using traceroute-based algorithms underpins research into the Internet’s router-level topology, though it is possible to infer false links from this data. One source of false inference is the combination of per-flow load-balancing, in which more than one path is active from a given source to destination, and classic traceroute, which varies the UDP destination port number or ICMP checksum of successive probe packets. This variation can cause per-flow load-balancers to treat successive packets as distinct flows and forward them along different paths. Consequently, successive probe packets can solicit responses from unconnected routers, leading to the inference of false links. This paper examines the inaccuracies induced by such false inferences, both on macroscopic and ISP topology mapping. We collected macroscopic topology data to 365k destinations, with techniques that both do and do not try to capture load-balancing phenomena. We then use alias resolution techniques to infer whether a measurement artifact of classic traceroute induces a false router-level link. This technique detected that 2.71% and 0.76% of the links in our UDP and ICMP graphs, respectively, were falsely inferred due to the presence of load-balancing. We conclude that most per-flow load-balancing does not induce false links when macroscopic topology is inferred using classic traceroute. The effect of false links on ISP topology mapping is possibly much worse, because the degrees of a tier-1 ISP’s routers derived from classic traceroute were inflated by a median factor of 2.9 as compared to those inferred with Paris traceroute.
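The false-link mechanism described in this abstract can be illustrated with a toy simulation. The sketch below is not the paper's measurement method: the four-hop diamond topology, the router names, and the rule that the load balancer picks a path from the parity of the flow identifier are all assumptions made for illustration. Only the idea comes from the abstract: varying the flow identifier per probe (classic traceroute) versus holding it constant (Paris traceroute).

```python
# Hypothetical topology with two equal-length paths behind load balancer "L":
#   path 0: L -> A -> C -> D
#   path 1: L -> B -> E -> D
PATHS = [["L", "A", "C", "D"], ["L", "B", "E", "D"]]
TRUE_LINKS = {(p[i], p[i + 1]) for p in PATHS for i in range(len(p) - 1)}


def responder(ttl, flow_id):
    """Router that answers a probe with the given TTL.

    Assumption: the per-flow load balancer maps a flow to a path using
    the parity of its flow identifier (a stand-in for a real hash).
    """
    path = PATHS[flow_id % len(PATHS)]
    return path[ttl - 1]


def inferred_links(flow_ids):
    """Send one probe per TTL and link consecutive responders."""
    hops = [responder(ttl, fid) for ttl, fid in enumerate(flow_ids, start=1)]
    return set(zip(hops, hops[1:]))


def false_links(flow_ids):
    """Inferred links that do not exist in the hypothetical topology."""
    return inferred_links(flow_ids) - TRUE_LINKS


# Classic traceroute: the UDP destination port changes with every probe.
classic_flow_ids = [33434, 33435, 33436, 33437]
# Paris traceroute: the flow identifier is held constant across probes.
paris_flow_ids = [33434, 33434, 33434, 33434]

print("classic:", false_links(classic_flow_ids))
print("paris:  ", false_links(paris_flow_ids))
```

In this toy setting the classic probes alternate between the two paths, so consecutive responders B and C are joined by a link that does not exist, while the constant-flow probes stay on one path and infer only real links.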
Inferring AS Relationships: Dead End or Lively Beginning?
Recent techniques for inferring business relationships between ASs have
yielded maps that have extremely few invalid BGP paths in the terminology of
Gao. However, some relationships inferred by these newer algorithms are
incorrect, leading to the deduction of unrealistic AS hierarchies. We
investigate this problem and discover what causes it. Having obtained such
insight, we generalize the problem of AS relationship inference as a
multiobjective optimization problem with node-degree-based corrections to the
original objective function of minimizing the number of invalid paths. We solve
the generalized version of the problem using the semidefinite programming
relaxation of the MAX2SAT problem. Keeping the number of invalid paths small,
we obtain a more veracious solution than that yielded by recent heuristics.
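For readers unfamiliar with the notion of an invalid path, the sketch below checks the valley-free property commonly associated with Gao's terminology: a valid path climbs zero or more customer-to-provider links, crosses at most one peer-to-peer link, and then descends zero or more provider-to-customer links. This only shows the quantity being minimized, not the paper's semidefinite programming method. Sibling links are omitted for brevity, and the AS names and relationship labels are hypothetical.

```python
def is_valley_free(path, rel):
    """Return True if the AS path is valley-free.

    rel[(a, b)] labels the directed link a -> b as one of:
      "c2p" (a is a customer of b), "p2p" (peers), "p2c" (a is a provider of b).
    Sibling links are not modeled in this sketch.
    """
    descending = False  # becomes True once the path has passed its top
    for a, b in zip(path, path[1:]):
        kind = rel[(a, b)]
        if kind == "c2p":
            if descending:
                return False  # climbing again after the top: a valley
        elif kind == "p2p":
            if descending:
                return False  # second peer link, or peer link on the way down
            descending = True
        elif kind == "p2c":
            descending = True
        else:
            raise ValueError(f"unknown relationship: {kind}")
    return True


# Hypothetical relationships among four ASs.
REL = {
    ("AS1", "AS2"): "c2p",
    ("AS2", "AS3"): "p2p",
    ("AS3", "AS4"): "p2c",
    ("AS4", "AS3"): "c2p",
    ("AS2", "AS1"): "p2c",
    ("AS3", "AS2"): "p2p",
}

print(is_valley_free(["AS1", "AS2", "AS3", "AS4"], REL))
print(is_valley_free(["AS2", "AS1", "AS2", "AS3"], REL))
```

Counting the paths for which such a check fails, under a candidate assignment of relationships, gives the number of invalid paths that the original objective function minimizes.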
Evolution of the Internet AS-Level Ecosystem
We present an analytically tractable model of Internet evolution at the level
of Autonomous Systems (ASs). We call our model the multiclass preferential
attachment (MPA) model. As its name suggests, it is based on preferential
attachment. All of its parameters are measurable from available Internet
topology data. Given the estimated values of these parameters, our analytic
results predict a definitive set of statistics characterizing the AS topology
structure. These statistics are not part of the model formulation. The MPA
model thus closes the "measure-model-validate-predict" loop, and provides
further evidence that preferential attachment is a driving force behind
Internet evolution.
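As background for the mechanism named in this abstract, the sketch below grows a graph by plain preferential attachment: each new node links to one existing node chosen with probability proportional to that node's degree. It is not the MPA model, whose classes and parameters are not given in the abstract. The function name, the single link per new node, and the seed are assumptions for illustration.

```python
import random
from collections import Counter


def preferential_attachment(n, seed=0):
    """Grow an n-node graph, attaching each new node to one existing node.

    The target is drawn with probability proportional to its current degree,
    implemented by sampling uniformly from a list in which every node appears
    once per unit of degree.
    """
    if n < 2:
        raise ValueError("need at least two nodes")
    rng = random.Random(seed)
    edges = [(0, 1)]
    endpoints = [0, 1]  # node i appears degree(i) times
    for new in range(2, n):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges


edges = preferential_attachment(1000)
degree = Counter(node for edge in edges for node in edge)
print("max degree:", max(degree.values()))
print("nodes of degree 1:", sum(1 for d in degree.values() if d == 1))
```

Runs of this kind typically produce a few high-degree nodes and many nodes of degree one, which is the qualitative signature that preferential attachment models aim to explain.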
Hypersparse Neural Network Analysis of Large-Scale Internet Traffic
The Internet is transforming our society, necessitating a quantitative
understanding of Internet traffic. Our team collects and curates the largest
publicly available Internet traffic data set, containing 50 billion packets.
A novel hypersparse neural network analysis of "video" streams of
this traffic, run on 10,000 processors in the MIT SuperCloud, reveals a new
phenomenon: the importance of otherwise unseen leaf nodes and isolated links in
Internet traffic. Our neural network approach further shows that a
two-parameter modified Zipf-Mandelbrot distribution accurately describes a wide
variety of source/destination statistics on moving sample windows ranging from
100,000 to 100,000,000 packets over collections that span years and continents.
The inferred model parameters distinguish different network streams and the
model leaf parameter strongly correlates with the fraction of the traffic in
different underlying network topologies. The hypersparse neural network
pipeline is highly adaptable and different network statistics and training
models can be incorporated with simple changes to the image filter functions.
Comment: 11 pages, 10 figures, 3 tables, 60 citations; to appear in IEEE High
Performance Extreme Computing (HPEC) 201
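For orientation, the sketch below evaluates the standard two-parameter Zipf-Mandelbrot form, p(d) proportional to 1/(d + delta)^alpha, normalized over d = 1..d_max. The abstract does not spell out the paper's modified form, so its exact definition and normalization may differ. The function name and the parameter values are assumptions for illustration.

```python
def zipf_mandelbrot(alpha, delta, d_max):
    """Probabilities p(1), ..., p(d_max) with p(d) proportional to (d + delta)**-alpha.

    alpha sets the slope of the tail; delta shifts the distribution and mainly
    affects small d. Requires delta > -1 so that every term is defined.
    """
    if delta <= -1:
        raise ValueError("delta must be greater than -1")
    weights = [(d + delta) ** -alpha for d in range(1, d_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameter values, chosen only to show the shape.
p = zipf_mandelbrot(alpha=1.5, delta=0.5, d_max=1000)
print("p(1) =", p[0])
print("p(1000) =", p[-1])
```

Lowering delta toward -1 concentrates more probability at d = 1.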