Temporal and Spatial Classification of Active IPv6 Addresses
There is a striking volume of World-Wide Web activity on IPv6 today. In early
2015, one large Content Distribution Network handles 50 billion IPv6 requests
per day from hundreds of millions of IPv6 client addresses; billions of unique
client addresses are observed per month. Address counts, however, obscure the
number of hosts with IPv6 connectivity to the global Internet. There are
numerous address assignment and subnetting options in use; privacy addresses
and dynamic subnet pools significantly inflate the number of active IPv6
addresses. As the IPv6 address space is vast, it is infeasible to
comprehensively probe every possible unicast IPv6 address. Thus, to survey the
characteristics of IPv6 addressing, we perform a year-long passive measurement
study, analyzing the IPv6 addresses gleaned from activity logs for all clients
accessing a global CDN.
The goal of our work is to develop flexible classification and measurement
methods for IPv6, motivated by the fact that its addresses are not merely more
numerous; they are different in kind. We introduce the notion of classifying
addresses and prefixes in two ways: (1) temporally, according to their
instances of activity to discern which addresses can be considered stable; (2)
spatially, according to the density or sparsity of aggregates in which active
addresses reside. We present measurement and classification results numerically
and visually that: provide details on IPv6 address use and structure in global
operation across the past year; establish the efficacy of our classification
methods; and demonstrate that such classification can clarify dimensions of the
Internet that otherwise appear quite blurred by current IPv6 addressing
practices.
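
The two classification axes described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' method: the function names, the `min_gap_days` threshold, the "stable"/"ephemeral" labels, and the choice of /64 aggregates are all assumptions made for the example.

```python
import ipaddress
from collections import defaultdict
from datetime import date


def classify_temporal(observations, min_gap_days=3):
    """Label an address 'stable' if it was seen on two days that are
    at least min_gap_days apart (hypothetical threshold), else 'ephemeral'.

    observations: iterable of (address_string, datetime.date) pairs.
    """
    days_seen = defaultdict(set)
    for addr, day in observations:
        days_seen[ipaddress.IPv6Address(addr)].add(day)

    labels = {}
    for addr, days in days_seen.items():
        span = (max(days) - min(days)).days
        labels[addr] = "stable" if span >= min_gap_days else "ephemeral"
    return labels


def prefix_density(addresses, prefix_len=64):
    """Count distinct active addresses inside each aggregate of prefix_len bits."""
    members = defaultdict(set)
    for addr in addresses:
        # strict=False masks the host bits, yielding the covering prefix.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        members[net].add(ipaddress.IPv6Address(addr))
    return {net: len(addrs) for net, addrs in members.items()}


observations = [
    ("2001:db8::1", date(2015, 3, 1)),
    ("2001:db8::1", date(2015, 3, 8)),
    ("2001:db8::2", date(2015, 3, 1)),
]
labels = classify_temporal(observations)
density = prefix_density(["2001:db8::1", "2001:db8::2", "2001:db8:0:1::1"])
```

A low count in a /64 suggests a sparse aggregate, while a high count of short-lived addresses in one /64 would be consistent with privacy addressing; either reading depends on the thresholds chosen.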
Entropy/IP: Uncovering Structure in IPv6 Addresses
In this paper, we introduce Entropy/IP: a system that discovers Internet
address structure based on analyses of a subset of IPv6 addresses known to be
active, i.e., training data, gleaned by readily available passive and active
means. The system is completely automated and employs a combination of
information-theoretic and machine learning techniques to probabilistically
model IPv6 addresses. We present results showing that our system is effective
in exposing structural characteristics of portions of the IPv6 Internet address
space populated by active client, service, and router addresses.
In addition to visualizing the address structure for exploration, the system
uses its models to generate candidate target addresses for scanning. For each
of 15 evaluated datasets, we train on 1K addresses and generate 1M candidates
for scanning. We achieve some success in 14 datasets, finding up to 40% of the
generated addresses to be active. In 11 of these datasets, we find active
network identifiers (e.g., /64 prefixes or "subnets") not seen in training.
Thus, we provide the first evidence that it is practical to discover subnets
and hosts by scanning probabilistically selected areas of the IPv6 address
space not known to contain active hosts a priori.
Comment: Paper presented at ACM IMC 2016 in Santa Monica, USA
(https://dl.acm.org/citation.cfm?id=2987445). Live demo site available at
http://www.entropy-ip.com
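
As an illustration of the kind of information-theoretic analysis the abstract mentions, the sketch below computes a normalized Shannon entropy for each of the 32 hex digits (nybbles) across a set of addresses. The abstract does not spell out the system's internals, so the per-nybble granularity, the function name `nybble_entropy`, and the normalization by 4 bits are assumptions for this example.

```python
import ipaddress
import math
from collections import Counter


def nybble_entropy(addresses):
    """Return 32 normalized Shannon entropies, one per hex digit position.

    0.0 means the digit is constant across the set; 1.0 means all 16
    hex values occur equally often.
    """
    rows = [ipaddress.IPv6Address(a).exploded.replace(":", "") for a in addresses]
    total = len(rows)
    entropies = []
    for pos in range(32):
        counts = Counter(row[pos] for row in rows)
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h / 4.0)  # a nybble carries at most 4 bits
    return entropies


# Sixteen addresses that differ only in the last hex digit.
sample = [f"2001:db8::{i:x}" for i in range(16)]
profile = nybble_entropy(sample)
```

Positions with near-zero entropy tend to mark fixed prefix bits, while high-entropy runs suggest randomized fields such as privacy interface identifiers.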
Beyond Counting: New Perspectives on the Active IPv4 Address Space
In this study, we report on techniques and analyses that enable us to capture
Internet-wide activity at individual IP address-level granularity by relying on
server logs of a large commercial content delivery network (CDN) that serves
close to 3 trillion HTTP requests on a daily basis. Across the whole of 2015,
these logs recorded client activity involving 1.2 billion unique IPv4
addresses, the highest ever measured, in agreement with recent estimates.
Monthly client IPv4 address counts showed constant growth for years prior, but
since 2014, the IPv4 count has stagnated while IPv6 counts have grown. Thus, it
seems we have entered an era marked by increased complexity, one in which the
sole enumeration of active IPv4 addresses is of little use to characterize
recent growth of the Internet as a whole.
With this observation in mind, we consider new points of view in the study of
global IPv4 address activity. First, our analysis shows significant churn in active
IPv4 addresses: the set of active IPv4 addresses varies by as much as 25% over
the course of a year. Second, by looking across the active addresses in a
prefix, we are able to identify and attribute activity patterns to network
restructurings, user behaviors, and, in particular, various address assignment
practices. Third, by combining spatio-temporal measures of address utilization
with measures of traffic volume, and sampling-based estimates of relative host
counts, we present novel perspectives on worldwide IPv4 address activity,
including empirical observation of under-utilization in some areas, and
complete utilization, or exhaustion, in others.
Comment: in Proceedings of ACM IMC 201
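
Churn between two snapshots of active addresses can be expressed with set differences. The sketch below is one possible definition, assumed for illustration; the abstract does not state how the 25% figure was computed, and the function name `churn` and its two ratios are hypothetical.

```python
def churn(before, after):
    """Compare two snapshots of active addresses.

    Returns the share of 'after' that is newly active and the share of
    'before' that has gone inactive.
    """
    before, after = set(before), set(after)
    appeared = after - before
    disappeared = before - after
    return {
        "appeared": len(appeared) / len(after) if after else 0.0,
        "disappeared": len(disappeared) / len(before) if before else 0.0,
    }


# Documentation-range addresses used as stand-ins for real log data.
january = {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"}
december = {"192.0.2.2", "192.0.2.3", "192.0.2.4", "192.0.2.5"}
result = churn(january, december)
```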
Discovering the IPv6 Network Periphery
We consider the problem of discovering the IPv6 network periphery, i.e., the
last hop router connecting endhosts in the IPv6 Internet. Finding the IPv6
periphery using active probing is challenging due to the IPv6 address space
size, wide variety of provider addressing and subnetting schemes, and
incomplete topology traces. As such, existing topology mapping systems can miss
the large footprint of the IPv6 periphery, disadvantaging applications ranging
from IPv6 census studies to geolocation and network resilience. We introduce
"edgy," an approach to explicitly discover the IPv6 network periphery, and use
it to find >~64M IPv6 periphery router addresses and >~87M links to these last
hops -- several orders of magnitude more than in currently available IPv6
topologies. Further, only 0.2% of edgy's discovered addresses are known to
existing IPv6 hitlists.
In the IP of the Beholder: Strategies for Active IPv6 Topology Discovery
Existing methods for active topology discovery within the IPv6 Internet
largely mirror those of IPv4. In light of the large and sparsely populated
address space, in conjunction with aggressive ICMPv6 rate limiting by routers,
this work develops a different approach to Internet-wide IPv6 topology mapping.
First, we adopt randomized probing techniques in order to distribute probing load,
minimize the effects of rate limiting, and probe at higher rates. Second, we
extensively analyze the efficiency and efficacy of various IPv6 hitlists and
target generation methods when used for topology discovery, and synthesize new
target lists based on our empirical results to provide both breadth (coverage
across networks) and depth (to find potential subnetting). Employing our
probing strategy, we discover more than 1.3M IPv6 router interface addresses
from a single vantage point. Finally, we share our prober implementation,
synthesized target lists, and discovered IPv6 topology results.
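
The idea of randomizing probes to spread load can be sketched as shuffling every (target, hop limit) pair, so that consecutive probes rarely hit the same router. This is a simplified illustration and not the authors' prober: the function name, the `max_hop_limit` default, and the in-memory shuffle are assumptions, and a production tool would likely avoid materializing the full list.

```python
import random


def randomized_probe_order(targets, max_hop_limit=16, seed=0):
    """Return all (target, hop_limit) pairs in a seeded pseudo-random order.

    Sequential traceroute sends hop limits 1..N to one target in a burst,
    concentrating load on the routers near the vantage point; shuffling
    the pairs interleaves targets and hop limits instead.
    """
    pairs = [(t, h) for t in targets for h in range(1, max_hop_limit + 1)]
    random.Random(seed).shuffle(pairs)  # seeded so runs are reproducible
    return pairs


targets = ["2001:db8::1", "2001:db8:1::1"]
order = randomized_probe_order(targets, max_hop_limit=4, seed=1)
```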