The Rise of Certificate Transparency and Its Implications on the Internet Ecosystem
In this paper, we analyze the evolution of Certificate Transparency (CT) over
time and explore the implications of exposing certificate DNS names from the
perspective of security and privacy. We find that certificates in CT logs have
seen exponential growth. Website support for CT has also constantly increased,
with now 33% of established connections supporting CT. With the increasing
deployment of CT, there are also concerns of information leakage due to all
certificates being visible in CT logs. To understand this threat, we introduce
a CT honeypot and show that data from CT logs is being used to identify targets
for scanning campaigns only minutes after certificate issuance. We present and
evaluate a methodology to learn and validate new subdomains from the vast
number of domains extracted from CT logged certificates.
Comment: To be published at ACM IMC 201
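As a rough illustration of learning subdomains from certificate DNS names, the sketch below counts the labels that appear in front of known base domains and proposes the most frequent ones as candidates for another domain. It is a minimal sketch under stated assumptions, not the paper's methodology: the function names `learn_labels` and `candidate_names`, the example domains, and the simple suffix matching (no public suffix list, no DNS validation of candidates) are all hypothetical.

```python
from collections import Counter


def learn_labels(dns_names, base_domains):
    """Count subdomain labels seen in front of the given base domains.

    Assumption: base_domains is supplied by the caller; a real pipeline
    would more likely derive it from a public suffix list.
    """
    counts = Counter()
    for name in dns_names:
        name = name.lower().rstrip(".")
        if name.startswith("*."):
            name = name[2:]  # a wildcard entry carries no concrete label
        for base in base_domains:
            suffix = "." + base
            if name.endswith(suffix):
                prefix = name[: -len(suffix)]
                if prefix:
                    counts.update(prefix.split("."))
                break
    return counts


def candidate_names(label_counts, domain, top_n):
    """Build candidate subdomains for `domain` from the top_n labels."""
    return [f"{label}.{domain}" for label, _ in label_counts.most_common(top_n)]


names = [
    "www.example.com",
    "mail.example.com",
    "*.example.org",
    "www.example.org",
    "dev.api.example.org",
]
labels = learn_labels(names, {"example.com", "example.org"})
print(candidate_names(labels, "example.net", 1))
```

Candidates produced this way are only guesses; confirming that a name actually exists would require a separate validation step such as a DNS lookup, which this sketch leaves out.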
A Retrospective Analysis of User Exposure to (Illicit) Cryptocurrency Mining on the Web
In late 2017, a sudden proliferation of malicious JavaScript was reported on
the Web: browser-based mining exploited the CPU time of website visitors to
mine the cryptocurrency Monero. Several studies measured the deployment of such
code and developed defenses. However, previous work did not establish how many
users were really exposed to the identified mining sites and whether there was
a real risk given common user browsing behavior. In this paper, we present a
retroactive analysis to close this research gap. We pool large-scale,
longitudinal data from several vantage points, gathered during the prime time
of illicit cryptomining, to measure the impact on web users. We leverage data
from passive traffic monitoring of university networks and a large European
ISP, with suspected mining sites identified in previous active scans. We
corroborate our results with data from a browser extension with a large user
base that tracks site visits. We also monitor open HTTP proxies and the Tor
network for malicious injection of code. We find that the risk for most Web
users was always very low, much lower than what deployment scans suggested. Any
exposure period was also very brief. However, we also identify a previously
unknown and exploited attack vector on mobile devices.
Clusters in the Expanse: Understanding and Unbiasing IPv6 Hitlists
Network measurements are an important tool in understanding the Internet. Due
to the expanse of the IPv6 address space, exhaustive scans as in IPv4 are not
possible for IPv6. In recent years, several studies have proposed the use of
target lists of IPv6 addresses, called IPv6 hitlists.
In this paper, we show that addresses in IPv6 hitlists are heavily clustered.
We present novel techniques that allow IPv6 hitlists to be pushed from quantity
to quality. We perform a longitudinal active measurement study over 6 months,
targeting more than 50 M addresses. We develop a rigorous method to detect
aliased prefixes, which identifies 1.5 % of our prefixes as aliased, pertaining
to about half of our target addresses. Using entropy clustering, we group the
entire hitlist into just 6 distinct addressing schemes. Furthermore, we perform
client measurements by leveraging crowdsourcing.
To encourage reproducibility in network measurement research and to serve as
a starting point for future IPv6 studies, we publish source code, analysis
tools, and data.
Comment: See https://ipv6hitlist.github.io for daily IPv6 hitlists, historical
data, and additional analyse
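To make the idea of entropy over IPv6 addresses more concrete, the sketch below computes a normalized Shannon entropy for each of the 32 hex nybble positions across a set of addresses. Positions that never vary score 0 and positions that look uniformly random score close to 1, so the resulting 32-value profile is one possible input for grouping networks by addressing scheme. This is a sketch under stated assumptions: the names `to_nybbles` and `nybble_entropy` are hypothetical, the addresses are from the documentation prefix, and the abstract does not specify how the paper computes or clusters its entropy values.

```python
import ipaddress
import math
from collections import Counter


def to_nybbles(addr):
    """Return the 32 hex nybbles of an IPv6 address in fully expanded form."""
    return ipaddress.IPv6Address(addr).exploded.replace(":", "")


def nybble_entropy(addrs):
    """Normalized Shannon entropy (0..1) for each of the 32 nybble positions."""
    rows = [to_nybbles(a) for a in addrs]
    total = len(rows)
    entropies = []
    for pos in range(32):
        counts = Counter(row[pos] for row in rows)
        h = sum(-(c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h / 4.0)  # a nybble carries at most 4 bits
    return entropies


sample = ["2001:db8::1", "2001:db8::2", "2001:db8::3", "2001:db8::4"]
profile = nybble_entropy(sample)
print(profile[-1])
```

In this sample only the last nybble varies, so every other position has zero entropy; a counter-style addressing scheme would typically produce a profile of this shape.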
A Long Way to the Top: Significance, Structure, and Stability of Internet Top Lists
A broad range of research areas including Internet measurement, privacy, and
network security rely on lists of target domains to be analysed; researchers
make use of target lists for reasons of necessity or efficiency. The popular
Alexa list of one million domains is a widely used example. Despite their
prevalence in research papers, the soundness of top lists has seldom been
questioned by the community: little is known about the lists' creation,
representativity, potential biases, stability, or overlap between lists.
In this study we survey the extent, nature, and evolution of top lists used
by research communities. We assess the structure and stability of these lists,
and show that rank manipulation is possible for some lists. We also reproduce
the results of several scientific studies to assess the impact of using a top
list at all, which list specifically, and the date of list creation. We find
that (i) top lists generally overestimate results compared to the general
population by a significant margin, often even an order of magnitude, and (ii)
some top lists have surprising change characteristics, causing high day-to-day
fluctuation and leading to result instability. We conclude our paper with
specific recommendations on the use of top lists, and how to interpret results
based on top lists with caution.
Comment: To be published at ACM IMC 2018. Web site with live data under:
https://toplists.github.i
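Day-to-day fluctuation of a top list can be quantified in several ways. The sketch below shows two simple ones for a pair of daily snapshots: the share of today's entries that were absent yesterday, and the Jaccard similarity of the two sets. It is an illustration under stated assumptions: the function names `daily_churn` and `jaccard` and the example domains are hypothetical, and the abstract does not say which stability metrics the study uses.

```python
def daily_churn(yesterday, today):
    """Fraction of today's entries that did not appear in yesterday's list."""
    previous = set(yesterday)
    new_entries = [d for d in today if d not in previous]
    return len(new_entries) / len(today) if today else 0.0


def jaccard(a, b):
    """Jaccard similarity of two lists, ignoring rank order."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


day1 = ["a.example", "b.example", "c.example", "d.example"]
day2 = ["a.example", "c.example", "e.example", "f.example"]
print(daily_churn(day1, day2), jaccard(day1, day2))
```

Both measures ignore rank changes within the list; a rank-aware comparison would need an additional metric.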