2,404 research outputs found
Hiding in Plain Sight: A Longitudinal Study of Combosquatting Abuse
Domain squatting is a common adversarial practice where attackers register
domain names that are purposefully similar to popular domains. In this work, we
study a specific type of domain squatting called "combosquatting," in which
attackers register domains that combine a popular trademark with one or more
phrases (e.g., betterfacebook[.]com, youtube-live[.]com). We perform the first
large-scale, empirical study of combosquatting by analyzing more than 468
billion DNS records---collected from passive and active DNS data sources over
almost six years. We find that almost 60% of abusive combosquatting domains
live for more than 1,000 days, and even worse, we observe increased activity
associated with combosquatting year over year. Moreover, we show that
combosquatting is used to perform a spectrum of different types of abuse
including phishing, social engineering, affiliate abuse, trademark abuse, and
even advanced persistent threats. Our results suggest that combosquatting is a
real problem that requires increased scrutiny by the security community.Comment: ACM CCS 1
The RIPE NCC internet measurement data repository
This paper describes datasets that will shortly be made available to the research community through an Internet measurement data repository operated by the RIPE NCC. The datasets include measurements collected by RIPE NCC projects, packet trace sets recovered from the defunct NLANR website and datasets collected and currently hosted by other research institutions. This work aims to raise awareness of these datasets amongst researchers and to promote discussion about possible changes to the data collection processes to ensure that the measurements are relevant and useful to the community
Entropy/IP: Uncovering Structure in IPv6 Addresses
In this paper, we introduce Entropy/IP: a system that discovers Internet
address structure based on analyses of a subset of IPv6 addresses known to be
active, i.e., training data, gleaned by readily available passive and active
means. The system is completely automated and employs a combination of
information-theoretic and machine learning techniques to probabilistically
model IPv6 addresses. We present results showing that our system is effective
in exposing structural characteristics of portions of the IPv6 Internet address
space populated by active client, service, and router addresses.
In addition to visualizing the address structure for exploration, the system
uses its models to generate candidate target addresses for scanning. For each
of 15 evaluated datasets, we train on 1K addresses and generate 1M candidates
for scanning. We achieve some success in 14 datasets, finding up to 40% of the
generated addresses to be active. In 11 of these datasets, we find active
network identifiers (e.g., /64 prefixes or `subnets') not seen in training.
Thus, we provide the first evidence that it is practical to discover subnets
and hosts by scanning probabilistically selected areas of the IPv6 address
space not known to contain active hosts a priori.Comment: Paper presented at the ACM IMC 2016 in Santa Monica, USA
(https://dl.acm.org/citation.cfm?id=2987445). Live Demo site available at
http://www.entropy-ip.com
HLOC: Hints-Based Geolocation Leveraging Multiple Measurement Frameworks
Geographically locating an IP address is of interest for many purposes. There
are two major ways to obtain the location of an IP address: querying commercial
databases or conducting latency measurements. For structural Internet nodes,
such as routers, commercial databases are limited by low accuracy, while
current measurement-based approaches overwhelm users with setup overhead and
scalability issues. In this work we present our system HLOC, aiming to combine
the ease of database use with the accuracy of latency measurements. We evaluate
HLOC on a comprehensive router data set of 1.4M IPv4 and 183k IPv6 routers.
HLOC first extracts location hints from rDNS names, and then conducts
multi-tier latency measurements. Configuration complexity is minimized by using
publicly available large-scale measurement frameworks such as RIPE Atlas. Using
this measurement, we can confirm or disprove the location hints found in domain
names. We publicly release HLOC's ready-to-use source code, enabling
researchers to easily increase geolocation accuracy with minimum overhead.Comment: As published in TMA'17 conference:
http://tma.ifip.org/main-conference
Verifying and Monitoring IoTs Network Behavior using MUD Profiles
IoT devices are increasingly being implicated in cyber-attacks, raising
community concern about the risks they pose to critical infrastructure,
corporations, and citizens. In order to reduce this risk, the IETF is pushing
IoT vendors to develop formal specifications of the intended purpose of their
IoT devices, in the form of a Manufacturer Usage Description (MUD), so that
their network behavior in any operating environment can be locked down and
verified rigorously. This paper aims to assist IoT manufacturers in developing
and verifying MUD profiles, while also helping adopters of these devices to
ensure they are compatible with their organizational policies and track devices
network behavior based on their MUD profile. Our first contribution is to
develop a tool that takes the traffic trace of an arbitrary IoT device as input
and automatically generates the MUD profile for it. We contribute our tool as
open source, apply it to 28 consumer IoT devices, and highlight insights and
challenges encountered in the process. Our second contribution is to apply a
formal semantic framework that not only validates a given MUD profile for
consistency, but also checks its compatibility with a given organizational
policy. We apply our framework to representative organizations and selected
devices, to demonstrate how MUD can reduce the effort needed for IoT acceptance
testing. Finally, we show how operators can dynamically identify IoT devices
using known MUD profiles and monitor their behavioral changes on their network.Comment: 17 pages, 17 figures. arXiv admin note: text overlap with
arXiv:1804.0435
DNS to the rescue: Discerning Content and Services in a Tangled Web
A careful perusal of the Internet evolution reveals two major trends - explosion of cloud-based services and video stream- ing applications. In both of the above cases, the owner (e.g., CNN, YouTube, or Zynga) of the content and the organiza- tion serving it (e.g., Akamai, Limelight, or Amazon EC2) are decoupled, thus making it harder to understand the asso- ciation between the content, owner, and the host where the content resides. This has created a tangled world wide web that is very hard to unwind, impairing ISPs' and network ad- ministrators' capabilities to control the traffic flowing on the network. In this paper, we present DN-Hunter, a system that lever- ages the information provided by DNS traffic to discern the tangle. Parsing through DNS queries, DN-Hunter tags traffic flows with the associated domain name. This association has several applications and reveals a large amount of useful in- formation: (i) Provides a fine-grained traffic visibility even when the traffic is encrypted (i.e., TLS/SSL flows), thus en- abling more effective policy controls, (ii) Identifies flows even before the flows begin, thus providing superior net- work management capabilities to administrators, (iii) Un- derstand and track (over time) different CDNs and cloud providers that host content for a particular resource, (iv) Discern all the services/content hosted by a given CDN or cloud provider in a particular geography and time, and (v) Provides insights into all applications/services running on any given layer-4 port number. We conduct extensive experimental analysis and show that the results from real traffic traces, ranging from FTTH to 4G ISPs, that support our hypothesis. Simply put, the informa- tion provided by DNS traffic is one of the key components required to unveil the tangled web, and bring the capabilities of controlling the traffic back to the network carrier
- ā¦