Self-Learning Classifier for Internet traffic
Network visibility is a critical part of traffic engineering, network management, and security. Recently, unsupervised algorithms have been envisioned as a viable alternative for automatically identifying classes of traffic. However, the accuracy achieved so far is not sufficient for traffic classification in practical scenarios. In this paper, we propose SeLeCT, a Self-Learning Classifier for Internet traffic. It uses unsupervised algorithms along with an adaptive learning approach to let classes of traffic emerge automatically, so that they can be identified and easily labeled. SeLeCT groups flows into pure (or homogeneous) clusters by alternating simple clustering and filtering phases that remove outliers. SeLeCT uses an adaptive learning approach to boost its ability to spot new protocols and applications. Finally, SeLeCT also simplifies label assignment (which still requires some manual intervention) so that proper class labels can be easily discovered. We evaluate the performance of SeLeCT using traffic traces collected in different years from various ISPs located on 3 different continents. Our experiments show that SeLeCT achieves overall accuracy close to 98%. Unlike state-of-the-art classifiers, the biggest advantage of SeLeCT is its ability to help discover new protocols and applications in an almost automated fashion.
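At its core, the abstract describes alternating a simple clustering phase with a filtering phase that discards flows lying far from any cluster center. The following is a minimal sketch of that alternating idea only, not the authors' implementation; the clustering algorithm (k-means), the feature dimensionality, and the keep_fraction threshold are illustrative assumptions.

# Sketch of alternating clustering and outlier-filtering phases.
# Not SeLeCT itself: algorithm choice, features, and thresholds are assumptions.
import numpy as np
from sklearn.cluster import KMeans

def cluster_and_filter(flows, n_clusters=10, keep_fraction=0.9, rounds=3):
    """flows: (n_samples, n_features) array of per-flow statistics."""
    kept = flows
    for _ in range(rounds):
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
        labels = km.fit_predict(kept)
        # Distance of each flow to its assigned centroid.
        dist = np.linalg.norm(kept - km.cluster_centers_[labels], axis=1)
        # Filtering phase: keep only the flows closest to a centroid.
        kept = kept[dist <= np.quantile(dist, keep_fraction)]
    return km, kept

# Random data stands in for per-flow features (packet sizes, inter-arrival times, ...).
rng = np.random.default_rng(0)
model, core_flows = cluster_and_filter(rng.normal(size=(5000, 12)))

The resulting pure clusters are what a human analyst (or a small set of labeled flows) can then map to protocol or application labels.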
Entropy/IP: Uncovering Structure in IPv6 Addresses
In this paper, we introduce Entropy/IP: a system that discovers Internet
address structure based on analyses of a subset of IPv6 addresses known to be
active, i.e., training data, gleaned by readily available passive and active
means. The system is completely automated and employs a combination of
information-theoretic and machine learning techniques to probabilistically
model IPv6 addresses. We present results showing that our system is effective
in exposing structural characteristics of portions of the IPv6 Internet address
space populated by active client, service, and router addresses.
In addition to visualizing the address structure for exploration, the system
uses its models to generate candidate target addresses for scanning. For each
of 15 evaluated datasets, we train on 1K addresses and generate 1M candidates
for scanning. We achieve some success in 14 datasets, finding up to 40% of the
generated addresses to be active. In 11 of these datasets, we find active
network identifiers (e.g., /64 prefixes or `subnets') not seen in training.
Thus, we provide the first evidence that it is practical to discover subnets
and hosts by scanning probabilistically selected areas of the IPv6 address
space not known to contain active hosts a priori.
Comment: Paper presented at the ACM IMC 2016 in Santa Monica, USA
(https://dl.acm.org/citation.cfm?id=2987445). A live demo site is available at
http://www.entropy-ip.com
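The entry point of such an approach is a per-nibble entropy profile over the training addresses, which reveals which hex positions are constant, structured, or effectively random. Below is a minimal sketch of that first step only; the segmentation and probabilistic model that Entropy/IP then builds, as well as candidate generation, are omitted, and the plain list of address strings is an assumed input format.

# Shannon entropy of each hex nibble position across a set of IPv6 addresses.
import ipaddress
import math
from collections import Counter

def nibble_entropy(addresses):
    # Expand each address to its full 32-nibble hexadecimal form.
    nibbles = [ipaddress.IPv6Address(a).exploded.replace(":", "") for a in addresses]
    entropies = []
    for pos in range(32):
        counts = Counter(n[pos] for n in nibbles)
        total = len(nibbles)
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)  # 0 bits = constant nibble, 4 bits = uniformly random
    return entropies

# Toy example; the paper trains on roughly 1K known-active addresses per dataset.
print(nibble_entropy(["2001:db8::1", "2001:db8::2", "2001:db8:0:1::1"]))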
How Do Tor Users Interact With Onion Services?
Onion services are anonymous network services that are exposed over the Tor
network. In contrast to conventional Internet services, onion services are
private, generally not indexed by search engines, and use self-certifying
domain names that are long and difficult for humans to read. In this paper, we
study how people perceive, understand, and use onion services based on data
from 17 semi-structured interviews and an online survey of 517 users. We find
that users have an incomplete mental model of onion services, use these
services for anonymity and have varying trust in onion services in general.
Users also have difficulty discovering and tracking onion sites and
authenticating them. Finally, users want technical improvements to onion
services and better information on how to use them. Our findings suggest
various improvements for the security and usability of Tor onion services,
including ways to automatically detect phishing of onion services, clearer
security indicators, and ways to manage onion domain names that are difficult
to remember.
Comment: Appeared in the USENIX Security Symposium 2018
The Making of Cloud Applications: An Empirical Study on Software Development for the Cloud
Cloud computing is gaining more and more traction as a deployment and
provisioning model for software. While a large body of research already covers
how to optimally operate a cloud system, we still lack insights into how
professional software engineers actually use clouds, and how the cloud impacts
development practices. This paper reports on the first systematic study on how
software developers build applications in the cloud. We conducted a
mixed-method study, consisting of qualitative interviews of 25 professional
developers and a quantitative survey with 294 responses. Our results show that
adopting the cloud has a profound impact throughout the software development
process, as well as on how developers utilize tools and data in their daily
work. Among other things, we found that (1) developers need better means to
anticipate runtime problems and rigorously define metrics for improved fault
localization and (2) although the cloud offers an abundance of operational data,
developers still often rely on their experience and intuition rather
than utilizing metrics. From our findings, we extracted a set of guidelines for
cloud development and identified challenges for researchers and tool vendors.
The Leap Second Behaviour of NTP Servers
The NTP network is an important part of the
Internet’s infrastructure, and one of the most challenging times
for the NTP network is around leap seconds. In this paper we look
at the behaviour of public servers in the NTP network in 2005 and
over the period from 2008 to the present, focusing on leap seconds.
We review the evolution of the NTP reference implementation
with respect to leap seconds and show how the behaviour of the
network has changed since 2005. Our results show that although
the network's performance has certain problems, these seem to
be diminishing over time.
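For context, an NTP server announces an upcoming leap second through the two-bit Leap Indicator (LI) field at the start of every packet. The sketch below queries a single public server and decodes that field; the server name and timeout are illustrative choices, and the paper's measurement methodology is of course far broader than one query.

# Query one NTP server and decode the Leap Indicator field.
import socket

def query_leap_indicator(server="pool.ntp.org", port=123, timeout=2.0):
    # Standard 48-byte NTPv3 client request: LI=0, VN=3, Mode=3 -> first byte 0x1b.
    request = b"\x1b" + 47 * b"\x00"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(request, (server, port))
        data, _ = s.recvfrom(48)
    li = (data[0] >> 6) & 0x3  # top two bits of the first byte
    return {0: "no warning",
            1: "last minute of the day has 61 seconds",
            2: "last minute of the day has 59 seconds",
            3: "clock unsynchronized"}[li]

print(query_leap_indicator())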
Discovering Network Control Vulnerabilities and Policies in Evolving Networks
The range and number of new applications and services are growing at an unprecedented rate. Computer networks need to be able to provide connectivity for these services and meet their constantly changing demands. This requires not only support for new network protocols and security requirements, but often architectural redesigns for long-term improvements to efficiency, speed, throughput, cost, and security. Networks are now facing a drastic increase in size and are required to carry a constantly growing amount of heterogeneous traffic. Unfortunately, such dynamism greatly complicates the security not only of the end nodes in the network, but also of the nodes of the network itself. To make matters worse, just as applications are being developed at faster and faster rates, attacks are becoming more pervasive and complex. Networks need to be able to understand the impact of these attacks and protect against them.
Network control devices, such as routers, firewalls, censorship devices, and base stations, are elements of the network that make decisions on how traffic is handled. Although network control devices are expected to act according to specifications, there can be various reasons why they do not in practice. Protocols could be flawed, ambiguous, or incomplete; developers could introduce unintended bugs; or attackers could find vulnerabilities in the devices and exploit them. Malfunction could intentionally or unintentionally threaten the confidentiality, integrity, and availability of end nodes and the data that passes through the network. It can also impact the availability and performance of the control devices themselves and the security policies of the network. The fast-paced evolution and scalability of current and future networks create a dynamic environment for which it is difficult to develop automated tools for testing new protocols and components. At the same time, they make such tools vital for discovering implementation flaws and protocol vulnerabilities as networks become larger and more complex, and as new and potentially unrefined architectures are adopted. This thesis presents the design, implementation, and evaluation of a set of tools for understanding the implementation of network control nodes and how they react to changes in traffic characteristics as networks evolve. We first introduce Firecycle, a test bed for analyzing the impact of large-scale attacks and Machine-to-Machine (M2M) traffic on the Long Term Evolution (LTE) network. We then discuss Autosonda, a tool for automatically discovering rule implementations and finding triggering traffic features in censorship devices.
This thesis provides the following contributions:
1. The design, implementation, and evaluation of two tools to discover models of network control nodes in two scenarios of evolving networks: mobile networks and the censored Internet
2. First existing test bed for analysis of large-scale attacks and impact of traffic scalability on LTE mobile networks
3. First existing test bed for LTE networks that can be scaled to arbitrary size and that deploys traffic models based on real traffic traces taken from a tier-1 operator
4. An analysis of traffic models of various categories of Internet of Things (IoT) devices
5. First study demonstrating the impact of M2M scalability and signaling overload on the packet core of LTE mobile networks
6. A specification for modeling of censorship device decision models
7. A means for automating the discovery of features utilized in censorship device decision models, the comparison of these models, and their rule discovery
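To make contribution 7 concrete, the general idea behind a tool like Autosonda can be illustrated as follows: vary one candidate traffic feature at a time, observe whether the control device's allow/block decision changes, and record the features that flip it. The sketch below is a hypothetical illustration of that loop, not the Autosonda implementation; the probe URL, the candidate features, and the "blocked" heuristic are all assumptions.

# Vary one traffic feature at a time and see which ones flip the decision.
import requests

BASELINE = {"url": "http://example.com/", "host": "example.com"}
# Hypothetical single-feature variations of the baseline request.
VARIANTS = {
    "keyword_in_path": {"url": "http://example.com/testword"},
    "keyword_in_host": {"host": "testword.example.com"},
}

def looks_blocked(resp):
    # Placeholder heuristic; real tools also detect RST injection,
    # timeouts, and injected block pages.
    return resp.status_code in (403, 451)

def probe(change):
    merged = {**BASELINE, **change}
    resp = requests.get(merged["url"], headers={"Host": merged["host"]}, timeout=5)
    return looks_blocked(resp)

baseline_blocked = probe({})
for name, change in VARIANTS.items():
    if probe(change) != baseline_blocked:
        print("feature flips the decision:", name)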