38 research outputs found
Web Tracking: Mechanisms, Implications, and Defenses
This articles surveys the existing literature on the methods currently used
by web services to track the user online as well as their purposes,
implications, and possible user's defenses. A significant majority of reviewed
articles and web resources are from years 2012-2014. Privacy seems to be the
Achilles' heel of today's web. Web services make continuous efforts to obtain
as much information as they can about the things we search, the sites we visit,
the people with who we contact, and the products we buy. Tracking is usually
performed for commercial purposes. We present 5 main groups of methods used for
user tracking, which are based on sessions, client storage, client cache,
fingerprinting, or yet other approaches. A special focus is placed on
mechanisms that use web caches, operational caches, and fingerprinting, as they
are usually very rich in terms of using various creative methodologies. We also
show how the users can be identified on the web and associated with their real
names, e-mail addresses, phone numbers, or even street addresses. We show why
tracking is being used and its possible implications for the users (price
discrimination, assessing financial credibility, determining insurance
coverage, government surveillance, and identity theft). For each of the
tracking methods, we present possible defenses. Apart from describing the
methods and tools used for keeping the personal data away from being tracked,
we also present several tools that were used for research purposes - their main
goal is to discover how and by which entity the users are being tracked on
their desktop computers or smartphones, provide this information to the users,
and visualize it in an accessible and easy to follow way. Finally, we present
the currently proposed future approaches to track the user and show that they
can potentially pose significant threats to the users' privacy.Comment: 29 pages, 212 reference
Independent comparison of popular DPI tools for traffic classification
Deep Packet Inspection (DPI) is the state-of-the-art technology for traffic classification. According to the conventional wisdom, DPI is the most accurate classification technique. Consequently, most popular products, either commercial or open-source, rely on some sort of DPI for traffic classification. However, the actual performance of DPI is still unclear to the research community, since the lack of public datasets prevent the comparison and reproducibility of their results. This paper presents a comprehensive comparison of 6 well-known DPI tools, which are commonly used in the traffic classification literature. Our study includes 2 commercial products (PACE and NBAR) and 4 open-source tools (OpenDPI, L7-filter, nDPI, and Libprotoident). We studied their performance in various scenarios (including packet and flow truncation) and at different classification levels (application protocol, application and web service). We carefully built a labeled dataset with more than 750 K flows, which contains traffic from popular applications. We used the Volunteer-Based System (VBS), developed at Aalborg University, to guarantee the correct labeling of the dataset. We released this dataset, including full packet payloads, to the research community. We believe this dataset could become a common benchmark for the comparison and validation of network traffic classifiers. Our results present PACE, a commercial tool, as the most accurate solution. Surprisingly, we find that some open-source tools, such as nDPI and Libprotoident, also achieve very high accuracy.Peer ReviewedPostprint (author’s final draft