58,427 research outputs found
Automated Website Fingerprinting through Deep Learning
Several studies have shown that the network traffic that is generated by a
visit to a website over Tor reveals information specific to the website through
the timing and sizes of network packets. By capturing traffic traces between
users and their Tor entry guard, a network eavesdropper can leverage this
meta-data to reveal which website Tor users are visiting. The success of such
attacks heavily depends on the particular set of traffic features that are used
to construct the fingerprint. Typically, these features are manually engineered
and, as such, any change introduced to the Tor network can render these
carefully constructed features ineffective. In this paper, we show that an
adversary can automate the feature engineering process, and thus automatically
deanonymize Tor traffic by applying our novel method based on deep learning. We
collect a dataset comprised of more than three million network traces, which is
the largest dataset of web traffic ever used for website fingerprinting, and
find that the performance achieved by our deep learning approaches is
comparable to known methods which include various research efforts spanning
over multiple years. The obtained success rate exceeds 96% for a closed world
of 100 websites and 94% for our biggest closed world of 900 classes. In our
open world evaluation, the most performant deep learning model is 2% more
accurate than the state-of-the-art attack. Furthermore, we show that the
implicit features automatically learned by our approach are far more resilient
to dynamic changes of web content over time. We conclude that the ability to
automatically construct the most relevant traffic features and perform accurate
traffic recognition makes our deep learning based approach an efficient,
flexible and robust technique for website fingerprinting.Comment: To appear in the 25th Symposium on Network and Distributed System
Security (NDSS 2018
FraudDroid: Automated Ad Fraud Detection for Android Apps
Although mobile ad frauds have been widespread, state-of-the-art approaches
in the literature have mainly focused on detecting the so-called static
placement frauds, where only a single UI state is involved and can be
identified based on static information such as the size or location of ad
views. Other types of fraud exist that involve multiple UI states and are
performed dynamically while users interact with the app. Such dynamic
interaction frauds, although now widely spread in apps, have not yet been
explored nor addressed in the literature. In this work, we investigate a wide
range of mobile ad frauds to provide a comprehensive taxonomy to the research
community. We then propose, FraudDroid, a novel hybrid approach to detect ad
frauds in mobile Android apps. FraudDroid analyses apps dynamically to build UI
state transition graphs and collects their associated runtime network traffics,
which are then leveraged to check against a set of heuristic-based rules for
identifying ad fraudulent behaviours. We show empirically that FraudDroid
detects ad frauds with a high precision (93%) and recall (92%). Experimental
results further show that FraudDroid is capable of detecting ad frauds across
the spectrum of fraud types. By analysing 12,000 ad-supported Android apps,
FraudDroid identified 335 cases of fraud associated with 20 ad networks that
are further confirmed to be true positive results and are shared with our
fellow researchers to promote advanced ad fraud detectionComment: 12 pages, 10 figure
Automated Discovery of Internet Censorship by Web Crawling
Censorship of the Internet is widespread around the world. As access to the
web becomes increasingly ubiquitous, filtering of this resource becomes more
pervasive. Transparency about specific content that citizens are denied access
to is atypical. To counter this, numerous techniques for maintaining URL filter
lists have been proposed by various individuals and organisations that aim to
empirical data on censorship for benefit of the public and wider censorship
research community.
We present a new approach for discovering filtered domains in different
countries. This method is fully automated and requires no human interaction.
The system uses web crawling techniques to traverse between filtered sites and
implements a robust method for determining if a domain is filtered. We
demonstrate the effectiveness of the approach by running experiments to search
for filtered content in four different censorship regimes. Our results show
that we perform better than the current state of the art and have built domain
filter lists an order of magnitude larger than the most widely available public
lists as of Jan 2018. Further, we build a dataset mapping the interlinking
nature of blocked content between domains and exhibit the tightly networked
nature of censored web resources
Exploring the information behaviour of users of Welsh Newspapers Online through web log analysis
Purpose – Webometric techniques have been applied to many websites and online resources,
especially since the launch of Google Analytics (GA). To date, though, there has been little
consideration of information behaviour in relation to digitised newspaper collections. The purpose of
this paper is to address a perceived gap in the literature by providing an account of user behaviour in
the newly launched Welsh Newspapers Online (WNO).
Design/methodology/approach – The author collected webometric data for WNO using GA and
web server content logs. These were analysed to identify patterns of engagement and user behaviour,
which were then considered in relation to existing information behaviour.
Findings – Use of WNO, while reminiscent of archival information seeking, can be understood as
centring on the web interface rather than the digitised material. In comparison to general web browsing,
users are much more deeply engaged with the resource. This engagement incorporates reading online,
but users’ information seeking utilises website search and browsing functionality rather than filtering in
newspaper material. Information seeking in digitised newspapers resembles the model of the “user” more
closely than that of the “reader”, a value-laden distinction which needs further unpacking.
Research limitations/implications – While the behaviour discussed in this paper is likely to be
more widely representative, a larger longitudinal data set would increase the study’s significance.
Additionally, the methodology of this paper can only tell us what users are doing, and further research
is needed to identify the drivers for this behaviour.
Originality/value – This study provides important insights into the underinvestigated area of
digitised newspaper collections, and shows the importance of webometric methods in analysing online
user behaviour
Third Party Tracking in the Mobile Ecosystem
Third party tracking allows companies to identify users and track their
behaviour across multiple digital services. This paper presents an empirical
study of the prevalence of third-party trackers on 959,000 apps from the US and
UK Google Play stores. We find that most apps contain third party tracking, and
the distribution of trackers is long-tailed with several highly dominant
trackers accounting for a large portion of the coverage. The extent of tracking
also differs between categories of apps; in particular, news apps and apps
targeted at children appear to be amongst the worst in terms of the number of
third party trackers associated with them. Third party tracking is also
revealed to be a highly trans-national phenomenon, with many trackers operating
in jurisdictions outside the EU. Based on these findings, we draw out some
significant legal compliance challenges facing the tracking industry.Comment: Corrected missing company info (Linkedin owned by Microsoft). Figures
for Microsoft and Linkedin re-calculated and added to Table
FilteredWeb: A Framework for the Automated Search-Based Discovery of Blocked URLs
Various methods have been proposed for creating and maintaining lists of
potentially filtered URLs to allow for measurement of ongoing internet
censorship around the world. Whilst testing a known resource for evidence of
filtering can be relatively simple, given appropriate vantage points,
discovering previously unknown filtered web resources remains an open
challenge.
We present a new framework for automating the process of discovering filtered
resources through the use of adaptive queries to well-known search engines. Our
system applies information retrieval algorithms to isolate characteristic
linguistic patterns in known filtered web pages; these are then used as the
basis for web search queries. The results of these queries are then checked for
evidence of filtering, and newly discovered filtered resources are fed back
into the system to detect further filtered content.
Our implementation of this framework, applied to China as a case study, shows
that this approach is demonstrably effective at detecting significant numbers
of previously unknown filtered web pages, making a significant contribution to
the ongoing detection of internet filtering as it develops.
Our tool is currently deployed and has been used to discover 1355 domains
that are poisoned within China as of Feb 2017 - 30 times more than are
contained in the most widely-used public filter list. Of these, 759 are outside
of the Alexa Top 1000 domains list, demonstrating the capability of this
framework to find more obscure filtered content. Further, our initial analysis
of filtered URLs, and the search terms that were used to discover them, gives
further insight into the nature of the content currently being blocked in
China.Comment: To appear in "Network Traffic Measurement and Analysis Conference
2017" (TMA2017
ReCon: Revealing and Controlling PII Leaks in Mobile Network Traffic
It is well known that apps running on mobile devices extensively track and
leak users' personally identifiable information (PII); however, these users
have little visibility into PII leaked through the network traffic generated by
their devices, and have poor control over how, when and where that traffic is
sent and handled by third parties. In this paper, we present the design,
implementation, and evaluation of ReCon: a cross-platform system that reveals
PII leaks and gives users control over them without requiring any special
privileges or custom OSes. ReCon leverages machine learning to reveal potential
PII leaks by inspecting network traffic, and provides a visualization tool to
empower users with the ability to control these leaks via blocking or
substitution of PII. We evaluate ReCon's effectiveness with measurements from
controlled experiments using leaks from the 100 most popular iOS, Android, and
Windows Phone apps, and via an IRB-approved user study with 92 participants. We
show that ReCon is accurate, efficient, and identifies a wider range of PII
than previous approaches.Comment: Please use MobiSys version when referencing this work:
http://dl.acm.org/citation.cfm?id=2906392. 18 pages, recon.meddle.mob
- …