128 research outputs found
A First Look at Ad Blocking Apps on Google Play
Online advertisers and analytics services (or trackers) constantly track
users' activities as they access web services through either browsers or
mobile apps. Numerous tools, such as browser plugins and specialized mobile
apps, have been proposed to limit intrusive advertisements and prevent tracking
on desktop computers and mobile phones. For desktop computing, browser plugins
have been heavily studied for their usability and efficiency; however, tools
that block ads and prevent tracking on mobile platforms have received little
or no attention.
In this paper, we present a first look at 97 Android adblocking apps (or
adblockers), extracted from more than 1.5 million apps on Google Play, that
promise to block advertisements and analytics services. Using our data
collection and analysis pipeline for Android adblockers, we reveal the
presence of third-party tracking libraries, requests for sensitive permissions
to critical resources on users' mobile devices, and malware in the source
code. We analyze users' reviews for reports of adblockers' ineffectiveness,
that is, their failure to block ads and trackers. We found that a significant
fraction of adblockers do not fulfill their advertised functionality.
DaDiDroid: An Obfuscation Resilient Tool for Detecting Android Malware via Weighted Directed Call Graph Modelling
With the number of new mobile malware instances increasing by over 50%
annually since 2012 [24], malware embedding in mobile apps is arguably one of
the most serious security issues mobile platforms are exposed to. While
obfuscation techniques are successfully used to protect the intellectual
property of apps' developers, they are unfortunately also often used by
cybercriminals to hide malicious content inside mobile apps and to deceive
malware detection tools. As a consequence, most mobile malware detection
approaches fail to differentiate between benign and obfuscated malicious
apps. We examine the graph features of mobile apps' code by building weighted
directed graphs of the API calls, and verify that malicious apps often share
structural similarities that can be used to differentiate them from benign
apps, even under a heavily "polluted" training set where a large majority of
the apps are obfuscated. We present DaDiDroid, an Android malware app detection
tool that leverages features of the weighted directed graphs of API calls to
detect the presence of malware code in (obfuscated) Android apps. We show that
DaDiDroid significantly outperforms MaMaDroid [23], a recently proposed malware
detection tool that has been proven very efficient in detecting malware in a
clean non-obfuscated environment. We evaluate DaDiDroid's accuracy and
robustness against several evasion techniques using various datasets for a
total of 43,262 benign and 20,431 malware apps. We show that DaDiDroid
correctly labels up to 96% of Android malware samples, while achieving 91%
accuracy when trained exclusively on obfuscated apps.
Comment: 9 pages. arXiv admin note: text overlap with arXiv:1801.01633 by
other author
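The idea of modelling an app as a weighted directed graph of API calls can be sketched as follows. This is a minimal illustration, not DaDiDroid's implementation: the function names, the choice of features, and the sample call pairs are all hypothetical, and edge weights are assumed here to be simple call counts.

```python
from collections import Counter


def build_call_graph(call_pairs):
    """Return {(caller, callee): weight}; weight counts repeated calls (assumption)."""
    return Counter(call_pairs)


def graph_features(graph):
    """Compute a few simple structural features of a weighted directed graph."""
    nodes = {name for edge in graph for name in edge}
    out_weight = Counter()
    in_weight = Counter()
    for (caller, callee), weight in graph.items():
        out_weight[caller] += weight
        in_weight[callee] += weight
    n = len(nodes)
    return {
        "num_nodes": n,
        "num_edges": len(graph),
        "total_weight": sum(graph.values()),
        "max_out_weight": max(out_weight.values(), default=0),
        "max_in_weight": max(in_weight.values(), default=0),
        # Fraction of possible directed edges (no self-loops) that are present.
        "density": len(graph) / (n * (n - 1)) if n > 1 else 0.0,
    }


# Hypothetical (caller, callee) pairs, as a static analyser might extract them.
calls = [
    ("MainActivity.onCreate", "TelephonyManager.getDeviceId"),
    ("MainActivity.onCreate", "HttpURLConnection.connect"),
    ("Uploader.send", "HttpURLConnection.connect"),
    ("MainActivity.onCreate", "HttpURLConnection.connect"),
]
features = graph_features(build_call_graph(calls))
```

A feature dictionary of this kind could be fed to any standard classifier; which features and classifier the paper actually uses is not stated in the abstract.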
The Web for Under-Powered Mobile Devices: Lessons learned from Google Glass
This paper examines some of the potential challenges associated with enabling
a seamless web experience on underpowered mobile devices such as Google Glass
from the perspectives of web content providers, the device, and the network. We
conducted experiments to study the impact of webpage complexity, individual web
components, and different application-layer protocols on the performance of
the Glass browser when accessing webpages. We measured webpage load time,
temperature variation, and power consumption, and compared them with those of
a smartphone. Our findings suggest that (a) the performance of Glass relative
to a smartphone, in terms of power consumption and webpage load time,
deteriorates with increasing webpage complexity, (b) execution time for popular
JavaScript benchmarks is about 3-8 times higher on Glass than on a smartphone,
(c) WebP is a more energy-efficient image format than JPEG and PNG, and (d)
seven out of the 50 websites studied are optimized for content delivery to
Glass.
Differentially Private Release of Public Transport Data: The Opal Use Case
This document describes the application of a differentially private algorithm
to release public transport usage data from Transport for New South Wales
(TfNSW), Australia. The data consists of two separate weeks of "tap-on/tap-off"
data of individuals who used any of the four different modes of public
transport from TfNSW: buses, light rail, trains and ferries. These taps are
recorded through the smart ticketing system, known as Opal, available in the
state of New South Wales, Australia.
Graph Based Recommendations: From Data Representation to Feature Extraction and Application
Modeling users for the purpose of identifying their preferences and then
personalizing services on the basis of these models is a complex task,
primarily due to the need to take into consideration various explicit and
implicit signals, missing or uncertain information, contextual aspects, and
more. In this study, a novel generic approach for uncovering latent preference
patterns from user data is proposed and evaluated. The approach relies on
representing the data using graphs, and then systematically extracting
graph-based features and using them to enrich the original user models. The
extracted features encapsulate complex relationships between users, items, and
metadata. The enhanced user models can then serve as an input to any
recommendation algorithm. The proposed approach is domain-independent
(demonstrated on data from movies, music, and business recommender systems),
and is evaluated using several state-of-the-art machine learning methods, on
different recommendation tasks, and using different evaluation metrics. The
results show a unanimous improvement in the recommendation accuracy across
tasks and domains. In addition, the evaluation provides a deeper analysis
regarding the performance of the approach in special scenarios, including high
sparsity and variability of ratings.
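As a rough illustration of extracting graph-based features to enrich a user model, the sketch below builds a bipartite user-item graph and derives three simple per-user features. The feature set, function names, and sample ratings are assumptions made for illustration; the abstract does not specify which features the study extracts.

```python
from collections import defaultdict


def build_bipartite(ratings):
    """Build adjacency sets for a bipartite user-item graph from (user, item) pairs."""
    user_items = defaultdict(set)
    item_users = defaultdict(set)
    for user, item in ratings:
        user_items[user].add(item)
        item_users[item].add(user)
    return user_items, item_users


def user_graph_features(user, user_items, item_users):
    """Return illustrative graph features for one user (hypothetical feature set)."""
    items = user_items.get(user, set())
    # Users reachable in two hops: everyone who shares at least one item.
    neighbours = set()
    for item in items:
        neighbours |= item_users[item]
    neighbours.discard(user)
    avg_popularity = (
        sum(len(item_users[i]) for i in items) / len(items) if items else 0.0
    )
    return {
        "degree": len(items),
        "two_hop_users": len(neighbours),
        "avg_item_popularity": avg_popularity,
    }


# Hypothetical interaction data.
ratings = [
    ("alice", "m1"), ("alice", "m2"),
    ("bob", "m2"),
    ("carol", "m2"), ("carol", "m3"),
]
user_items, item_users = build_bipartite(ratings)
alice = user_graph_features("alice", user_items, item_users)
```

Features like these can be appended to an existing user feature vector before it is passed to a recommendation algorithm.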
On the Privacy of the Opal Data Release: A Response
This document is a response to a report from the University of Melbourne on
the privacy of the Opal dataset release. The Opal dataset was released by
Data61 (CSIRO) in conjunction with Transport for New South Wales (TfNSW).
The data consists of two separate weeks of "tap-on/tap-off" data of individuals
who used any of the four different modes of public transport from TfNSW: buses,
light rail, trains and ferries. These taps are recorded through the smart
ticketing system, known as Opal, available in the state of New South Wales,
Australia.
More Flexible Differential Privacy: The Application of Piecewise Mixture Distributions in Query Release
There is an increasing demand to make data "open" to third parties, as data
sharing has great benefits in data-driven decision making. However, with a wide
variety of sensitive data collected, protecting the privacy of individuals,
communities and organizations is an essential factor in making data "open".
The approaches currently adopted by industry in releasing private data are
often ad hoc and prone to a number of attacks, including re-identification
attacks, as they do not provide adequate privacy guarantees. While differential
privacy has attracted significant interest from academia and industry by
providing rigorous and reliable privacy guarantees, the reduced utility and
inflexibility of current differentially private algorithms for data release are
barriers to their use in real life. This paper aims to address these two
challenges. First, we propose a novel mechanism to augment the conventional
utility of differential privacy by fusing two Laplace or geometric
distributions together. We derive closed form expressions for entropy, variance
of added noise, and absolute expectation of noise for the proposed piecewise
mixtures. Then the relevant distributions are utilised to theoretically prove
the privacy and accuracy guarantees of the proposed mechanisms. Second, we show
that our proposed mechanisms have greater flexibility, with three parameters to
adjust, giving better utility in bounding noise, and mitigating larger
inaccuracy, in comparison to typical one-parameter differentially private
mechanisms. We then empirically evaluate the performance of piecewise mixture
distributions with extensive simulations and with a real-world dataset for both
linear count queries and histogram queries. The empirical results show an
increase in all utility measures considered, while maintaining privacy, for the
piecewise mixture mechanisms compared to standard Laplace or geometric
mechanisms.
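For context, the standard one-parameter Laplace mechanism that the abstract uses as a baseline can be sketched as below for a count query. This is the textbook mechanism, not the proposed piecewise mixture; the function names and the example values (a true count of 100 and epsilon of 0.5) are illustrative assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with epsilon-differential privacy via the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # A count query changes by at most 1 when one record changes, so sensitivity is 1.
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
samples = [private_count(100, 0.5, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
# The noise variance of Laplace(0, b) is 2 * b**2; here b = 1 / 0.5 = 2.
expected_variance = 2 * (1.0 / 0.5) ** 2
```

The single scale parameter fixes the privacy level and the noise variance together, which is the inflexibility the abstract refers to.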
Are 140 Characters Enough? A Large-Scale Linkability Study of Tweets
Microblogging is a very popular Internet activity that informs and entertains
great multitudes of people world-wide via quickly and scalably disseminated
terse messages containing all kinds of newsworthy utterances. Even though
microblogging is neither designed nor meant to emphasize privacy, numerous
contributors hide behind pseudonyms and compartmentalize their different
incarnations via multiple accounts within the same, or across multiple,
site(s). Prior work has shown that stylometric analysis is a very powerful tool
capable of linking product or service reviews and blogs that are produced by
the same author when the number of authors is large. In this paper, we explore
linkability of tweets. Our results, based on a very large corpus of tweets,
clearly demonstrate that, at least for relatively active tweeters, linkability
of tweets by the same author is easily attained even when the number of
tweeters is large. We also show that our linkability results hold for a set of
actual Twitter users who tweet from multiple accounts. This has some obvious
privacy implications, both positive and negative.
Modelling and Quantifying Membership Information Leakage in Machine Learning
Machine learning models have been shown to be vulnerable to membership
inference attacks, i.e., inferring whether individuals' data have been used for
training models. The lack of understanding about factors contributing to the success
of these attacks motivates the need for modelling membership information
leakage using information theory and for investigating properties of machine
learning models and training algorithms that can reduce membership information
leakage. We use conditional mutual information leakage to measure the amount of
information leakage from the trained machine learning model about the presence
of an individual in the training dataset. We devise an upper bound for this
measure of information leakage using Kullback--Leibler divergence that is more
amenable to numerical computation. We prove a direct relationship between the
Kullback--Leibler membership information leakage and the probability of success
for a hypothesis-testing adversary examining whether a particular data record
belongs to the training dataset of a machine learning model. We show that the
mutual information leakage is a decreasing function of the training dataset
size and the regularization weight. We also prove that, if the sensitivity of
the machine learning model (defined in terms of the derivatives of the fitness
with respect to model parameters) is high, more membership information is
potentially leaked. This illustrates that complex models, such as deep neural
networks, are more susceptible to membership inference attacks in comparison to
simpler models with fewer degrees of freedom. We show that the amount of the
membership information leakage is reduced when using Gaussian
(ε, δ)-differentially-private additive noise.
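To make the Gaussian additive noise concrete, the sketch below computes the noise standard deviation from the classical calibration of the Gaussian mechanism, sigma = sensitivity * sqrt(2 ln(1.25/δ)) / ε, which holds for 0 < ε < 1. It assumes the abstract refers to the usual (ε, δ) Gaussian mechanism; it does not reproduce the paper's leakage bound, and the function name and example parameters are hypothetical.

```python
import math


def gaussian_sigma(epsilon, delta, sensitivity=1.0):
    """Noise standard deviation for the classical (epsilon, delta) Gaussian mechanism."""
    if not 0 < epsilon < 1:
        raise ValueError("the classical calibration assumes 0 < epsilon < 1")
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


# Example parameters chosen only for illustration.
sigma = gaussian_sigma(epsilon=0.5, delta=1e-5)
sigma_stricter = gaussian_sigma(epsilon=0.25, delta=1e-5)
```

Halving ε doubles the required noise, while tightening δ increases it only through the square root of a logarithm.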
Optimized Deployment of Autonomous Drones to Improve User Experience in Cellular Networks
Modern wireless traffic demand pushes Internet Service Providers to develop
effective strategies to improve user experience. Since deploying dense Base
Stations (BSs) is not cost efficient, an alternative is to deploy autonomous
drones to supplement existing BSs. A street graph is adopted to represent the
area of interest. The outdoor User Equipments (UEs) to be served are located near
streets, and the 2D projections of drones are restricted to streets to avoid
collisions with buildings. We build a UE density function based on a real
dataset, reflecting the traffic in the area. We study four problems: where to
deploy a single drone to cover the maximum number of UEs; where to deploy drones
to cover the maximum number of UEs subject to an inner drone distance constraint;
where to deploy drones to cover the maximum number of UEs subject to the inner
drone distance constraint and the drones' battery constraints; and the minimum
number of drones needed to cover a given percentage of UEs subject to the inner
drone distance constraint. We prove that the latter three
problems are NP-hard and propose greedy algorithms with theoretical analysis.
To the best of our knowledge, this is the first paper to consider battery
constraints for drone deployments. Extensive simulations have been conducted to
verify the effectiveness of our approaches.
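A generic greedy heuristic for coverage-maximising placement can be sketched as follows. It is not the paper's algorithm: the abstract does not define the distance constraint, so this sketch assumes a minimum separation between drones, ignores battery constraints, uses plain 2D points instead of a street graph, and all names and sample coordinates are hypothetical.

```python
import math


def covered(site, users, radius):
    """Indices of users within the coverage radius of a candidate site."""
    return {i for i, u in enumerate(users) if math.dist(site, u) <= radius}


def greedy_deploy(sites, users, k, radius, min_sep):
    """Pick up to k sites greedily, each adding the most newly covered users."""
    chosen, served = [], set()
    for _ in range(k):
        best, best_gain = None, 0
        for site in sites:
            # Skip used sites and sites too close to a chosen drone
            # (assumed form of the distance constraint).
            if site in chosen or any(math.dist(site, c) < min_sep for c in chosen):
                continue
            gain = len(covered(site, users, radius) - served)
            if gain > best_gain:
                best, best_gain = site, gain
        if best is None:
            break  # no feasible site adds coverage
        chosen.append(best)
        served |= covered(best, users, radius)
    return chosen, served


# Hypothetical candidate sites along a street and nearby user positions.
sites = [(0, 0), (1, 0), (5, 0), (10, 0)]
users = [(0, 0.5), (0.5, 0.5), (1, 0.5), (5, 0.5), (10, 0.5), (10, -0.5)]
chosen, served = greedy_deploy(sites, users, k=2, radius=1.0, min_sep=2.0)
```

With these inputs the second pick skips the site at (1, 0) because it violates the assumed separation from the first drone.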