128 research outputs found

A First Look at Ad Blocking Apps on Google Play

Author: Ikram Muhammad
Kaafar Mohamed Ali
Publication venue
Publication date: 12/09/2017
Field of study

Online advertisers and analytics services (or trackers) constantly track users' activities as they access web services through browsers or mobile apps. Numerous tools, such as browser plugins and specialized mobile apps, have been proposed to limit intrusive advertisements and prevent tracking on desktop computers and mobile phones. For desktop computing, browser plugins have been heavily studied for their usability and efficiency; however, tools that block ads and prevent tracking on mobile platforms have received little or no attention. In this paper, we present a first look at 97 Android adblocking apps (or adblockers), extracted from more than 1.5 million apps on Google Play, that promise to block advertisements and analytics services. With our data collection and analysis pipeline for Android adblockers, we reveal the presence of third-party tracking libraries and of sensitive permissions for critical resources on users' mobile devices, as well as malware in the apps' source code. We analyze users' reviews for evidence of adblockers' ineffectiveness in blocking ads and trackers. We found that a significant fraction of adblockers do not fulfill their advertised functionality.

arXiv.org e-Print Archive

DaDiDroid: An Obfuscation Resilient Tool for Detecting Android Malware via Weighted Directed Call Graph Modelling

Author: Beaume Pierrick
Ikram Muhammad
Kaafar Mohamed Ali
Publication venue
Publication date: 21/08/2019
Field of study

With the number of new mobile malware instances increasing by over 50% annually since 2012 [24], malware embedded in mobile apps is arguably one of the most serious security issues that mobile platforms are exposed to. While obfuscation techniques are successfully used to protect the intellectual property of app developers, they are unfortunately also often used by cybercriminals to hide malicious content inside mobile apps and to deceive malware detection tools. As a consequence, most mobile malware detection approaches fail to differentiate between benign and obfuscated malicious apps. We examine the graph features of mobile app code by building weighted directed graphs of the API calls, and verify that malicious apps often share structural similarities that can be used to differentiate them from benign apps, even under a heavily "polluted" training set where a large majority of the apps are obfuscated. We present DaDiDroid, an Android malware detection tool that leverages features of the weighted directed graphs of API calls to detect the presence of malware code in (obfuscated) Android apps. We show that DaDiDroid significantly outperforms MaMaDroid [23], a recently proposed malware detection tool that has been proven very efficient in detecting malware in a clean, non-obfuscated environment. We evaluate DaDiDroid's accuracy and robustness against several evasion techniques using various datasets, for a total of 43,262 benign and 20,431 malware apps. We show that DaDiDroid correctly labels up to 96% of Android malware samples, while achieving 91% accuracy when trained exclusively on obfuscated apps.

Comment: 9 pages. arXiv admin note: text overlap with arXiv:1801.01633 by other authors
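
The idea of a weighted directed graph of API calls can be illustrated with a minimal sketch. It is not DaDiDroid's actual pipeline: the package names, the choice of features, and the helper names `build_call_graph` and `graph_features` are all assumptions made for illustration. Each edge (caller, callee) carries a weight equal to the number of observed calls, and a few simple structural features are derived from the result.

```python
from collections import Counter


def build_call_graph(call_pairs):
    """Map each directed edge (caller, callee) to the number of observed calls."""
    return Counter(call_pairs)


def graph_features(edges):
    """Derive a few simple structural features from a weighted directed graph.

    The feature set is illustrative only, not the one used by the paper.
    """
    out_strength = Counter()  # total weight of outgoing edges per node
    in_strength = Counter()   # total weight of incoming edges per node
    nodes = set()
    for (caller, callee), weight in edges.items():
        out_strength[caller] += weight
        in_strength[callee] += weight
        nodes.update((caller, callee))
    return {
        "nodes": len(nodes),
        "edges": len(edges),
        "total_weight": sum(edges.values()),
        "max_out_strength": max(out_strength.values(), default=0),
        "max_in_strength": max(in_strength.values(), default=0),
    }


# Hypothetical call trace, abstracted to package level.
calls = [
    ("android.app", "java.io"),
    ("android.app", "java.io"),
    ("java.io", "java.net"),
    ("android.app", "java.net"),
]
features = graph_features(build_call_graph(calls))
```

A feature vector of this kind could then be fed to any standard classifier; which features and classifier the authors actually use is not stated in the abstract.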

arXiv.org e-Print Archive

The Web for Under-Powered Mobile Devices: Lessons learned from Google Glass

Author: Chauhan Jagmohan
Kaafar Mohamed Ali
Mahanti Anirban
Publication venue
Publication date: 27/11/2015
Field of study

This paper examines some of the potential challenges associated with enabling a seamless web experience on underpowered mobile devices such as Google Glass, from the perspective of web content providers, the device, and the network. We conducted experiments to study the impact of webpage complexity, individual web components, and different application-layer protocols on the performance of the Glass browser when accessing webpages, measuring webpage load time, temperature variation, and power consumption and comparing them with those of a smartphone. Our findings suggest that (a) the performance of Glass relative to a smartphone, in terms of power consumption and webpage load time, deteriorates with increasing webpage complexity; (b) execution time for popular JavaScript benchmarks is about 3-8 times higher on Glass than on a smartphone; (c) WebP is a more energy-efficient image format than JPEG and PNG; and (d) seven out of 50 websites studied are optimized for content delivery to Glass.

arXiv.org e-Print Archive

Differentially Private Release of Public Transport Data: The Opal Use Case

Author: Asghar Hassan Jameel
Kaafar Mohamed Ali
Tyler Paul
Publication venue
Publication date: 16/05/2017
Field of study

This document describes the application of a differentially private algorithm to release public transport usage data from Transport for New South Wales (TfNSW), Australia. The data consists of two separate weeks of "tap-on/tap-off" data of individuals who used any of the four modes of public transport from TfNSW: buses, light rail, trains, and ferries. These taps are recorded through the smart ticketing system, known as Opal, available in the state of New South Wales, Australia.

arXiv.org e-Print Archive

Graph Based Recommendations: From Data Representation to Feature Extraction and Application

Author: Berkovsky Shlomo
Kaafar Mohamed Ali
Kuflik Tsvi
Tiroshi Amit
Publication venue
Publication date: 05/07/2017
Field of study

Modeling users for the purpose of identifying their preferences and then personalizing services on the basis of these models is a complex task, primarily due to the need to take into consideration various explicit and implicit signals, missing or uncertain information, contextual aspects, and more. In this study, a novel generic approach for uncovering latent preference patterns from user data is proposed and evaluated. The approach relies on representing the data using graphs, and then systematically extracting graph-based features and using them to enrich the original user models. The extracted features encapsulate complex relationships between users, items, and metadata. The enhanced user models can then serve as an input to any recommendation algorithm. The proposed approach is domain-independent (demonstrated on data from movies, music, and business recommender systems), and is evaluated using several state-of-the-art machine learning methods, on different recommendation tasks, and using different evaluation metrics. The results show a consistent improvement in recommendation accuracy across tasks and domains. In addition, the evaluation provides a deeper analysis of the performance of the approach in special scenarios, including high sparsity and variability of ratings.

arXiv.org e-Print Archive

On the Privacy of the Opal Data Release: A Response

Author: Asghar Hassan Jameel
Kaafar Mohamed Ali
Tyler Paul
Publication venue
Publication date: 24/05/2017
Field of study

This document is a response to a report from the University of Melbourne on the privacy of the Opal dataset release. The Opal dataset was released by Data61 (CSIRO) in conjunction with Transport for New South Wales (TfNSW). The data consists of two separate weeks of "tap-on/tap-off" data of individuals who used any of the four modes of public transport from TfNSW: buses, light rail, trains, and ferries. These taps are recorded through the smart ticketing system, known as Opal, available in the state of New South Wales, Australia.

arXiv.org e-Print Archive

More Flexible Differential Privacy: The Application of Piecewise Mixture Distributions in Query Release

Author: Kaafar Mohamed Ali
Smith David B.
Thilakarathna Kanchana
Publication venue
Publication date: 18/07/2017
Field of study

There is an increasing demand to make data "open" to third parties, as data sharing has great benefits in data-driven decision making. However, with a wide variety of sensitive data collected, protecting the privacy of individuals, communities, and organizations is an essential factor in making data "open". The approaches currently adopted by industry in releasing private data are often ad hoc and prone to a number of attacks, including re-identification attacks, as they do not provide adequate privacy guarantees. While differential privacy has attracted significant interest from academia and industry by providing rigorous and reliable privacy guarantees, the reduced utility and inflexibility of current differentially private algorithms for data release are barriers to their real-life use. This paper aims to address these two challenges. First, we propose a novel mechanism to augment the conventional utility of differential privacy by fusing two Laplace or geometric distributions together. We derive closed-form expressions for entropy, variance of added noise, and absolute expectation of noise for the proposed piecewise mixtures. Then the relevant distributions are utilised to theoretically prove the privacy and accuracy guarantees of the proposed mechanisms. Second, we show that our proposed mechanisms have greater flexibility, with three parameters to adjust, giving better utility in bounding noise and mitigating larger inaccuracy, in comparison to typical one-parameter differentially private mechanisms. We then empirically evaluate the performance of piecewise mixture distributions with extensive simulations and with a real-world dataset for both linear count queries and histogram queries. The empirical results show an increase in all utility measures considered, while maintaining privacy, for the piecewise mixture mechanisms compared to standard Laplace or geometric mechanisms.
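
For context, the standard one-parameter Laplace mechanism that the abstract uses as a baseline can be sketched as follows. This is the textbook mechanism, not the piecewise mixture mechanism proposed in the paper; the function name `laplace_count` and its parameters are assumptions made for illustration. A count query has sensitivity 1, so noise drawn from a Laplace distribution with scale 1/epsilon gives epsilon-differential privacy.

```python
import random


def laplace_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with Laplace noise of scale sensitivity / epsilon.

    Textbook Laplace mechanism (baseline only, not the paper's mixture).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
noisy = laplace_count(true_count=100, epsilon=0.5, rng=rng)
```

The single parameter epsilon fixes both the privacy level and the noise variance (2 * scale**2), which is the inflexibility the abstract refers to.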

arXiv.org e-Print Archive

Are 140 Characters Enough? A Large-Scale Linkability Study of Tweets

Author: Almishari Mishari
Kaafar Mohamed Ali
Oguz Ekin
Tsudik Gene
Publication venue
Publication date: 08/09/2014
Field of study

Microblogging is a very popular Internet activity that informs and entertains great multitudes of people worldwide via quickly and scalably disseminated terse messages containing all kinds of newsworthy utterances. Even though microblogging is neither designed nor meant to emphasize privacy, numerous contributors hide behind pseudonyms and compartmentalize their different incarnations via multiple accounts within the same, or across multiple, site(s). Prior work has shown that stylometric analysis is a very powerful tool capable of linking product or service reviews and blogs that are produced by the same author when the number of authors is large. In this paper, we explore linkability of tweets. Our results, based on a very large corpus of tweets, clearly demonstrate that, at least for relatively active tweeters, linkability of tweets by the same author is easily attained even when the number of tweeters is large. We also show that our linkability results hold for a set of actual Twitter users who tweet from multiple accounts. This has some obvious privacy implications, both positive and negative.

arXiv.org e-Print Archive

Modelling and Quantifying Membership Information Leakage in Machine Learning

Author: Farokhi Farhad
Kaafar Mohamed Ali
Publication venue
Publication date: 27/04/2020
Field of study

Machine learning models have been shown to be vulnerable to membership inference attacks, i.e., inferring whether individuals' data have been used for training models. The lack of understanding about factors contributing to the success of these attacks motivates the need for modelling membership information leakage using information theory and for investigating properties of machine learning models and training algorithms that can reduce membership information leakage. We use conditional mutual information leakage to measure the amount of information leakage from the trained machine learning model about the presence of an individual in the training dataset. We devise an upper bound for this measure of information leakage using Kullback-Leibler divergence that is more amenable to numerical computation. We prove a direct relationship between the Kullback-Leibler membership information leakage and the probability of success for a hypothesis-testing adversary examining whether a particular data record belongs to the training dataset of a machine learning model. We show that the mutual information leakage is a decreasing function of the training dataset size and the regularization weight. We also prove that, if the sensitivity of the machine learning model (defined in terms of the derivatives of the fitness with respect to model parameters) is high, more membership information is potentially leaked. This illustrates that complex models, such as deep neural networks, are more susceptible to membership inference attacks in comparison to simpler models with fewer degrees of freedom. We show that the amount of the membership information leakage is reduced by $\mathcal{O}(\log^{1/2}(\delta^{-1})\epsilon^{-1})$ when using Gaussian $(\epsilon,\delta)$-differentially-private additive noises.
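
The order log^{1/2}(1/delta) / epsilon in the abstract matches how the noise scale of the classical Gaussian mechanism grows. A minimal sketch of that textbook calibration follows; it is not the paper's derivation, the formula sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon is the standard one valid for 0 < epsilon < 1, and the function names are assumptions made for illustration.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise standard deviation of the classical Gaussian mechanism.

    Standard calibration, valid for 0 < epsilon < 1 and 0 < delta < 1.
    """
    if not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("requires 0 < epsilon < 1 and 0 < delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize(value, sensitivity, epsilon, delta, rng):
    """Add zero-mean Gaussian noise calibrated to (epsilon, delta)."""
    return value + rng.gauss(0.0, gaussian_sigma(sensitivity, epsilon, delta))


sigma = gaussian_sigma(sensitivity=1.0, epsilon=0.5, delta=1e-5)
noisy_value = privatize(0.0, 1.0, 0.5, 1e-5, random.Random(0))
```

Halving epsilon doubles sigma, while shrinking delta increases it only through the square root of a logarithm.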

arXiv.org e-Print Archive

Optimized Deployment of Autonomous Drones to Improve User Experience in Cellular Networks

Author: Ding Ming
Huang Hailong
Kaafar Mohamed Ali
Savkin Andrey V.
Publication venue
Publication date: 06/12/2017
Field of study

Modern wireless traffic demand pushes Internet Service Providers to develop effective strategies to improve user experience. Since deploying dense Base Stations (BSs) is not cost efficient, an alternative is to deploy autonomous drones to supplement existing BSs. A street graph is adopted to represent the area of interest. The outdoor User Equipments (UEs) to be served are located near streets, and the 2D projections of drones are restricted to streets to avoid collision with buildings. We build a UE density function based on a real dataset, reflecting the traffic in the area. We study four problems: where to deploy a single drone to cover the maximum number of UEs; where to deploy $k$ drones to cover the maximum number of UEs subject to an inner drone distance constraint; where to deploy $k$ drones to cover the maximum number of UEs subject to an inner drone distance constraint and drones' battery constraints; and the minimum number of drones needed to cover a given percentage of UEs subject to an inner drone distance constraint. We prove that the latter three problems are NP-hard and propose greedy algorithms with theoretical analysis. To the best of our knowledge, this is the first paper to consider battery constraints for drone deployments. Extensive simulations have been conducted to verify the effectiveness of our approaches.
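
A generic greedy maximum-coverage heuristic gives a feel for the k-drone problem with a distance constraint. This sketch is not the authors' algorithm: it assumes a finite list of candidate positions on streets, a fixed circular coverage radius, and a minimum separation between drones, and the name `greedy_deploy` and all parameters are hypothetical. Battery constraints are ignored.

```python
import math


def greedy_deploy(candidates, users, k, radius, min_separation):
    """Pick up to k candidate positions, each time taking the one that
    covers the most still-uncovered users while staying at least
    min_separation away from every drone already placed."""
    chosen = []
    covered = set()  # indices of users covered so far
    for _ in range(k):
        best, best_gain = None, set()
        for c in candidates:
            # Enforce the distance constraint between drones.
            if any(math.dist(c, d) < min_separation for d in chosen):
                continue
            gain = {
                i for i, u in enumerate(users)
                if i not in covered and math.dist(c, u) <= radius
            }
            if len(gain) > len(best_gain):
                best, best_gain = c, gain
        if best is None:  # no feasible position adds coverage
            break
        chosen.append(best)
        covered |= best_gain
    return chosen, covered


# Hypothetical street candidates and user positions (2D coordinates).
candidates = [(0, 0), (1, 0), (5, 0)]
users = [(0, 0.5), (0, -0.5), (0.5, 0), (1.5, 0), (5, 0.2), (5.5, 0)]
drones, covered = greedy_deploy(candidates, users, k=2, radius=1.0, min_separation=2.0)
```

In this toy instance the position (1, 0) is excluded after the first placement because it lies within the separation distance of (0, 0), so the user at (1.5, 0) stays uncovered.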

arXiv.org e-Print Archive