In-Vivo Bytecode Instrumentation for Improving Privacy on Android Smartphones in Uncertain Environments
In this paper we claim that an efficient and readily applicable means to
improve privacy of Android applications is: 1) to perform runtime monitoring by
instrumenting the application bytecode and 2) in-vivo, i.e. directly on the
smartphone. We present a tool chain to do this and present experimental results
showing that this tool chain can run on smartphones in a reasonable amount of
time and with a realistic effort. Our findings also identify challenges to be
addressed before running powerful runtime monitoring and instrumentations
directly on smartphones. We implemented two use-cases leveraging the tool
chain: BetterPermissions, a fine-grained user-centric permission policy system,
and AdRemover, an advertisement remover. Both prototypes improve the privacy of
Android systems thanks to in-vivo bytecode instrumentation.
Comment: ISBN: 978-2-87971-111-
Challenges and Outlook in Machine Learning-based Malware Detection for Android
Just like in traditional desktop computing, one of the major security issues in mobile computing
lies in malicious software. Several recent studies have shown that Android, as today's most
widespread Operating System, is the target of most of the new families of malware.
Manually analysing an Android application to determine whether it is malicious or not is a
time-consuming process. Furthermore, because of the complexity of analysing an application, this
task can only be conducted by highly skilled (hence hard to come by) professionals.
Researchers naturally sought to transfer this process from humans to computers to lower the
cost of detecting malware. Machine-Learning techniques, looking at patterns amongst known
malware and inferring models of what discriminates malware from goodware, have long been
summoned to build malware detectors.
The vast quantity of data involved in malware detection, added to the fact that we do not know a
priori how to express in technical terms the difference between malware and goodware, indeed
makes the malware detection question a seemingly textbook example of a possible
Machine-Learning application.
Despite the vast amount of literature published on the topic of detecting malware with
machine-learning, malware detection is not a solved problem. In this thesis, we investigate issues that
affect performance evaluation and that thus may render current machine learning-based
malware detectors for Android hardly usable in practical settings, and we propose an approach to
overcome those issues. While the experiments presented in this thesis all rely on feature-sets
obtained through lightweight static analysis, several of our findings could apply equally to all
Machine Learning-based malware detection approaches.
In the first part of this thesis, background information on machine-learning and on malware
detection is provided, and the related work is described. A snapshot of the malware landscape
in Android application markets is then presented.
The second part discusses three pitfalls hindering the evaluation of malware detectors. We show
with extensive experiments how validation methodology, history-unaware dataset construction
and the choice of a ground truth can heavily interfere with the performance results of malware
detectors.
In the third part, we present a practical approach to detect Android Malware in real-world settings.
We then propose several research paths to get closer to our long-term goal of building practical,
dependable and predictable Android Malware detectors.
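The history-unaware pitfall named in the abstract above can be illustrated with a minimal sketch of a time-aware split, in which a detector is trained only on apps that predate the ones it is tested on. The `App` record, its field names and the cutoff date are hypothetical and chosen for illustration only; this is not the thesis's implementation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class App:
    """Hypothetical record for one application sample."""
    sha256: str        # illustrative identifier
    first_seen: date   # illustrative timestamp (e.g. market upload date)
    is_malware: bool   # label taken from some chosen ground truth


def history_aware_split(apps, cutoff):
    """Train on apps seen strictly before `cutoff`, test on the rest.

    A random split would let samples "from the future" leak into the
    training set, which is the history-unaware construction the
    abstract warns about.
    """
    train = [a for a in apps if a.first_seen < cutoff]
    test = [a for a in apps if a.first_seen >= cutoff]
    return train, test


# Toy data, invented for the example.
apps = [
    App("a1", date(2012, 3, 1), False),
    App("b2", date(2012, 9, 15), True),
    App("c3", date(2013, 1, 10), True),
    App("d4", date(2013, 6, 30), False),
]
train, test = history_aware_split(apps, cutoff=date(2013, 1, 1))
```

With this split, every training sample is older than every test sample, so the reported performance reflects detection of apps the model could not have seen.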
A First Look at Android Applications in Google Play related to Covid-19
Due to the convenience of access-on-demand to information and business
solutions, mobile apps have become an important asset in the digital world. In
the context of the Covid-19 pandemic, app developers have joined the response
effort in various ways by releasing apps that target different user bases
(e.g., all citizens or journalists), offer different services (e.g., location
tracking or diagnostic-aid), provide generic or specialized information, etc.
While many apps have raised some concerns by spreading misinformation or even
malware, the literature does not yet provide a clear landscape of the different
apps that were developed. In this study, we focus on the Android ecosystem and
investigate Covid-related Android apps. In a best-effort scenario, we attempt
to systematically identify all relevant apps and study their characteristics
with the objective to provide a first taxonomy of Covid-related apps,
broadening the relevance beyond the implementation of contact tracing. Overall,
our study yields a number of empirical insights that broaden the
knowledge on Covid-related apps: (1) Developer communities contributed rapidly
to the Covid-19 response, with dedicated apps released as early as January 2020; (2)
Covid-related apps deliver digital tools to users (e.g., health diaries), serve
to broadcast information to users (e.g., spread statistics), and collect data
from users (e.g., for tracing); (3) Covid-related apps are less complex than
standard apps; (4) they generally do not seem to leak sensitive data; (5) in
the majority of cases, Covid-related apps are released by entities with past
experience on the market, mostly official government entities or public health
organizations.
Comment: Accepted in Empirical Software Engineering under reference: EMSE-D-20-00211R
LaFiCMIL: Rethinking Large File Classification from the Perspective of Correlated Multiple Instance Learning
Transformer-based models, such as BERT, have revolutionized various language
tasks, but still struggle with large file classification due to their input
limit (e.g., 512 tokens). Despite several attempts to alleviate this
limitation, no method consistently excels across all benchmark datasets,
primarily because they can only extract partial essential information from the
input file. Additionally, they fail to adapt to the varied properties of
different types of large files. In this work, we tackle this problem from the
perspective of correlated multiple instance learning. The proposed approach,
LaFiCMIL, serves as a versatile framework applicable to various large file
classification tasks, covering binary, multi-class, and multi-label
classification and spanning domains including Natural Language
Processing, Programming Language Processing, and Android Analysis. To evaluate
its effectiveness, we employ eight benchmark datasets pertaining to Long
Document Classification, Code Defect Detection, and Android Malware Detection.
Leveraging BERT-family models as feature extractors, our experimental results
demonstrate that LaFiCMIL achieves new state-of-the-art performance across all
benchmark datasets. This is largely attributable to its capability of scaling
BERT up to nearly 20K tokens, running on a single Tesla V-100 GPU with 32 GB of
memory.
Comment: 12 pages; update results; manuscript revisio
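To make the multiple-instance view of a large file concrete, here is a minimal sketch that cuts a long token sequence into fixed-size chunks, each of which could be fed to a BERT-style encoder as one instance of a bag. The function name and the plain fixed-size chunking are assumptions for illustration; the abstract does not say how LaFiCMIL builds its instances, and real BERT inputs also reserve positions for special tokens, which this sketch ignores.

```python
def split_into_instances(token_ids, chunk_size=512):
    """Split a token sequence into consecutive chunks of at most `chunk_size`.

    Each chunk plays the role of one instance; the list of chunks is the
    bag that represents the whole file.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        token_ids[start:start + chunk_size]
        for start in range(0, len(token_ids), chunk_size)
    ]


# Toy "file" of 1200 token ids, invented for the example.
tokens = list(range(1200))
bag = split_into_instances(tokens)
```

Here the 1200 tokens produce three instances: two full chunks of 512 tokens and a final chunk of 176.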
A Pre-Trained BERT Model for Android Applications
The automation of an increasingly large number of software engineering tasks
is becoming possible thanks to Machine Learning (ML). One foundational building
block in the application of ML to software artifacts is the representation of
these artifacts (e.g., source code or executable code) into a form that is
suitable for learning. Many studies have leveraged representation learning,
delegating to ML itself the job of automatically devising suitable
representations. Yet, in the context of Android problems, existing models are
either limited to coarse-grained whole-app level (e.g., apk2vec) or conducted
for one specific downstream task (e.g., smali2vec). Our work is part of a new
line of research that investigates effective, task-agnostic, and fine-grained
universal representations of bytecode to mitigate both of these
limitations. Such representations aim to capture information relevant to
various low-level downstream tasks (e.g., at the class-level). We are inspired
by the field of Natural Language Processing, where the problem of universal
representation was addressed by building Universal Language Models, such as
BERT, whose goal is to capture abstract semantic information about sentences,
in a way that is reusable for a variety of tasks. We propose DexBERT, a
BERT-like Language Model dedicated to representing chunks of DEX bytecode, the
main binary format used in Android applications. We empirically assess whether
DexBERT is able to model the DEX language and evaluate the suitability of our
model in two distinct class-level software engineering tasks: Malicious Code
Localization and Defect Prediction. We also experiment with strategies to deal
with the problem of catering to apps having vastly different sizes, and we
demonstrate one example of using our technique to investigate what information
is relevant to a given task.
Lessons Learnt on Reproducibility in Machine Learning Based Android Malware Detection
A well-known curse of computer security research is that it often produces systems that, while technically sound, fail operationally. To overcome this curse, the community generally seeks to assess proposed systems under a variety of settings in order to make explicit every potential bias. In this respect, research achievements on machine learning based malware detection have recently been considered for thorough evaluation by the community. Such an effort of comprehensive evaluation supposes first and foremost the possibility to perform an independent reproduction study in order to sharpen evaluations presented by approaches' authors. The question "Can published approaches actually be reproduced?" thus becomes paramount despite the little interest such mundane and practical aspects seem to attract in the malware detection field. In this paper, we attempt a complete reproduction of five Android Malware Detectors from the literature and discuss to what extent they are "reproducible". Notably, we provide insights on the implications around the guesswork that may be required to finalise a working implementation. Finally, we discuss how barriers to reproduction could be lifted, and how the malware detection field would benefit from stronger reproducibility standards, as various other fields already have.
Android Malware Detection Using BERT
In this paper, we propose two empirical studies to (1) detect
Android malware and (2) classify Android malware into families. We
first (1) reproduce the results of MalBERT using BERT models trained
on Android application manifests obtained from 265k applications
(vs. 22k for MalBERT) from the AndroZoo dataset in order to detect
malware. The results of the MalBERT paper are excellent and hard to
believe, as a manifest only roughly represents an application. We therefore
try to answer the following questions in this paper. Are the experiments
from MalBERT reproducible? How important are permissions for
malware detection? Is it possible to keep or improve the results by reducing
the size of the manifests? We then (2) investigate if BERT can be used to
classify Android malware into families. The results show that BERT can
successfully differentiate malware/goodware with 97% accuracy.
Furthermore, BERT can classify malware families with 93% accuracy. We also
demonstrate that Android permissions are not what allows BERT to
classify successfully, and that it does not even need them.
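One way to picture the question about permissions is an ablation in which permission declarations are removed from a manifest before it is given to the model. The sketch below does this with Python's standard `xml.etree.ElementTree`; the function name, the set of tags removed and the toy manifest are assumptions for illustration and are not taken from the paper.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"
# Keep the conventional "android:" prefix when serialising.
ET.register_namespace("android", ANDROID_NS)

# Assumed set of top-level manifest elements that declare permissions.
PERMISSION_TAGS = {"uses-permission", "permission"}


def strip_permissions(manifest_xml):
    """Return the manifest text with top-level permission elements removed."""
    root = ET.fromstring(manifest_xml)
    for child in list(root):  # copy the list: we mutate root while iterating
        if child.tag in PERMISSION_TAGS:
            root.remove(child)
    return ET.tostring(root, encoding="unicode")


# Toy manifest, invented for the example.
manifest = (
    '<manifest xmlns:android="http://schemas.android.com/apk/res/android" '
    'package="com.example.app">'
    '<uses-permission android:name="android.permission.INTERNET"/>'
    '<application android:label="Demo"/>'
    '</manifest>'
)
reduced = strip_permissions(manifest)
```

Training and evaluating on such reduced manifests, and comparing against the full ones, is one plausible way to measure how much the permissions contribute.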
Improving privacy on android smartphones through in-vivo bytecode instrumentation
In this paper we claim that a widely applicable and efficient means to fight against malicious mobile Android applications is: 1) to perform runtime monitoring, 2) by instrumenting the application bytecode, and 3) in-vivo, i.e. directly on the smartphone. We present a tool chain to do this and present experimental results showing that this tool chain can run on smartphones in a reasonable amount of time and with a realistic effort. Our findings also identify challenges to be addressed before running powerful runtime monitoring and instrumentations directly on smartphones. We implemented two use-cases leveraging the tool chain: FineGPolicy, a fine-grained user-centric permission policy system, and AdRemover, an advertisement remover. Both prototypes improve the privacy of Android systems thanks to in-vivo bytecode instrumentation.