Evaluation Methodologies in Software Protection Research
Man-at-the-end (MATE) attackers have full control over the system on which
the attacked software runs, and try to break the confidentiality or integrity
of assets embedded in the software. Both companies and malware authors want to
prevent such attacks. This has driven an arms race between attackers and
defenders, resulting in a plethora of different protection and analysis
methods. However, it remains difficult to measure the strength of protections
because MATE attackers can reach their goals in many different ways and a
universally accepted evaluation methodology does not exist. This survey
systematically reviews the evaluation methodologies of papers on obfuscation, a
major class of protections against MATE attacks. For 572 papers, we collected
113 aspects of their evaluation methodologies, ranging from sample set types
and sizes, over sample treatment, to performed measurements. We provide
detailed insights into how the academic state of the art evaluates both the
protections and analyses thereon. In summary, there is a clear need for better
evaluation methodologies. We identify nine challenges for software protection
evaluations, which represent threats to the validity, reproducibility, and
interpretation of research results in the context of MATE attacks.
Towards Fine-Grained Localization of Privacy Behaviors
Mobile applications are required to give privacy notices to users when they
collect or share personal information. Creating consistent and concise privacy
notices can be a challenging task for developers. Previous work has attempted
to help developers create privacy notices through a questionnaire or predefined
templates. In this paper, we propose a novel approach and a framework, called
PriGen, that extends this prior work. PriGen uses static analysis to identify
Android applications' code segments that process sensitive information (i.e.
permission-requiring code segments) and then leverages a Neural Machine
Translation model to translate them into privacy captions. We present the
initial evaluation of our translation task for ~300,000 code segments.
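The static-analysis step described in the PriGen abstract above (finding permission-requiring code segments) can be pictured with a minimal sketch. This is not PriGen's implementation: the `API_PERMISSIONS` map, the `find_permission_segments` helper, and the sample method bodies are all hypothetical, and a real analysis would work on a call graph rather than on text matching.

```python
import re

# Hypothetical, deliberately tiny and simplified API-to-permission map
# (illustration only; not taken from the paper).
API_PERMISSIONS = {
    "getLastKnownLocation": "ACCESS_FINE_LOCATION",
    "getDeviceId": "READ_PHONE_STATE",
}


def find_permission_segments(methods):
    """Return (method_name, permission) pairs for methods that call a mapped API."""
    hits = []
    for name, body in methods.items():
        for api, permission in API_PERMISSIONS.items():
            # Match the API name only when it is followed by a call parenthesis.
            if re.search(r"\b" + re.escape(api) + r"\s*\(", body):
                hits.append((name, permission))
    return hits


# Made-up method bodies standing in for decompiled app code.
methods = {
    "showWeather": "Location loc = lm.getLastKnownLocation(provider);",
    "formatDate": "return fmt.format(new Date());",
}
segments = find_permission_segments(methods)
print(segments)
```

In this sketch, only the flagged methods would be passed on to the translation model that produces privacy captions.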
Graph Mining for Cybersecurity: A Survey
The explosive growth of cyber attacks, such as malware, spam, and
intrusions, has caused severe consequences for society. Securing cyberspace has
become an utmost concern for organizations and governments. Traditional Machine
Learning (ML) based methods are extensively used in detecting cyber threats,
but they hardly model the correlations between real-world cyber entities. In
recent years, with the proliferation of graph mining techniques, many
researchers investigated these techniques for capturing correlations between
cyber entities and achieving high performance. It is imperative to summarize
existing graph-based cybersecurity solutions to provide a guide for future
studies. Therefore, as a key contribution of this paper, we provide a
comprehensive review of graph mining for cybersecurity, including an overview
of cybersecurity tasks, the typical graph mining techniques, and the general
process of applying them to cybersecurity, as well as various solutions for
different cybersecurity tasks. For each task, we probe into relevant methods
and highlight the graph types, graph approaches, and task levels in their
modeling. Furthermore, we collect open datasets and toolkits for graph-based
cybersecurity. Finally, we outline potential directions of this field for
future research.
DexBERT: Effective, Task-Agnostic and Fine-Grained Representation Learning of Android Bytecode
Peer reviewed
The automation of an increasingly large number of software engineering tasks is becoming possible thanks to Machine Learning (ML). One foundational building block in the application of ML to software artifacts is the representation of these artifacts (e.g., source code or executable code) into a form that is suitable for learning. Traditionally, researchers and practitioners have relied on manually selected features, based on expert knowledge, for the task at hand. Such knowledge is sometimes imprecise and generally incomplete. To overcome this limitation, many studies have leveraged representation learning, delegating to ML itself the job of automatically devising suitable representations and selections of the most relevant features. Yet, in the context of Android problems, existing models are either limited to coarse-grained whole-app level (e.g., apk2vec) or conducted for one specific downstream task (e.g., smali2vec). Thus, the produced representation may turn out to be unsuitable for fine-grained tasks or cannot generalize beyond the task that they have been trained on. Our work is part of a new line of research that investigates effective, task-agnostic, and fine-grained universal representations of bytecode to mitigate both of these two limitations. Such representations aim to capture information relevant to various low-level downstream tasks (e.g., at the class-level). We are inspired by the field of Natural Language Processing, where the problem of universal representation was addressed by building Universal Language Models, such as BERT, whose goal is to capture abstract semantic information about sentences, in a way that is reusable for a variety of tasks. We propose DexBERT, a BERT-like Language Model dedicated to representing chunks of DEX bytecode, the main binary format used in Android applications.
We empirically assess whether DexBERT is able to model the DEX language and evaluate the suitability of our model in three distinct class-level software engineering tasks: Malicious Code Localization, Defect Prediction, and Component Type Classification. We also experiment with strategies to deal with the problem of catering to apps having vastly different sizes, and we demonstrate one example of using our technique to investigate what information is relevant to a given task.
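The abstract above mentions representing chunks of DEX bytecode and coping with inputs of very different sizes, but it does not say how. One plausible way to picture this, offered purely as an assumption, is to split a class's instruction tokens into fixed-size chunks, embed each chunk, and average the chunk vectors. In the sketch below, `toy_embed` is a hypothetical stand-in for a learned model, and the chunk size is arbitrary.

```python
def split_into_chunks(tokens, chunk_size):
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def mean_pool(vectors):
    """Average equally sized vectors element-wise into one vector."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def toy_embed(chunk):
    """Hypothetical placeholder for a learned encoder:
    returns [chunk length, number of invoke-* instructions]."""
    invokes = sum(token.startswith("invoke") for token in chunk)
    return [float(len(chunk)), float(invokes)]


# A made-up sequence of Smali-style opcodes for one class.
tokens = ["const/4", "invoke-virtual", "move-result", "if-eqz",
          "invoke-static", "return-void", "nop"]
chunks = split_into_chunks(tokens, 3)
class_vector = mean_pool([toy_embed(chunk) for chunk in chunks])
print(len(chunks), class_vector)
```

Averaging keeps the class-level vector the same length no matter how many chunks a class produces, which is the property a downstream classifier needs.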
Code clone detection in obfuscated Android apps
The Android operating system has long been one of the main global smartphone operating systems. Both developers and malware authors often reuse code to expedite the process of creating new apps and malware samples. Code cloning is the most common way of reusing code in the process of developing Android apps. Finding code clones through the analysis of Android binary code is a challenging task that becomes more difficult when instances of code reuse are non-contiguous, reordered, or intertwined with other code. We introduce an approach for detecting cloned methods as well as small and non-contiguous code clones in obfuscated Android applications by simulating the execution of Android apps and then analyzing the resulting execution traces. We first validate our approach’s ability to find different types of code clones using 20 injected clones. Next we validate the resistance of our approach against obfuscation by comparing its results on a set of 1085 apps before and after code obfuscation. We obtain 78-87% similarity between the findings from non-obfuscated applications and those from four sets of obfuscated applications. We also investigated the presence of code clones among 1603 Android applications. We found 44,776 code clones, of which 34% occurred across different applications and the rest across different versions of the same application. We also performed a comparative analysis between the clones found by our approach and the clones detected by Nicad on the source code of applications. Finally, we show a practical application of our approach for detecting variants of Android banking malware. Among 60,057 code clone clusters found in a dataset of banking malware, 92.9% were unique to one malware family or to benign applications.
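The abstract above reports 78-87% similarity between findings before and after obfuscation without naming the metric. As an illustration only, one common way to compare two sets of detected clone pairs is the Jaccard index. The sketch below assumes that metric, and the clone pairs are invented.

```python
def jaccard_similarity(before, after):
    """Jaccard index of two collections: |intersection| / |union|."""
    before, after = set(before), set(after)
    union = before | after
    if not union:
        return 1.0  # two empty result sets are treated as identical
    return len(before & after) / len(union)


# Invented clone pairs (method A, method B) reported on the original apps
# and on their obfuscated counterparts.
original = {("A.foo", "B.bar"), ("A.foo", "C.baz"),
            ("D.run", "E.run"), ("F.x", "G.y")}
obfuscated = {("A.foo", "B.bar"), ("A.foo", "C.baz"),
              ("D.run", "E.run"), ("H.p", "I.q")}

score = jaccard_similarity(original, obfuscated)
print(f"similarity: {score:.0%}")
```

This assumes method names can be mapped back to their pre-obfuscation identities. Without such a mapping, renamed methods would be counted as mismatches.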
A Software Vulnerabilities Odysseus: Analysis, Detection, and Mitigation
Programming has become central to the development of human activities, yet it is not
immune to defects, or bugs. Developers have devised specific methods and
sequences of tests that they implement to prevent these bugs from being deployed in
releases. Nonetheless, not all cases can be thought through beforehand, and automation
has limits that the community is attempting to overcome. As a consequence, not all bugs
can be caught.
These defects cause particular concern when they can be exploited to
breach the program’s security policy. They are then called vulnerabilities and provide
specific actors with undesired access to the resources a program manages. This damages
the trust in the program and in its developers, and may eventually impact the adoption
of the program. Hence, paying specific attention to vulnerabilities is a
natural outcome. In this regard, this PhD work targets the following three challenges:
(1) The research community references those vulnerabilities, categorises them, reports
and ranks their impact. As a result, analysts can learn from past vulnerabilities in
specific programs and figure out new ideas to counter them. Nonetheless, the resulting
quality of the lessons and the usefulness of ensuing solutions depend on the quality and
the consistency of the information provided in the reports.
(2) New methods to detect vulnerabilities can emerge from the lessons this
monitoring provides. With responsible reporting, these detection methods can help
harden the programs we rely on. Additionally, as computing performance
increases, machine learning algorithms are increasingly adopted, offering engaging
promises.
(3) While some of these promises can be fulfilled, not all are reachable today.
Therefore a complementary strategy needs to be adopted while vulnerabilities evade
detection up to public releases. Instead of preventing their introduction, programs can
be hardened to scale down their exploitability. Increasing the complexity to exploit
or lowering the impact below specific thresholds makes the presence of vulnerabilities
an affordable risk for the feature provided. The history of programming development
includes the experimentation with and adoption of so-called defence mechanisms. Their
goals and performances can be diverse, but their implementation in worldwide adopted
programs and systems (such as the Android Open Source Project) acknowledges their
pivotal position.
To face these challenges, we provide the following contributions:
• We provide a manual categorisation of the vulnerabilities of the worldwide adopted
Android Open Source Project up to June 2020. Adopting a clearly defined
vulnerability analysis provides consistency in the resulting data set. It facilitates the
explainability of the analyses and sets up for the updatability of the resulting
set of vulnerabilities. Based on this analysis, we study the evolution of AOSP’s
vulnerabilities. We explore how the vulnerabilities affecting the system evolve over
time with respect to their severity and type, and we provide a
focus on memory corruption-related vulnerabilities.
• We undertake the replication of a machine-learning-based detection algorithm
that, despite being part of the state of the art and referenced by subsequent works,
was not available. Named VCCFinder, this algorithm implements a Support-
Vector Machine and bases its training on Vulnerability-Contributing Commits
and related patches for C and C++ code. Unable to match the performance
reported in the original article, we explore parameters and algorithms, and
attempt to overcome the challenge posed by the over-population of unlabeled
entries in the data set. We provide the community with our code and results as a
replicable baseline for further improvement.
• Finally, we list the defence mechanisms that the Android Open Source Project
incrementally implements, and we discuss how they sometimes answer comments
the community addressed to the project’s developers. We further verify the extent
to which specific memory corruption defence mechanisms were implemented in the
binaries of different versions of Android (from API-level 10 to 28). Lastly, we
compare the evolution of memory corruption-related vulnerabilities with the
implementation timeline of related defence mechanisms.
A Pre-Trained BERT Model for Android Applications
The automation of an increasingly large number of software engineering tasks
is becoming possible thanks to Machine Learning (ML). One foundational building
block in the application of ML to software artifacts is the representation of
these artifacts (e.g., source code or executable code) into a form that is
suitable for learning. Many studies have leveraged representation learning,
delegating to ML itself the job of automatically devising suitable
representations. Yet, in the context of Android problems, existing models are
either limited to coarse-grained whole-app level (e.g., apk2vec) or conducted
for one specific downstream task (e.g., smali2vec). Our work is part of a new
line of research that investigates effective, task-agnostic, and fine-grained
universal representations of bytecode to mitigate both of these two
limitations. Such representations aim to capture information relevant to
various low-level downstream tasks (e.g., at the class-level). We are inspired
by the field of Natural Language Processing, where the problem of universal
representation was addressed by building Universal Language Models, such as
BERT, whose goal is to capture abstract semantic information about sentences,
in a way that is reusable for a variety of tasks. We propose DexBERT, a
BERT-like Language Model dedicated to representing chunks of DEX bytecode, the
main binary format used in Android applications. We empirically assess whether
DexBERT is able to model the DEX language and evaluate the suitability of our
model in two distinct class-level software engineering tasks: Malicious Code
Localization and Defect Prediction. We also experiment with strategies to deal
with the problem of catering to apps having vastly different sizes, and we
demonstrate one example of using our technique to investigate what information
is relevant to a given task.