17 research outputs found
Looking for Criminal Intents in JavaScript Obfuscated Code
The majority of websites incorporate JavaScript for client-side execution in a supposedly protected environment. Unfortunately, JavaScript has also proven to be a critical attack vector for both independent and state-sponsored groups of hackers. On the one hand, defenders need to analyze scripts to ensure that no threat is delivered and to respond to potential security incidents. On the other, attackers aim to obfuscate the source code in order to disorient the defenders or even to make code analysis practically impossible. Since code obfuscation may also be adopted by companies for legitimate intellectual-property protection, a dilemma remains on whether a script is harmless or malignant, if not criminal. To help analysts deal with such a dilemma, a methodology is proposed, called JACOB, which is based on five steps, namely: (1) source code parsing, (2) control flow graph recovery, (3) region identification, (4) code structuring, and (5) partial evaluation. These steps implement a sort of decompilation for control flow flattened code, which is progressively transformed into something that is close to the original JavaScript source, thereby making eventual code analysis possible. Most relevantly, JACOB has been successfully applied to uncover unwanted user tracking and fingerprinting in e-commerce websites operated by a well-known Chinese company
Evaluation Methodologies in Software Protection Research
Man-at-the-end (MATE) attackers have full control over the system on which
the attacked software runs, and try to break the confidentiality or integrity
of assets embedded in the software. Both companies and malware authors want to
prevent such attacks. This has driven an arms race between attackers and
defenders, resulting in a plethora of different protection and analysis
methods. However, it remains difficult to measure the strength of protections
because MATE attackers can reach their goals in many different ways and a
universally accepted evaluation methodology does not exist. This survey
systematically reviews the evaluation methodologies of papers on obfuscation, a
major class of protections against MATE attacks. For 572 papers, we collected
113 aspects of their evaluation methodologies, ranging from sample set types
and sizes, over sample treatment, to performed measurements. We provide
detailed insights into how the academic state of the art evaluates both the
protections and analyses thereon. In summary, there is a clear need for better
evaluation methodologies. We identify nine challenges for software protection
evaluations, which represent threats to the validity, reproducibility, and
interpretation of research results in the context of MATE attacks
Code clone detection in obfuscated Android apps
The Android operating system has long become one of the main global smartphone operating systems. Both developers and malware authors often reuse code to expedite the process of creating new apps and malware samples. Code cloning is the most common way of reusing code in the process of developing Android apps. Finding code clones through the analysis of Android binary code is a challenging task that becomes more sophisticated when instances of code reuse are non-contiguous, reordered, or intertwined with other code. We introduce an approach for detecting cloned methods as well as small and non-contiguous code clones in obfuscated Android applications by simulating the execution of Android apps and then analyzing the subsequent execution traces. We first validate our approachâs ability on finding different types of code clones on 20 injected clones. Next we validate the resistance of our approach against obfuscation by comparing its results on a set of 1085 apps before and after code obfuscation. We obtain 78-87% similarity between the finding from non-obfuscated applications and four sets of obfuscated applications. We also investigated the presence of code clones among 1603 Android applications. We were able to find 44,776 code clones where 34% of code clones were seen from different applications and the rest are among different versions of an application. We also performed a comparative analysis between the clones found by our approach and the clones detected by Nicad on the source code of applications. Finally, we show a practical application of our approach for detecting variants of Android banking malware. Among 60,057 code clone clusters that are found among a dataset of banking malware, 92.9% of them were unique to one malware family or benign applications
Understanding the behaviour of hackers while performing attack tasks in a professional setting and in a public challenge
When critical assets or functionalities are included in a piece of software accessible to the end users, code protections are used to hinder or delay the extraction or manipulation of such critical assets. The process and strategy followed by hackers to understand and tamper with protected software might differ from program understanding for benign purposes. Knowledge of the actual hacker behaviours while performing real attack tasks can inform better ways to protect the software and can provide more realistic assumptions to the developers, evaluators, and users of software protections. Within Aspire, a software protection research project funded by the EU under framework programme FP7, we have conducted three industrial case studies with the involvement of professional penetration testers and a public challenge consisting of eight attack tasks with open participation. We have applied a systematic qualitative analysis methodology to the hackersâ reports relative to the industrial case studies and the public challenge. The qualitative analysis resulted in 459 and 265 annotations added respectively to the industrial and to the public challenge reports. Based on these annotations we built a taxonomy consisting of 169 concepts. They address the hacker activities related to (i) understanding code; (ii) defining the attack strategy; (iii) selecting and customizing the tools; and (iv) defeating the protections. While there are many commonalities between professional hackers and practitioners, we could spot many fundamental differences. For instance, while industrial professional hackers aim at elaborating automated and reproducible deterministic attacks, practitioners prefer to minimize the effort and try many different manual tasks. This analysis allowed us to distill a number of new research directions and potential improvements for protection techniques. In particular, considering the critical role of analysis tools, protection techniques should explicitly attack them, by exploiting analysis problems and complexity aspects that available automated techniques are bad at addressing
Aprimorando a segurança do Android atravĂ©s de detecção de malware e geração automĂĄtica de polĂticas
Orientadores: Paulo LĂcio de Geus, AndrĂ© Ricardo Abed GrĂ©gioTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Dispositivos mĂłveis tĂȘm evoluĂdo constantemente, recebendo novas funcionalidades e se tornando cada vez mais ubĂquos. Assim, eles se tornaram alvos lucrativos para criminosos. Como Android Ă© a plataforma lĂder em dispositivos mĂłveis, ele se tornou o alvo principal de desenvolvedores de malware. AlĂ©m disso, a quantidade de apps maliciosas encontradas por empresas de segurança que tĂȘm esse sistema operacional como alvo cresceu rapidamente nos Ășltimos anos. Esta tese aborda o problema da segurança de tais dispositivos por dois lados: (i) analisando e identificando apps maliciosas e (ii) desenvolvendo uma polĂtica de segurança que pode restringir a superfĂcie de ataque disponĂvel para cĂłdigo nativo. Para tanto, foi desenvolvido um sistema para analisar apps dinamicamente, monitorando chamadas de API e chamadas de sistema. Destes traços de comportamento extraiu-se atributos, que sĂŁo utilizados por um algoritmo de aprendizado de mĂĄquina para classificar apps como maliciosas ou benignas. Um dos problemas principais de sistemas de anĂĄlise dinĂąmica Ă© que eles possuem muitas diferenças em relação a dispositivos reais, e exemplares de malware podem usar essas caracterĂsticas para identificar se estĂŁo sendo analisados, impedindo assim que as açÔes maliciosas sejam observadas. Para identificar apps maliciosas de Android que evadem anĂĄlises, desenvolveu-se uma tĂ©cnica que compara o comportamento de uma app em um dispositivo real e em um emulador. Identificou-se as açÔes que foram executadas apenas no sistema real e se a divergĂȘncia foi causada por caminhos de cĂłdigo diferentes serem explorados ou por algum erro nĂŁo relacionado. Por fim, realizou-se uma anĂĄlise em larga escala de apps que utilizam cĂłdigo nativo, a fim de se identificar como este Ă© usado por apps legĂtimas e tambĂ©m para se criar uma polĂtica de segurança que restrinja as açÔes de malware que usam este tipo de cĂłdigoAbstract: Mobile devices have been constantly evolving, receiving new functionalities and becoming increasingly ubiquitous. Thus, they became lucrative targets for miscreants. Since Android is the leading platform for mobile devices, it became the most popular choice for malware developers. Moreover, the amount of malicious apps, found by security companies, that target this platform rapidly increased in the last few years. This thesis approaches the security problem of such devices in two ways: (i) by analyzing and identifying malicious apps, and (ii) by developing a sandboxing policy that can restrict the attack surface available to native code. A system was developed to dynamically analyze apps, monitoring API calls and system calls. From these behavior traces attributes were extracted, which are used by a machine learning algorithm to classify apps as malicious or benign. One of the main problems of dynamic analysis systems is that they have many differences compared to real devices, and malware can leverage these characteristics to identify whether they are being analyzed or not, thus being able to prevent the malicious actions from being observed. To identify Android malware that evades analyses, a technique was developed to compare the behavior of an app on a real device and on an emulator. Actions that were only executed in the bare metal system were identified, recognizing whether the divergence was caused by different code paths being explored or by some unrelated error. Finally, a large-scale analysis of apps that use native code was performed, in order to identify how native code is used by benign apps and also to generate a sandboxing policy to restrict malware that use such codeDoutoradoCiĂȘncia da ComputaçãoDoutor em CiĂȘncia da Computação23038.007604/2014-69, 12269/13-1CAPE
Effiziente und erklÀrbare Erkennung von mobiler Schadsoftware mittels maschineller Lernmethoden
In recent years, mobile devices shipped with Googleâs Android operating system
have become ubiquitous. Due to their popularity and the high concentration of
sensitive user data on these devices, however, they have also become a
profitable target of malware authors. As a result, thousands of new malware
instances targeting Android are found almost every day. Unfortunately, common
signature-based methods often fail to detect these applications, as these
methods can- not keep pace with the rapid development of new malware.
Consequently, there is an urgent need for new malware detection methods to
tackle this growing threat.
In this thesis, we address the problem by combining concepts of static analysis
and machine learning, such that mobile malware can be detected directly on the
mobile device with low run-time overhead. To this end, we first discuss our
analysis results of a sophisticated malware that uses an ultrasonic side
channel to spy on unwitting smartphone users. Based on the insights we gain
throughout this thesis, we gradually develop a method that allows detecting
Android malware in general. The resulting method performs a broad static
analysis, gathering a large number of features associated with an application.
These features are embedded in a joint vector space, where typical patterns
indicative of malware can be automatically identified and used for explaining
the decisions of our method. In addition to an evaluation of its overall
detection and run-time performance, we also examine the interpretability of the
underlying detection model and strengthen the classifier against realistic
evasion attacks.
In a large set of experiments, we show that the method clearly outperforms
several related approaches, including popular anti-virus scanners. In most
experiments, our approach detects more than 90% of all malicious samples in the
dataset at a low false positive rate of only 1%. Furthermore, even on older
devices, it offers a good run-time performance, and can output a decision along
with a proper explanation within a few seconds, despite the use of machine
learning techniques directly on the mobile device.
Overall, we find that the application of machine learning techniques is a
promising research direction to improve the security of mobile devices. While
these techniques alone cannot defeat the threat of mobile malware, they at
least raise the bar for malicious actors significantly, especially if combined
with existing techniques.Die Verbreitung von Smartphones, insbesondere mit dem Android-Betriebssystem,
hat in den vergangenen Jahren stark zugenommen. Aufgrund ihrer hohen
PopularitÀt haben sich diese GerÀte jedoch zugleich auch zu einem lukrativen
Ziel fĂŒr Entwickler von Schadsoftware entwickelt, weshalb mittlerweile tĂ€glich
neue Schadprogramme fĂŒr Android gefunden werden.
Obwohl verschiedene Lösungen existieren, die Schadprogramme auch auf mobilen
EndgerÀten identifizieren sollen, bieten diese in der Praxis hÀufig keinen
ausreichenden Schutz. Dies liegt vor allem daran, dass diese Verfahren zumeist
signaturbasiert arbeiten und somit schÀdliche Programme erst zuverlÀssig
identifizieren können, sobald entsprechende Erkennungssignaturen vorhanden
sind. Jedoch wird es fĂŒr Antiviren-Hersteller immer schwieriger, die zur
Erkennung notwendigen Signaturen rechtzeitig bereitzustellen. Daher ist die
Entwicklung von neuen Verfahren nötig, um der wachsenden Bedrohung durch mobile
Schadsoftware besser begegnen zu können.
In dieser Dissertation wird ein Verfahren vorgestellt und eingehend untersucht,
das Techniken der statischen Code-Analyse mit Methoden des maschinellen Lernens
kombiniert, um so eine zuverlÀssige Erkennung von mobiler Schadsoftware direkt
auf dem MobilgerĂ€t zu ermöglichen. Die Methode analysiert hierfĂŒr mobile
Anwendungen zunÀchst statisch und extrahiert dabei spezielle Merkmale, die eine
Abbildung einer Applikation in einen hochdimensionalen Vektorraum ermöglichen.
In diesem Vektorraum sind schlieĂlich maschinelle Lernmethoden in der Lage,
automatisch Muster zur Erkennung von Schadprogrammen zu finden. Die gefundenen
Muster können dabei nicht nur zur Erkennung, sondern darĂŒber hinaus auch zur
ErklÀrung einer getroffenenen Entscheidung dienen.
Im Rahmen einer ausfĂŒhrlichen Evaluation wird nicht nur die Erkennungsleistung
und die Laufzeit der vorgestellten Methode untersucht, sondern darĂŒber hinaus
das gelernte Erkennungsmodell im Detail analysiert. Hierbei wird auch die
Robustheit des Modells gegenĂŒber gezielten Angriffe untersucht und verbessert.
In einer Reihe von Experimenten kann gezeigt werden, dass mit dem
vorgeschlagenen Verfahren bessere Ergebnisse erzielt werden können als mit
vergleichbaren Methoden, sogar einschlieĂlich einiger populĂ€rer
Antivirenprogramme. In den meisten Experimenten kann die Methode Schadprogramme
zuverlĂ€ssig erkennen und erreicht Erkennungsraten von ĂŒber 90% bei einer
geringen Falsch-Positiv-Rate von 1%