286 research outputs found

    Graph-aided directed testing of Android applications for checking runtime privacy behaviours

    Get PDF
    National Research Foundation (NRF) Singapore under Interactive Digital Media Strategic Research Programm

    On Robust Malware Classifiers by Verifying Unwanted Behaviours

    Get PDF

    Man-machine partial program analysis for malware detection

    Get PDF
    With the meteoric rise in popularity of the Android platform, there is an urgent need to combat the accompanying proliferation of malware. Existing work addresses the area of consumer malware detection, but cannot detect novel, sophisticated, domain-specific malware that is targeted specifically at one aspect of an organization (eg. ground operations of the US Military). Adversaries can exploit domain knowledge to camoflauge malice within the legitimate behaviors of an app and behind a domain-specific trigger, rendering traditional approaches such as signature-matching, machine learning, and dynamic monitoring ineffective. Manual code inspections are also inadequate, scaling poorly and introducing human error. Yet, there is a dire need to detect this kind of malware before it causes catastrophic loss of life and property. This dissertation presents the Security Toolbox, our novel solution for this challenging new problem posed by DARPA\u27s Automated Program Analysis for Cybersecurity (APAC) program. We employ a human-in-the-loop approach to amplify the natural intelligence of our analysts. Our automation detects interesting program behaviors and exposes them in an analysis Dashboard, allowing the analyst to brainstorm flaw hypotheses and ask new questions, which in turn can be answered by our automated analysis primitives. The Security Toolbox is built on top of Atlas, a novel program analysis platform made by EnSoft. Atlas uses a graph-based mathematical abstraction of software to produce a unified property multigraph, exposes a powerful API for writing analyzers using graph traversals, and provides both automated and interactive capabilities to facilitate program comprehension. The Security Toolbox is also powered by FlowMiner, a novel solution to mine fine-grained, compact data flow summaries of Java libraries. FlowMiner allows the Security Toolbox to complete a scalable and accurate partial program analysis of an application without including all of the libraries that it uses (eg. Android). This dissertation presents the Security Toolbox, Atlas, and FlowMiner. We provide empirical evidence of the effectiveness of the Security Toolbox for detecting novel, sophisticated, domain-specific Android malware, demonstrating that our approach outperforms other cutting-edge research tools and state-of-the-art commercial programs in both time and accuracy metrics. We also evaluate the effectiveness of Atlas as a program analysis platform and FlowMiner as a library summary tool

    Automated Testing of Android Apps: A Systematic Literature Review

    Get PDF
    Automated testing of Android apps is essential for app users, app developers and market maintainer communities alike. Given the widespread adoption of Android and the specificities of its development model, the literature has proposed various testing approaches for ensuring that not only functional requirements but also non-functional requirements are satisfied. In this paper, we aim at providing a clear overview of the state-of-the-art works around the topic of Android app testing, in an attempt to highlight the main trends, pinpoint the main methodologies applied and enumerate the challenges faced by the Android testing approaches as well as the directions where the community effort is still needed. To this end, we conduct a Systematic Literature Review (SLR) during which we eventually identified 103 relevant research papers published in leading conferences and journals until 2016. Our thorough examination of the relevant literature has led to several findings and highlighted the challenges that Android testing researchers should strive to address in the future. After that, we further propose a few concrete research directions where testing approaches are needed to solve recurrent issues in app updates, continuous increases of app sizes, as well as the Android ecosystem fragmentation

    Efficiently Manifesting Asynchronous Programming Errors in Android Apps

    Full text link
    Android, the #1 mobile app framework, enforces the single-GUI-thread model, in which a single UI thread manages GUI rendering and event dispatching. Due to this model, it is vital to avoid blocking the UI thread for responsiveness. One common practice is to offload long-running tasks into async threads. To achieve this, Android provides various async programming constructs, and leaves developers themselves to obey the rules implied by the model. However, as our study reveals, more than 25% apps violate these rules and introduce hard-to-detect, fail-stop errors, which we term as aysnc programming errors (APEs). To this end, this paper introduces APEChecker, a technique to automatically and efficiently manifest APEs. The key idea is to characterize APEs as specific fault patterns, and synergistically combine static analysis and dynamic UI exploration to detect and verify such errors. Among the 40 real-world Android apps, APEChecker unveils and processes 61 APEs, of which 51 are confirmed (83.6% hit rate). Specifically, APEChecker detects 3X more APEs than the state-of-art testing tools (Monkey, Sapienz and Stoat), and reduces testing time from half an hour to a few minutes. On a specific type of APEs, APEChecker confirms 5X more errors than the data race detection tool, EventRacer, with very few false alarms

    Program Similarity Analysis for Malware Classification and its Pitfalls

    Get PDF
    Malware classification, specifically the task of grouping malware samples into families according to their behaviour, is vital in order to understand the threat they pose and how to protect against them. Recognizing whether one program shares behaviors with another is a task that requires semantic reasoning, meaning that it needs to consider what a program actually does. This is a famously uncomputable problem, due to Rice\u2019s theorem. As there is no one-size-fits-all solution, determining program similarity in the context of malware classification requires different tools and methods depending on what is available to the malware defender. When the malware source code is readily available (or at least, easy to retrieve), most approaches employ semantic \u201cabstractions\u201d, which are computable approximations of the semantics of the program. We consider this the first scenario for this thesis: malware classification using semantic abstractions extracted from the source code in an open system. Structural features, such as the control flow graphs of programs, can be used to classify malware reasonably well. To demonstrate this, we build a tool for malware analysis, R.E.H.A. which targets the Android system and leverages its openness to extract a structural feature from the source code of malware samples. This tool is first successfully evaluated against a state of the art malware dataset and then on a newly collected dataset. We show that R.E.H.A. is able to classify the new samples into their respective families, often outperforming commercial antivirus software. However, abstractions have limitations by virtue of being approximations. We show that by increasing the granularity of the abstractions used to produce more fine-grained features, we can improve the accuracy of the results as in our second tool, StranDroid, which generates fewer false positives on the same datasets. The source code of malware samples is not often available or easily retrievable. For this reason, we introduce a second scenario in which the classification must be carried out with only the compiled binaries of malware samples on hand. Program similarity in this context cannot be done using semantic abstractions as before, since it is difficult to create meaningful abstractions from zeros and ones. Instead, by treating the compiled programs as raw data, we transform them into images and build upon common image classification algorithms using machine learning. This led us to develop novel deep learning models, a convolutional neural network and a long short-term memory, to classify the samples into their respective families. To overcome the usual obstacle of deep learning of lacking sufficiently large and balanced datasets, we utilize obfuscations as a data augmentation tool to generate semantically equivalent variants of existing samples and expand the dataset as needed. Finally, to lower the computational cost of the training process, we use transfer learning and show that a model trained on one dataset can be used to successfully classify samples in different malware datasets. The third scenario explored in this thesis assumes that even the binary itself cannot be accessed for analysis, but it can be executed, and the execution traces can then be used to extract semantic properties. However, dynamic analysis lacks the formal tools and frameworks that exist in static analysis to allow proving the effectiveness of obfuscations. For this reason, the focus shifts to building a novel formal framework that is able to assess the potency of obfuscations against dynamic analysis. We validate the new framework by using it to encode known analyses and obfuscations, and show how these obfuscations actually hinder the dynamic analysis process

    Runtime protection of software programs against control- and data-oriented attacks

    Get PDF
    Software programs are everywhere and continue to create value for us at an incredible pace. But this comes at the cost of facing new risks as our well-being and the stability of societies become strongly dependent on their correctness. Even if the software loaded in the memory is considered legitimate or benign, this does not mean that the code will execute as expected at runtime. Software programs, particularly the ones developed in unsafe languages (e.g., C/C++), inevitably contain many memory bugs. Attackers exploiting these bugs can achieve malicious computations outside the original specification of the program by corrupting its control and data variables in the memory. A potential solution to such runtime attacks must either ensure the integrity of those variables or check the validity of the values they hold. A complete version of the former method, which requires inspection of all memory accesses, can eliminate all the performance benefits of the language used. Alternatively, checking whether specific variables constitute a legitimate state is a non-trivial task that needs to handle state explosion and over-approximation issues. Regardless of the method preferred, most runtime protections are subject to common challenges. For example, as the scope of protection widens, such as the inclusion of data-oriented attacks (in addition to control-oriented attacks), performance costs inevitably increase as well. This is especially true for software-based methods that also suffer from weaker security guarantees. On the contrary, most hardware-based techniques promise better security and performance. But they face substantial deployment challenges without offering any solution to existing devices already out there. In this thesis, we aim to tackle these research challenges by delivering multiple runtime protections in different settings. First, the thesis presents the design of a non-invasive hardware module that can enable attesting runtime correctness on critical embedded systems in real-time. Second, we address the performance burden of covering data-oriented attacks, by suggesting a novel technique to distinguish critical variables from those that are unlikely to be attacked. This is to develop a selective protection scheme with practical performance overheads, without having to check all data variables or corresponding memory accesses. Third, the thesis presents a software-based solution that promises hardware-level protection for critical variables. For this purpose, it leverages the CPU registers available in any architecture with extra help from cryptography. Lastly, we explore the use of runtime interactions with the operating system to identify malicious software executions
    • …
    corecore