514 research outputs found

    Understanding Android Obfuscation Techniques: A Large-Scale Investigation in the Wild

    Get PDF
    In this paper, we seek to better understand Android obfuscation and depict a holistic view of the usage of obfuscation through a large-scale investigation in the wild. In particular, we focus on four popular obfuscation approaches: identifier renaming, string encryption, Java reflection, and packing. To obtain the meaningful statistical results, we designed efficient and lightweight detection models for each obfuscation technique and applied them to our massive APK datasets (collected from Google Play, multiple third-party markets, and malware databases). We have learned several interesting facts from the result. For example, malware authors use string encryption more frequently, and more apps on third-party markets than Google Play are packed. We are also interested in the explanation of each finding. Therefore we carry out in-depth code analysis on some Android apps after sampling. We believe our study will help developers select the most suitable obfuscation approach, and in the meantime help researchers improve code analysis systems in the right direction

    StoryDroid: Automated Generation of Storyboard for Android Apps

    Full text link
    Mobile apps are now ubiquitous. Before developing a new app, the development team usually endeavors painstaking efforts to review many existing apps with similar purposes. The review process is crucial in the sense that it reduces market risks and provides inspiration for app development. However, manual exploration of hundreds of existing apps by different roles (e.g., product manager, UI/UX designer, developer) in a development team can be ineffective. For example, it is difficult to completely explore all the functionalities of the app in a short period of time. Inspired by the conception of storyboard in movie production, we propose a system, StoryDroid, to automatically generate the storyboard for Android apps, and assist different roles to review apps efficiently. Specifically, StoryDroid extracts the activity transition graph and leverages static analysis techniques to render UI pages to visualize the storyboard with the rendered pages. The mapping relations between UI pages and the corresponding implementation code (e.g., layout code, activity code, and method hierarchy) are also provided to users. Our comprehensive experiments unveil that StoryDroid is effective and indeed useful to assist app development. The outputs of StoryDroid enable several potential applications, such as the recommendation of UI design and layout code

    Code clone detection in obfuscated Android apps

    Get PDF
    The Android operating system has long become one of the main global smartphone operating systems. Both developers and malware authors often reuse code to expedite the process of creating new apps and malware samples. Code cloning is the most common way of reusing code in the process of developing Android apps. Finding code clones through the analysis of Android binary code is a challenging task that becomes more sophisticated when instances of code reuse are non-contiguous, reordered, or intertwined with other code. We introduce an approach for detecting cloned methods as well as small and non-contiguous code clones in obfuscated Android applications by simulating the execution of Android apps and then analyzing the subsequent execution traces. We first validate our approach’s ability on finding different types of code clones on 20 injected clones. Next we validate the resistance of our approach against obfuscation by comparing its results on a set of 1085 apps before and after code obfuscation. We obtain 78-87% similarity between the finding from non-obfuscated applications and four sets of obfuscated applications. We also investigated the presence of code clones among 1603 Android applications. We were able to find 44,776 code clones where 34% of code clones were seen from different applications and the rest are among different versions of an application. We also performed a comparative analysis between the clones found by our approach and the clones detected by Nicad on the source code of applications. Finally, we show a practical application of our approach for detecting variants of Android banking malware. Among 60,057 code clone clusters that are found among a dataset of banking malware, 92.9% of them were unique to one malware family or benign applications

    Advanced Techniques to Detect Complex Android Malware

    Get PDF
    Android is currently the most popular operating system for mobile devices in the world. However, its openness is the main reason for the majority of malware to be targeting Android devices. Various approaches have been developed to detect malware. Unfortunately, new breeds of malware utilize sophisticated techniques to defeat malware detectors. For example, to defeat signature-based detectors, malware authors change the malware’s signatures to avoid detection. As such, a more effective approach to detect malware is by leveraging malware’s behavioral characteristics. However, if a behavior-based detector is based on static analysis, its reported results may contain a large number of false positives. In real-world usage, completing static analysis within a short time budget can also be challenging. Because of the time constraint, analysts adopt approaches based on dynamic analyses to detect malware. However, dynamic analysis is inherently unsound as it only reports analysis results of the executed paths. Besides, recently discovered malware also employs structure-changing obfuscation techniques to evade detection by state-of-the-art systems. Obfuscation allows malware authors to redistribute known malware samples by changing their structures. These factors motivate a need for malware detection systems that are efficient, effective, and resilient when faced with such evasive tactics. In this dissertation, we describe the developments of three malware detection systems to detect complex malware: DroidClassifier, GranDroid, and Obfusifier. DroidClassifier is a systematic framework for classifying network traffic generated by mobile malware. GranDroid is a graph-based malware detection system that combines dynamic analysis, incremental and partial static analysis, and machine learning to provide time-sensitive malicious network behavior detection with high accuracy. Obfusifier is a highly effective machine-learning-based malware detection system that can sustain its effectiveness even when malware authors obfuscate these malicious apps using complex and composite techniques. Our empirical evaluations reveal that DroidClassifier can successfully identify different families of malware with 94.33% accuracy on average. We have also shown GranDroid is quite effective in detecting network-related malware. It achieves 93.0% accuracy, which outperforms other related systems. Lastly, we demonstrate that Obfusifier can achieve 95% precision, recall, and F-measure, collaborating its resilience to complex obfuscation techniques. Adviser: Qiben Yan and Witawas Srisa-a

    Malware Analysis and Privacy Policy Enforcement Techniques for Android Applications

    Get PDF
    The rapid increase in mobile malware and deployment of over-privileged applications over the years has been of great concern to the security community. Encroaching on user’s privacy, mobile applications (apps) increasingly exploit various sensitive data on mobile devices. The information gathered by these applications is sufficient to uniquely and accurately profile users and can cause tremendous personal and financial damage. On Android specifically, the security and privacy holes in the operating system and framework code has created a whole new dynamic for malware and privacy exploitation. This research work seeks to develop novel analysis techniques that monitor Android applications for possible unwanted behaviors and then suggest various ways to deal with the privacy leaks associated with them. Current state-of-the-art static malware analysis techniques on Android-focused mainly on detecting known variants without factoring any kind of software obfuscation. The dynamic analysis systems, on the other hand, are heavily dependent on extending the Android OS and/or runtime virtual machine. These methodologies often tied the system to a single Android version and/or kernel making it very difficult to port to a new device. In privacy, accesses to the database system’s objects are not controlled by any security check beyond overly-broad read/write permissions. This flawed model exposes the database contents to abuse by privacy-agnostic apps and malware. This research addresses the problems above in three ways. First, we developed a novel static analysis technique that fingerprints known malware based on three-level similarity matching. It scores similarity as a function of normalized opcode sequences found in sensitive functional modules and application permission requests. Our system has an improved detection ratio over current research tools and top COTS anti-virus products while maintaining a high level of resiliency to both simple and complex obfuscation. Next, we augment the signature-related weaknesses of our static classifier with a hybrid analysis system which incorporates bytecode instrumentation and dynamic runtime monitoring to examine unknown malware samples. Using the concept of Aspect-oriented programming, this technique involves recompiling security checking code into an unknown binary for data flow analysis, resource abuse tracing, and analytics of other suspicious behaviors. Our system logs all the intercepted activities dynamically at runtime without the need for building custom kernels. Finally, we designed a user-level privacy policy enforcement system that gives users more control over their personal data saved in the SQLite database. Using bytecode weaving for query re-writing and enforcing access control, our system forces new policies at the schema, column, and entity levels of databases without rooting or voiding device warranty

    Android application forensics: A survey of obfuscation, obfuscation detection and deobfuscation techniques and their impact on investigations

    Get PDF
    Android obfuscation techniques include not only classic code obfuscation techniques that were adapted to Android, but also obfuscation methods that target the Android platform specifically. This work examines the status-quo of Android obfuscation, obfuscation detection and deobfuscation. Specifically, it first summarizes obfuscation approaches that are commonly used by app developers for code optimization, to protect their software against code theft and code tampering but are also frequently misused by malware developers to circumvent anti-malware products. Secondly, the article focuses on obfuscation detection techniques and presents various available tools and current research. Thirdly, deobfuscation (which aims at reinstating the original state before obfuscation) is discussed followed by a brief discussion how this impacts forensic investigation. We conclude that although obfuscation is widely used in Android app development (benign and malicious), available tools and the practices on how to deal with obfuscation are not standardized, and so are inherently lacking from a forensic standpoint

    Resilient and Scalable Android Malware Fingerprinting and Detection

    Get PDF
    Malicious software (Malware) proliferation reaches hundreds of thousands daily. The manual analysis of such a large volume of malware is daunting and time-consuming. The diversity of targeted systems in terms of architecture and platforms compounds the challenges of Android malware detection and malware in general. This highlights the need to design and implement new scalable and robust methods, techniques, and tools to detect Android malware. In this thesis, we develop a malware fingerprinting framework to cover accurate Android malware detection and family attribution. In this context, we emphasize the following: (i) the scalability over a large malware corpus; (ii) the resiliency to common obfuscation techniques; (iii) the portability over different platforms and architectures. In the context of bulk and offline detection on the laboratory/vendor level: First, we propose an approximate fingerprinting technique for Android packaging that captures the underlying static structure of the Android apps. We also propose a malware clustering framework on top of this fingerprinting technique to perform unsupervised malware detection and grouping by building and partitioning a similarity network of malicious apps. Second, we propose an approximate fingerprinting technique for Android malware's behavior reports generated using dynamic analyses leveraging natural language processing techniques. Based on this fingerprinting technique, we propose a portable malware detection and family threat attribution framework employing supervised machine learning techniques. Third, we design an automatic framework to produce intelligence about the underlying malicious cyber-infrastructures of Android malware. We leverage graph analysis techniques to generate relevant, actionable, and granular intelligence that can be used to identify the threat effects induced by malicious Internet activity associated to Android malicious apps. In the context of the single app and online detection on the mobile device level, we further propose the following: Fourth, we design a portable and effective Android malware detection system that is suitable for deployment on mobile and resource constrained devices, using machine learning classification on raw method call sequences. Fifth, we elaborate a framework for Android malware detection that is resilient to common code obfuscation techniques and adaptive to operating systems and malware change overtime, using natural language processing and deep learning techniques. We also evaluate the portability of the proposed techniques and methods beyond Android platform malware, as follows: Sixth, we leverage the previously elaborated techniques to build a framework for cross-platform ransomware fingerprinting relying on raw hybrid features in conjunction with advanced deep learning techniques
    corecore