16 research outputs found

    Op2Vec: An Opcode Embedding Technique and Dataset Design for End-to-End Detection of Android Malware

    Full text link
    Android is one of the leading operating systems for smart phones in terms of market share and usage. Unfortunately, it is also an appealing target for attackers to compromise its security through malicious applications. To tackle this issue, domain experts and researchers are trying different techniques to stop such attacks. All the attempts of securing Android platform are somewhat successful. However, existing detection techniques have severe shortcomings, including the cumbersome process of feature engineering. Designing representative features require expert domain knowledge. There is a need for minimizing human experts' intervention by circumventing handcrafted feature engineering. Deep learning could be exploited by extracting deep features automatically. Previous work has shown that operational codes (opcodes) of executables provide key information to be used with deep learning models for detection process of malicious applications. The only challenge is to feed opcodes information to deep learning models. Existing techniques use one-hot encoding to tackle the challenge. However, the one-hot encoding scheme has severe limitations. In this paper, we introduce; (1) a novel technique for opcodes embedding, which we name Op2Vec, (2) based on the learned Op2Vec we have developed a dataset for end-to-end detection of android malware. Introducing the end-to-end Android malware detection technique avoids expert-intensive handcrafted features extraction, and ensures automation. Some of the recent deep learning-based techniques showed significantly improved results when tested with the proposed approach and achieved an average detection accuracy of 97.47%, precision of 0.976 and F1 score of 0.979

    Applying Deep Learning Techniques to the Analysis of Android APKs

    Get PDF
    Malware targeting mobile devices is a pervasive problem in modern life and as such tools to detect and classify malware are of great value. This paper seeks to demonstrate the effectiveness of Deep Learning Techniques, specifically Convolutional Neural Networks, in detecting and classifying malware targeting the Android operating system. Unlike many current detection techniques, which require the use of relatively rigid features to aid in detection, deep neural networks are capable of automatically learning flexible features which may be more resilient to obfuscation. We present a parsing for extracting sequences of API calls which can be used to describe a hypothetical execution of a given application. We then show how to use this sequence of API calls to successfully classify Android malware using a Convolutional Neural Network

    ACTS: Extracting Android App Topological Signature through Graphlet Sampling

    Get PDF
    Android systems are widely used in mobile & wireless distributed systems. In the near future, Android is believed to dominate the mobile distributed environment. However, with the popularity of Android-based smartphones/tablets comes the rampancy of Android-based malware. In this paper, we propose a novel topological signature of Android apps based on the function call graphs (FCGs) extracted from their Android App Packages (APKs). Specifically, by leveraging recent advances in graphlet sampling, the proposed method fully captures the invocator-invocatee relationship at local neighborhoods in an FCG without exponentially inflating the state space. Using real benign app and malware samples, we demonstrate that our method, ACTS (App topologiCal signature through graphleT Sampling), can detect malware and identify malware families robustly and efficiently. More importantly, we demonstrate that, without augmenting the FCG with any semantic features such as bytecode-based vertex typing, local topological information captured by ACTS alone can achieve a high malware detection accuracy. Since ACTS only uses structural features, which are orthogonal to semantic features, it is expected that combining them would give a greater improvement in malware detection accuracy than combining non-orthogonal semantic features

    A Survey on Malware Detection with Graph Representation Learning

    Full text link
    Malware detection has become a major concern due to the increasing number and complexity of malware. Traditional detection methods based on signatures and heuristics are used for malware detection, but unfortunately, they suffer from poor generalization to unknown attacks and can be easily circumvented using obfuscation techniques. In recent years, Machine Learning (ML) and notably Deep Learning (DL) achieved impressive results in malware detection by learning useful representations from data and have become a solution preferred over traditional methods. More recently, the application of such techniques on graph-structured data has achieved state-of-the-art performance in various domains and demonstrates promising results in learning more robust representations from malware. Yet, no literature review focusing on graph-based deep learning for malware detection exists. In this survey, we provide an in-depth literature review to summarize and unify existing works under the common approaches and architectures. We notably demonstrate that Graph Neural Networks (GNNs) reach competitive results in learning robust embeddings from malware represented as expressive graph structures, leading to an efficient detection by downstream classifiers. This paper also reviews adversarial attacks that are utilized to fool graph-based detection methods. Challenges and future research directions are discussed at the end of the paper.Comment: Preprint, submitted to ACM Computing Surveys on March 2023. For any suggestions or improvements, please contact me directly by e-mai

    A Survey and Evaluation of Android-Based Malware Evasion Techniques and Detection Frameworks

    Get PDF
    Android platform security is an active area of research where malware detection techniques continuously evolve to identify novel malware and improve the timely and accurate detection of existing malware. Adversaries are constantly in charge of employing innovative techniques to avoid or prolong malware detection effectively. Past studies have shown that malware detection systems are susceptible to evasion attacks where adversaries can successfully bypass the existing security defenses and deliver the malware to the target system without being detected. The evolution of escape-resistant systems is an open research problem. This paper presents a detailed taxonomy and evaluation of Android-based malware evasion techniques deployed to circumvent malware detection. The study characterizes such evasion techniques into two broad categories, polymorphism and metamorphism, and analyses techniques used for stealth malware detection based on the malware’s unique characteristics. Furthermore, the article also presents a qualitative and systematic comparison of evasion detection frameworks and their detection methodologies for Android-based malware. Finally, the survey discusses open-ended questions and potential future directions for continued research in mobile malware detection

    Advanced Topics in Systems Safety and Security

    Get PDF
    This book presents valuable research results in the challenging field of systems (cyber)security. It is a reprint of the Information (MDPI, Basel) - Special Issue (SI) on Advanced Topics in Systems Safety and Security. The competitive review process of MDPI journals guarantees the quality of the presented concepts and results. The SI comprises high-quality papers focused on cutting-edge research topics in cybersecurity of computer networks and industrial control systems. The contributions presented in this book are mainly the extended versions of selected papers presented at the 7th and the 8th editions of the International Workshop on Systems Safety and Security—IWSSS. These two editions took place in Romania in 2019 and respectively in 2020. In addition to the selected papers from IWSSS, the special issue includes other valuable and relevant contributions. The papers included in this reprint discuss various subjects ranging from cyberattack or criminal activities detection, evaluation of the attacker skills, modeling of the cyber-attacks, and mobile application security evaluation. Given this diversity of topics and the scientific level of papers, we consider this book a valuable reference for researchers in the security and safety of systems

    Security Issues of Mobile and Smart Wearable Devices

    Get PDF
    Mobile and smart devices (ranging from popular smartphones and tablets to wearable fitness trackers equipped with sensing, computing and networking capabilities) have proliferated lately and redefined the way users carry out their day-to-day activities. These devices bring immense benefits to society and boast improved quality of life for users. As mobile and smart technologies become increasingly ubiquitous, the security of these devices becomes more urgent, and users should take precautions to keep their personal information secure. Privacy has also been called into question as so many of mobile and smart devices collect, process huge quantities of data, and store them on the cloud as a matter of fact. Ensuring confidentiality, integrity, and authenticity of the information is a cybersecurity challenge with no easy solution. Unfortunately, current security controls have not kept pace with the risks posed by mobile and smart devices, and have proven patently insufficient so far. Thwarting attacks is also a thriving research area with a substantial amount of still unsolved problems. The pervasiveness of smart devices, the growing attack vectors, and the current lack of security call for an effective and efficient way of protecting mobile and smart devices. This thesis deals with the security problems of mobile and smart devices, providing specific methods for improving current security solutions. Our contributions are grouped into two related areas which present natural intersections and corresponds to the two central parts of this document: (1) Tackling Mobile Malware, and (2) Security Analysis on Wearable and Smart Devices. In the first part of this thesis, we study methods and techniques to assist security analysts to tackle mobile malware and automate the identification of malicious applications. We provide threefold contributions in tackling mobile malware: First, we introduce a Secure Message Delivery (SMD) protocol for Device-to-Device (D2D) networks, with primary objective of choosing the most secure path to deliver a message from a sender to a destination in a multi-hop D2D network. Second, we illustrate a survey to investigate concrete and relevant questions concerning Android code obfuscation and protection techniques, where the purpose is to review code obfuscation and code protection practices. We evaluate efficacy of existing code de-obfuscation tools to tackle obfuscated Android malware (which provide attackers with the ability to evade detection mechanisms). Finally, we propose a Machine Learning-based detection framework to hunt malicious Android apps by introducing a system to detect and classify newly-discovered malware through analyzing applications. The proposed system classifies different types of malware from each other and helps to better understanding how malware can infect devices, the threat level they pose and how to protect against them. Our designed system leverages more complete coverage of apps’ behavioral characteristics than the state-of-the-art, integrates the most performant classifier, and utilizes the robustness of extracted features. The second part of this dissertation conducts an in-depth security analysis of the most popular wearable fitness trackers on the market. Our contributions are grouped into four central parts in this domain: First, we analyze the primitives governing the communication between fitness tracker and cloud-based services. In addition, we investigate communication requirements in this setting such as: (i) Data Confidentiality, (ii) Data Integrity, and (iii) Data Authenticity. Second, we show real-world demos on how modern wearable devices are vulnerable to false data injection attacks. Also, we document successful injection of falsified data to cloud-based services that appears legitimate to the cloud to obtain personal benefits. Third, we circumvent End-to-End protocol encryption implemented in the most advanced and secure fitness trackers (e.g., Fitbit, as the market leader) through Hardware-based reverse engineering. Last but not least, we provide guidelines for avoiding similar vulnerabilities in future system designs

    Latent Representation and Sampling in Network: Application in Text Mining and Biology.

    Get PDF
    In classical machine learning, hand-designed features are used for learning a mapping from raw data. However, human involvement in feature design makes the process expensive. Representation learning aims to learn abstract features directly from data without direct human involvement. Raw data can be of various forms. Network is one form of data that encodes relational structure in many real-world domains. Therefore, learning abstract features for network units is an important task. In this dissertation, we propose models for incorporating temporal information given as a collection of networks from subsequent time-stamps. The primary objective of our models is to learn a better abstract feature representation of nodes and edges in an evolving network. We show that the temporal information in the abstract feature improves the performance of link prediction task substantially. Besides applying to the network data, we also employ our models to incorporate extra-sentential information in the text domain for learning better representation of sentences. We build a context network of sentences to capture extra-sentential information. This information in abstract feature representation of sentences improves various text-mining tasks substantially over a set of baseline methods. A problem with the abstract features that we learn is that they lack interpretability. In real-life applications on network data, for some tasks, it is crucial to learn interpretable features in the form of graphical structures. For this we need to mine important graphical structures along with their frequency statistics from the input dataset. However, exact algorithms for these tasks are computationally expensive, so scalable algorithms are of urgent need. To overcome this challenge, we provide efficient sampling algorithms for mining higher-order structures from network(s). We show that our sampling-based algorithms are scalable. They are also superior to a set of baseline algorithms in terms of retrieving important graphical sub-structures, and collecting their frequency statistics. Finally, we show that we can use these frequent subgraph statistics and structures as features in various real-life applications. We show one application in biology and another in security. In both cases, we show that the structures and their statistics significantly improve the performance of knowledge discovery tasks in these domains
    corecore