79 research outputs found

    SafeDroid: A Distributed Malware Detection Service for Android

    The Android platform has become a primary target for malware. In this paper we present SafeDroid, an open source distributed service to detect malicious apps on Android by combining static analysis and machine learning techniques. It is composed of three micro-services working together. SafeDroid has been designed as a user-friendly service, providing detailed feedback in case of malware detection. The detection service is optimized to be lightweight and easily updated. The feature set on which the detection micro-service relies has been selected and optimized in order to focus only on the most distinguishing characteristics of Android apps. We present a prototype to show the effectiveness of the detection mechanism and the feasibility of the approach.

    AndroParse - An Android Feature Extraction Framework & Dataset

    Android malware has become a major challenge. As a consequence, practitioners and researchers spend significant time analyzing Android applications (APKs). A common procedure (especially for data scientists) is to extract features such as permissions, APIs or strings, which can then be analyzed. Current state-of-the-art tools have three major issues: (1) a single tool cannot extract all the significant features used by scientists and practitioners, (2) current tools are not designed to be extensible, and (3) existing parsers lack runtime efficiency. Therefore, this work presents AndroParse, an open-source Android parser written in Golang that currently extracts the four most common features: Permissions, APIs, Strings and Intents. AndroParse outputs JSON files, as they can easily be used by most major programming languages. Constructing the parser allowed us to create an extensive feature dataset which can be accessed through our independent REST API. Our dataset currently has 67,703 benign and 46,683 malicious APK samples.

    An analysis of Android malware classification services

    The increasing volume of Android malware has forced antivirus (AV) companies to rely on automated classification techniques to determine the family and class of suspicious samples. The research community relies heavily on such labels to carry out prevalence studies of the threat ecosystem and to build datasets that are used to validate and benchmark novel detection and classification methods. In this work, we carry out an extensive study of the Android malware ecosystem by surveying white papers and reports from 6 key players in the industry, as well as 81 papers from 8 top security conferences, to understand how malware datasets are used by both. We then explore the limitations associated with the use of available malware classification services, namely VirusTotal (VT) engines, for determining the family of an Android sample. Using a dataset of 2.47 M Android malware samples, we find that the detection coverage of VT's AVs is generally very low, that the percentage of samples flagged by any 2 AV engines does not go beyond 52%, and that the share of common families between any pair of AV engines is at best 29%. We rely on clustering to determine the extent to which different AV engine pairs agree upon which samples belong to the same family (regardless of the actual family name) and find that there are discrepancies that can introduce noise in automatic label unification schemes. We also observe the usage of generic labels and inconsistencies within the labels of top AV engines, suggesting that their efforts are directed towards accurate detection rather than classification. Our results contribute to a better understanding of the limitations of using Android malware family labels as supplied by common AV engines. This work has been supported by the “Ramon y Cajal” Fellowship RYC-2020-029401.
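
    The two agreement figures quoted in this abstract (samples flagged by at least two engines, and families shared by an engine pair) can be illustrated with a small sketch. The abstract does not say how "common families" is measured, so the Jaccard overlap used here is an assumption, and the engine names, sample IDs, and family labels in the usage note are hypothetical.

```python
from itertools import combinations


def pairwise_family_overlap(labels):
    """Jaccard overlap of the family vocabularies of each engine pair.

    `labels` maps engine name -> {sample id -> family label or None}.
    Jaccard is an assumed metric; the abstract does not name one.
    """
    results = {}
    for a, b in combinations(sorted(labels), 2):
        fam_a = {f for f in labels[a].values() if f}
        fam_b = {f for f in labels[b].values() if f}
        union = fam_a | fam_b
        results[(a, b)] = len(fam_a & fam_b) / len(union) if union else 0.0
    return results


def coverage_at_least(labels, k=2):
    """Fraction of all samples that at least `k` engines flagged."""
    samples = set()
    for per_engine in labels.values():
        samples.update(per_engine)
    if not samples:
        return 0.0
    flagged = sum(
        1
        for s in samples
        if sum(1 for verdicts in labels.values() if verdicts.get(s)) >= k
    )
    return flagged / len(samples)


# Hypothetical toy data: two engines, four samples.
toy_labels = {
    "EngineA": {"s1": "fam_x", "s2": "fam_y", "s3": None, "s4": "fam_z"},
    "EngineB": {"s1": "fam_x", "s2": None, "s3": None, "s4": "fam_w"},
}
```

    With `toy_labels`, the engines share one family out of four distinct labels, and two of the four samples are flagged by both engines.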

    GRASE: Granulometry Analysis with Semi Eager Classifier to Detect Malware

    Technological advancement in communication, leading to 5G, motivates connecting everything to the internet, including ‘Devices’, a technology named Web of Things (WoT). The community benefits from this large-scale network, which allows monitoring and controlling of physical devices. But this often comes at the cost of security, as MALicious softWARE (MalWare) developers try to invade the network; for them, these devices are like a ‘backdoor’ providing easy ‘entry’. To stop invaders from entering the network, identifying malware and its variants is of great significance for cyberspace. Traditional methods of malware detection, such as static and dynamic analysis, detect malware but fall short against new techniques used by malware developers, such as obfuscation, polymorphism and encryption. A machine learning approach to malware detection, where the classifier is trained with handcrafted features, is not potent against these techniques and requires considerable feature-engineering effort. The paper proposes malware classification using a visualization methodology wherein the disassembled malware code is transformed into grey images. It presents the efficacy of the Granulometry texture analysis technique for improving malware classification. Furthermore, a Semi Eager (SemiE) classifier, which combines eager learning and lazy learning techniques, is used to obtain robust classification of malware families. The outcome of the experiment is promising, since the proposed technique requires less training time to learn the semantics of higher-level malicious behaviours. Identifying the malware (testing phase) is also faster. The Malimg and Microsoft Malware Classification Challenge (BIG-2015) benchmark datasets have been used to analyse the performance of the system. An overall average classification accuracy of 99.03% and 99.11% is achieved, respectively.
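
    The visualization step mentioned in this abstract can be sketched as follows. The abstract does not describe the exact mapping, so this assumes the one-byte-per-pixel layout commonly used for malware images: each byte becomes an 8-bit grey level, and bytes are laid out row by row at a fixed width. The function name and the default width are illustrative choices, not taken from the paper.

```python
def bytes_to_gray_rows(data: bytes, width: int = 256) -> list[list[int]]:
    """Lay out raw bytes as rows of 8-bit grey pixels (0 = black, 255 = white).

    Assumed mapping: one byte per pixel, fixed row width, final row
    zero-padded so that every row has the same length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Round the length up to a whole number of rows.
    padded_len = -(-len(data) // width) * width
    padded = data.ljust(padded_len, b"\x00")
    return [list(padded[i:i + width]) for i in range(0, padded_len, width)]


# Five bytes at width 2 give three rows; the last row is zero-padded.
rows = bytes_to_gray_rows(bytes([0, 255, 16, 32, 7]), width=2)
```

    The resulting rows can be handed to any image or texture-analysis library; the choice of width changes the image's aspect ratio but not its pixel values.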

    MDFRCNN: Malware Detection using Faster Region Proposals Convolution Neural Network

    Technological advancement of smart devices has opened up a new trend: the Internet of Everything (IoE), where all devices are connected to the web. Large-scale networking benefits the community by increasing connectivity and giving control of physical devices. On the other hand, there exists an increased ‘Threat’ of an ‘Attack’. Attackers are targeting these devices, as they may provide an easier ‘backdoor entry to the users’ network’. MALicious softWARE (MalWare) is a major threat to user security. Fast and accurate detection of malware attacks is the sine qua non of IoE, where large-scale networking is involved. The paper proposes the use of a visualization technique in which the disassembled malware code is converted into gray images, together with Image Similarity based Statistical Parameters (ISSP) such as Normalized Cross correlation (NCC), Average difference (AD), Maximum difference (MaxD), Singular Structural Similarity Index Module (SSIM), Laplacian Mean Square Error (LMSE), MSE and PSNR. A vector consisting of the gray image with its statistical parameters is used to train a Faster Region proposals Convolution Neural Network (F-RCNN) classifier. The experimental results are promising, as the proposed method includes ISSP in F-RCNN training. The overall training time for learning the semantics of higher-level malicious behaviors is lower. Identification of malware (testing phase) is also performed in less time. The fusion of image and statistical parameters enhances system performance with greater accuracy. The benchmark database from the Microsoft Malware Classification Challenge, which is available on the Kaggle website, has been used to analyze system performance. An overall average classification accuracy of 98.12% is achieved by the proposed method.
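
    Several of the statistical parameters listed in this abstract have conventional textbook definitions, sketched below for two equally sized gray images given as lists of pixel rows. The abstract does not state its formulas, so these definitions (and the function name) are assumptions; SSIM and LMSE are omitted.

```python
import math


def similarity_stats(ref, test, max_value=255):
    """Conventional image-similarity statistics for two gray images.

    Assumed definitions, with x = reference pixel, y = test pixel:
      MSE  = mean((x - y)^2)        PSNR = 10 * log10(max_value^2 / MSE)
      AD   = mean(x - y)            MaxD = max(|x - y|)
      NCC  = sum(x * y) / sum(x^2)
    """
    a = [p for row in ref for p in row]
    b = [p for row in test for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and the same size")
    diffs = [x - y for x, y in zip(a, b)]
    mse = sum(d * d for d in diffs) / len(a)
    # Identical images have zero error, so PSNR is unbounded.
    psnr = math.inf if mse == 0 else 10 * math.log10(max_value ** 2 / mse)
    energy = sum(x * x for x in a)
    ncc = sum(x * y for x, y in zip(a, b)) / energy if energy else math.nan
    return {
        "mse": mse,
        "psnr": psnr,
        "ad": sum(diffs) / len(a),
        "maxd": max(abs(d) for d in diffs),
        "ncc": ncc,
    }


# Hypothetical 2x2 images differing in two pixels.
stats = similarity_stats([[10, 20], [30, 40]], [[10, 22], [30, 36]])
```

    Each statistic is a single number per image pair, so they can be appended to an image's feature vector as the abstract describes.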

    Advance Android PHAs/Malware Detection Techniques by Utilizing Signature Data, Behavioral Patterns and Machine Learning

    During the last decade, mobile phones and tablets evolved into smart devices with enormous computing power and storage capacity packed into a pocket size. People around the globe have quickly moved from laptops to smartphones for their daily computational needs. From web browsing, social networking and photography to critical bank payments and intellectual property, everything has moved onto smartphones, and Android has undoubtedly dominated the smartphone market. Android's growth has also attracted cyber criminals, who focus on creating attacks and malware that target Android users. Malware of different categories is seen in the Android ecosystem, including botnets, ransomware, click Trojans, SMS fraud and banking Trojans. Due to the huge number of applications being developed and distributed every day, Android needs malware analysis techniques that are different from those of any other operating system. This research focuses on defining a process for finding Android malware in a given large number of new applications. The research utilizes machine learning techniques to predict possible malware and further provides assistance in reverse engineering of malware. Under this thesis, an assistive Android malware analysis system, “AndroSandX”, is proposed, researched and developed. AndroSandX allows researchers to quickly analyze potential Android malware and helps perform manual analysis. Key features of the system are strong assistive capabilities using machine learning, a built-in ticketing system, a highly modular design, storage with non-relational databases, backup of analysis data for archival, assistance in manual analysis, and threat intelligence. Research results show that the system has a prediction accuracy of around 92%. The research has wide scope and leans towards providing an industry-oriented Android malware analysis assistive system/product.

    Machine Learning Methodologies For Low-Level Hardware-Based Malware Detection

    Malicious software continues to be a pertinent threat to the security of critical infrastructures harboring sensitive information. The abundance of malware samples and the disclosure of newer vulnerability paths for exploitation necessitate intelligent machine learning techniques for effective and efficient malware detection and analysis. Software-based methods are suitable for in-depth forensic analysis, but their on-device implementations are slower and resource hungry. Hardware-based approaches are emerging as an alternative against malware threats because of their trustworthiness, difficult evasion, and lower implementation costs. Modern processors have numerous hardware events, such as power domains, voltage and frequency, accessible through software interfaces for performance monitoring and debugging. However, information leakage from these events has not been explored for defenses against malware threats. This thesis demonstrates an approach to malware detection and analysis that leverages low-level hardware signatures. The proposed research aims to develop a machine learning methodology for detecting malware applications, classifying malware families and detecting shellcode exploits from low-level power signatures and electromagnetic emissions. This includes 1) developing a signature-based detector by extracting features from DVFS states and using an ML model to distinguish malware applications from benign ones; 2) developing an ML model operating on frequency and wavelet features to classify malware behaviors using EM emissions; and 3) developing a Restricted Boltzmann Machine (RBM) model to detect anomalies in the energy telemetry register values of malware-infected applications resulting from shellcode exploits.
    The evaluation of the proposed ML methodology on malware datasets indicates architecture-agnostic, pervasive, platform-independent detectors that distinguish malware from benign applications using DVFS signatures, classify detected malware into its characteristic family using EM signatures, and detect shellcode exploits on browser applications by identifying anomalies in energy telemetry register values using an energy-based RBM model. Ph.D.