324 research outputs found

    Detection of Android Malware using Feature Selection with a Hybrid Genetic Algorithm and Simulated Annealing (SVM and DBN)

    Because the Android operating system is so widely used and applications are so simple to create on the platform, anyone can easily create malware using pre-made tools. As malware spreads among many useful applications, Android users are experiencing problems. In this study, we show how to use permissions gleaned from static analysis to identify Android malware. Using support vector machines and deep belief networks, we select the pertinent features from the set of permissions. The suggested technique increases the effectiveness of Android malware detection.

    Selecting Root Exploit Features Using Flying Animal-Inspired Decision

    Malware is an application that executes malicious activities on a computer system, including mobile devices. Among all types of malware, the root exploit causes the most damage because it is able to run in stealth mode. It compromises the nucleus of the operating system, known as the kernel, to bypass the Android security mechanisms. Once it attacks and resides in the kernel, it is able to install other types of malware on the Android device. In order to detect root exploits, it is important to investigate their features so that machine learning can predict them accurately. This study proposes three flying animal-inspired methods, (1) bat, (2) firefly, and (3) bee, to search automatically for the exclusive features, and then uses these flying animal-inspired decision features to improve the machine learning prediction. Furthermore, a boosting method (AdaBoost) boosts the multilayer perceptron (MLP) into a stronger classifier. The evaluation showed that the best result came from bee search, which recorded 91.48 percent accuracy, an 82.2 percent true positive rate, and a 0.1 percent false positive rate.
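
    The three metrics reported in this abstract (accuracy, true positive rate, false positive rate) follow from the four confusion-matrix counts. The sketch below shows the arithmetic only; the function name and the counts are hypothetical and are not taken from the study.

```python
def detection_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, true positive rate, and false positive rate.

    tp/fn count malicious samples (detected / missed);
    tn/fp count benign samples (passed / wrongly flagged).
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (fp + tn) == 0:
        raise ValueError("need at least one malicious and one benign sample")
    return {
        "accuracy": (tp + tn) / total,   # share of all samples classified correctly
        "tpr": tp / (tp + fn),           # share of malware that was detected
        "fpr": fp / (fp + tn),           # share of benign apps flagged as malware
    }


# Hypothetical counts for illustration only.
rates = detection_rates(tp=80, fp=5, tn=95, fn=20)
```

    A high accuracy can coexist with a modest true positive rate when benign samples dominate the test set, which is why the abstract reports all three figures.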

    Optimizing the Failure Prediction in Deep Learning

    Predicting issues with software systems built from modules is the focus of this research. A data collection was used as a reference in order to accomplish this objective. This research provides an evaluation framework for reusable software components. The dataset of factors that play a role in the decision-making process has been run through the PSO algorithm. The primary objective is to provide a clever and time-saving method of choosing components. After filtering for ideal values, the dataset is used to train a deep learning model. Accuracy measurements including recall, precision, and F1 score will be used to evaluate the effectiveness of the optimized component selection model. This research is significant because it provides a high-performance and accurate solution to a major prediction problem. We have done our best to estimate the number of lines of code, the complexity, the design complexity, the projected time, the difficulty, the intelligence, and the effort required. A model for discovering mistakes has been developed after the dataset was filtered to account for the ideal value. By keeping just the most crucial characteristics and getting rid of all optimized data, we have made the model more trustworthy.

    Intelligent Android malware family classification using Genetic Algorithms and SVM

    As of April 2019, Android was the most popular mobile operating system amongst smartphone users [1]. Its high popularity, combined with the extended use of smartphones for everyday tasks as well as for storing or accessing sensitive and personal data, has made Android applications the target of numerous malware attacks over the last few years. The malware attacks have been refined to target specific vulnerabilities in the operating system or the user, and have thus specialized into types of malware and families within each type. The malware is usually distributed in infected applications (or APKs), which contain malicious behaviours that can be found by inspecting their code (known as static analysis) or by analysing the behaviour of the application while running (known as dynamic analysis). This document describes the implementation of an intelligent system that aims to classify a series of malicious APK samples obtained from the free repository ContagioDump. These samples are classified by the type and family they belong to. To create the classifier system, a Support Vector Machine (SVM) is implemented using the Python library scikit-learn. A series of attributes is extracted from the malicious APK samples by analysing the code of the APKs via static analysis, using the Python library Androguard, which contains a parser that allows interaction with all the relevant parts of the APK file. The attributes obtained are very numerous, and for that reason a Genetic Algorithm is used to optimize the attributes that the SVM uses in the learning process. The algorithm encodes a subset of attributes from all the attributes extracted in the static analysis, and each subset is evaluated using the accuracy score obtained when training the SVM with it. As a result, a subset of attributes and a trained classification model are obtained. This model is then tested with a new set of malware samples belonging to all the families classified during learning.
    The present document explains the process of designing, creating and testing the system. It was developed as a bachelor's thesis for the computer science and engineering degree at Universidad Carlos III de Madrid. Ingeniería en Tecnologías de Telecomunicación (Plan 2010)
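
    The genetic algorithm described in this abstract encodes a subset of attributes as a bit mask and scores each mask by classifier accuracy. The following is a minimal sketch of that loop, not the thesis's implementation: the selection, crossover, and mutation choices are assumptions, and `toy_fitness` is a hypothetical stand-in for the SVM accuracy score so that the example runs with the standard library alone.

```python
import random
from typing import Callable, List

Mask = List[int]  # 1 = attribute kept, 0 = attribute dropped


def genetic_feature_selection(
    n_features: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 20,
    generations: int = 30,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Mask:
    """Evolve a bit mask over n_features (>= 2) that maximises fitness."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)
    ]
    best = max(population, key=fitness)

    def tournament() -> Mask:
        # Pick two distinct individuals and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_pop = [best[:]]  # elitism: the best mask always survives
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        population = next_pop
        best = max(population, key=fitness)
    return best


# Hypothetical stand-in for "accuracy of an SVM trained on this subset":
# reward three informative attributes, lightly penalise every other one kept.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask: Mask) -> float:
    hits = sum(mask[i] for i in INFORMATIVE)
    extras = sum(mask) - hits
    return hits - 0.1 * extras


best_mask = genetic_feature_selection(10, toy_fitness)
```

    In a setting like the one the abstract describes, the fitness function would instead train the classifier on the masked attributes and return its accuracy, which makes each evaluation far more expensive than this toy score.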

    Android Malware Detection System using Genetic Programming

    Nowadays, smartphones and other mobile devices play a significant role in the way people engage in entertainment, communicate, network, work, and bank and shop online. As the number of mobile phones sold has increased dramatically worldwide, so have the security risks faced by users, to a degree most do not realise. One of the risks is the threat from mobile malware. In this research, we investigate how supervised learning with evolutionary computation can be used to synthesise a system to detect Android mobile phone attacks. The attacks include malware, ransomware and mobile botnets. The datasets used in this research are publicly downloadable and available for use with appropriate acknowledgement. The primary source is Drebin. We also used ransomware and mobile botnet datasets from other Android mobile phone researchers. The research in this thesis uses Genetic Programming (GP) to evolve programs that distinguish malicious from non-malicious applications in Android mobile datasets. It also demonstrates the use of GP and Multi-Objective Evolutionary Algorithms (MOEAs) together to explore functional (detection rate) and non-functional (execution time and power consumption) trade-offs. Our results show that malicious and non-malicious applications can be distinguished effectively using only the permissions held by applications, as recorded in the application's Android Package (APK). Such a minimalist source of features can serve as the basis for highly efficient Android malware detection. Non-functional trade-offs are also highlighted.

    Machine Learning and other Computational-Intelligence Techniques for Security Applications

    The abstract is in the attachment.

    Feature Selection on Permissions, Intents and APIs for Android Malware Detection

    Malicious applications pose an enormous security threat to mobile computing devices. Currently 85% of all smartphones run Android, Google's open-source operating system, making that platform the primary threat vector for malware attacks. Android hosts roughly 99% of known malware to date, and is the focus of most research efforts in mobile malware detection due to its open-source nature. One of the main tools used in this effort is supervised machine learning. While a decade of work has made a lot of progress in detection accuracy, there is an obstacle that each stream of research is forced to overcome: feature selection, i.e., determining which attributes of Android are most effective as inputs to machine learning models. This dissertation aims to address that problem by providing the community with an exhaustive analysis of the three primary types of Android features used by researchers: Permissions, Intents and API Calls. The intent of the report is not to describe a best-performing feature set or a best-performing machine learning model, nor to explain why certain Permissions, Intents or API Calls get selected above others, but rather to provide a holistic methodology to help guide feature selection for Android malware detection. The experiments used eleven different feature selection techniques covering filter methods, wrapper methods and embedded methods. Each feature selection technique was applied to seven different datasets based on the seven available combinations of Permissions, Intents and API Calls. Each of those seven datasets is drawn from a base set of 119k Android apps. All of the result sets were then validated against three different machine learning models, Random Forest, SVM and a Neural Net, to test applicability across algorithm types.
    The experiments show that using a combination of Permissions, Intents and API Calls produced higher accuracy than using any of those alone or in any other combination, and that feature selection should be performed on the combined dataset rather than by feature type and then combined. The data also show that, in general, a feature set size of 200 or more attributes is required for optimal results. Finally, the feature selection methods Relief, Correlation-based Feature Selection (CFS) and Recursive Feature Elimination (RFE) using a Neural Net are not satisfactory approaches for Android malware detection work. Based on the proposed methodology and experiments, this research provides insights into feature selection, a significant but often overlooked issue in Android malware detection. We believe the results reported herein are an important step towards effective feature evaluation and selection in assisting malware detection, especially for datasets with a large number of features. The methodology also has the potential to be applied to similar malware detection tasks or even in broader domains such as pattern recognition.
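
    The "seven combinations" of Permissions, Intents and API Calls mentioned in this abstract are the non-empty subsets of three feature types (2^3 - 1 = 7). A small sketch of that enumeration follows; the function name is hypothetical, and how the dissertation actually assembled its datasets is not stated in the abstract.

```python
from itertools import combinations

FEATURE_TYPES = ("Permissions", "Intents", "API Calls")


def feature_type_combinations(types):
    """Return every non-empty subset of the given feature types, smallest first."""
    return [
        combo
        for size in range(1, len(types) + 1)
        for combo in combinations(types, size)
    ]


combos = feature_type_combinations(FEATURE_TYPES)
for combo in combos:
    print(" + ".join(combo))
```

    The list contains three single-type datasets, three pairs, and one dataset combining all three types, which is the combination the abstract reports as producing the highest accuracy.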

    A systematic literature review on source code similarity measurement and clone detection: techniques, applications, and challenges

    Measuring and evaluating source code similarity is a fundamental software engineering activity that embraces a broad range of applications, including but not limited to code recommendation and the detection of duplicate code, plagiarism, malware, and code smells. This paper presents a systematic literature review and meta-analysis of code similarity measurement and evaluation techniques to shed light on the existing approaches and their characteristics in different applications. We initially found over 10000 articles by querying four digital libraries and ended up with 136 primary studies in the field. The studies were classified according to their methodology, programming languages, datasets, tools, and applications. A deep investigation reveals 80 software tools, working with eight different techniques on five application domains. Nearly 49% of the tools work on Java programs and 37% support C and C++, while many programming languages have no support at all. A noteworthy point was the existence of 12 datasets related to source code similarity measurement and duplicate code, of which only eight were publicly accessible. The lack of reliable datasets, empirical evaluations, hybrid methods, and focus on multi-paradigm languages are the main challenges in the field. Emerging applications of code similarity measurement concentrate on the development phase in addition to maintenance. Comment: 49 pages, 10 figures, 6 tables

    Applications in security and evasions in machine learning : a survey

    In recent years, machine learning (ML) has become an important means of providing security and privacy in various applications. ML is used to address serious issues such as real-time attack detection, data leakage vulnerability assessments and many more. ML extensively supports the demanding requirements of current security and privacy scenarios across a range of areas such as real-time decision-making, big data processing, reduced cycle time for learning, cost-efficiency and error-free processing. Therefore, in this paper, we review the state-of-the-art approaches where ML can be applied more effectively to fulfill current real-world requirements in security. We examine the perspectives of different security applications in which ML models play an essential role and compare their accuracy results along different dimensions. Analyzing ML algorithms in security applications provides a blueprint for an interdisciplinary research area. Even with the use of current sophisticated technology and tools, attackers can evade ML models by committing adversarial attacks. Therefore, the need arises to assess the vulnerability of ML models at development time so that they can cope with adversarial attacks. Accordingly, as a supplement to this point, we also analyze the different types of adversarial attacks on ML models. To give a proper visualization of security properties, we present the threat model and defense strategies against adversarial attack methods. Moreover, we illustrate adversarial attacks based on the attackers' knowledge of the model and identify the points in the model at which attacks may be committed. Finally, we also investigate the different properties of adversarial attacks.

    Neural malware detection

    At the heart of today's malware problem lies the theoretically infinite diversity created by metamorphism. The majority of conventional machine learning techniques tackle the problem with the assumptions that a sufficiently large number of training samples exists and that the training set is independent and identically distributed. However, the lack of semantic features, combined with models built on these wrong assumptions, results largely in overfitting with many false positives against real-world samples, leaving systems vulnerable to various adversarial attacks. A key observation is that modern malware authors write a script that automatically generates an arbitrarily large number of diverse samples that share similar characteristics in program logic, which is a very cost-effective way to evade detection with minimum effort. Given that many malware campaigns follow this economic malware manufacturing model, the samples within a campaign are likely to share coherent semantic characteristics. This opens up the possibility of one-to-many detection. Therefore, it is crucial to capture the non-linear metamorphic pattern unique to the campaign in order to detect these seemingly diverse but identically rooted variants. To address these issues, this dissertation proposes novel deep learning models, including a generative static malware outbreak detection model, a generative dynamic malware detection model using spatio-temporal isomorphic dynamic features, and instruction cognitive malware detection. A comparative study on metamorphic threats is also conducted as part of the thesis. A generative adversarial autoencoder (AAE) over a convolutional network with global average pooling is introduced as a fundamental deep learning framework for malware detection, which captures highly complex non-linear metamorphism through translation invariance and local variation insensitivity.
    A Generative Adversarial Network (GAN) used as part of the framework enables one-shot training, in which semantically isomorphic malware campaigns are identified by a single malware instance sampled from the very initial outbreak. This is a major innovation because, to the best of our knowledge, no approach has been found for this challenging training objective against a malware distribution that consists of a large number of very sparse groups artificially driven by the arms race between attackers and defenders. In addition, we propose a novel method that extracts an instruction cognitive representation from uninterpreted raw binary executables, which can be used for one-to-many malware detection via one-shot training against the frequency spectrum of the Transformer's encoded latent representation. The method works regardless of the presence of diverse malware variations while remaining resilient to adversarial attacks that mostly use random perturbation against raw binaries. Comprehensive performance analyses, including mathematical formulations and experimental evaluations, are provided, with the proposed deep learning framework for malware detection exhibiting superior performance over conventional machine learning methods. The methods proposed in this thesis are applicable to a variety of threat environments where artificially formed sparse distributions arise at the cyber battle fronts. Doctor of Philosophy