303 research outputs found

    Permission based Mobile Malware Detection System using Machine Learning Techniques

    Mobile technology has grown dramatically around the world. Smart mobile devices are now ubiquitous and serve multiple purposes, such as personal communication, data storage, multimedia, and entertainment, and they have become an important part of daily life. Implementing secure mobile and wireless networks is crucial for enterprises operating in an Internet-based business environment. Mobile market share has grown significantly in the past few years, which makes mobile security a pressing concern. Mobile security can be compromised by design flaws, vulnerabilities, and protocol failures in mobile applications, as well as by viruses, spyware, malware, and other threats. This paper focuses on mobile malware. Many tools are available to detect malware, but a newer research trend in mobile security holds that users should be informed about an app before installing it from the app store. We therefore propose a novel permission-based mobile malware detection system built on static analysis. It has three major parts: 1) a signature database that stores the analysis results of training and testing; 2) an Android client that end users use to make analysis requests; and 3) a central server that communicates with both the signature database and the smartphone client and manages the whole analysis process. The system alerts the user as to whether an app is malicious or benign, and the user can then decide whether to proceed with it.

    Analysis and evaluation of SafeDroid v2.0, a framework for detecting malicious Android applications

    Android smartphones have become a vital component of the daily routine of millions of people, running a plethora of applications available in the official and alternative marketplaces. Although there are many security mechanisms to scan and filter malicious applications, malware is still able to reach the devices of many end users. In this paper, we introduce the SafeDroid v2.0 framework, a flexible, robust, and versatile open-source solution for statically analysing Android applications, based on machine learning techniques. The main goal of our work, besides the automated production of prediction and classification models with maximum accuracy scores and minimum negative errors, is to offer an out-of-the-box framework that Android security researchers can use to experiment efficiently in search of effective solutions. The SafeDroid v2.0 framework makes it possible to test many different combinations of machine learning classifiers, with a high degree of freedom and flexibility in the choice of features to consider, as well as in dataset balance and dataset selection. The framework also provides a server, for generating experiment reports, and an Android application, for verifying the produced models in real-life scenarios. An extensive campaign of experiments is also presented to show how competitive solutions can be found efficiently: the results confirm that SafeDroid v2.0 can reach very good performance, even with highly unbalanced input datasets, and always with very limited overhead.

    Detecting Repackaged Android Applications Using Perceptual Hashing

    Over the last decade, Android devices have steadily dominated market share, and hundreds of thousands of apps have become available to the public. Because Android applications are easy to reverse engineer, repackaged malicious apps that clone existing code have become a severe problem in the marketplace. This research proposes a novel repackaged-app detection system based on perceptual hashes of vetted Android apps and their associated dynamic user interface (UI) behavior. Results show that an average hash approach produces 88% accuracy (indicating low false negative and false positive rates) on a sample set of 4878 Android apps, including 2151 repackaged apps. The approach is the first dynamic method proposed in the research community that uses image-based hashing techniques, with reasonable performance relative to other known dynamic approaches and the possibility of practical implementation at scale for new applications entering the Android market.
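
The average hash mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how UI screenshots are captured or downscaled, nor which distance threshold is used, so the input format (a small grid of 0-255 grayscale values), the function names, and `max_distance` below are all assumptions made for illustration, not the authors' implementation.

```python
def average_hash(pixels):
    """Return a bit-string average hash of a small grayscale image.

    `pixels` is a list of rows of 0-255 intensities. It is assumed to be
    already downscaled (for example to 8x8) from a UI screenshot.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    # One bit per pixel: 1 if the pixel is brighter than the mean, else 0.
    return "".join("1" if p > mean else "0" for p in flat)


def hamming_distance(hash_a, hash_b):
    """Count the bit positions in which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def looks_repackaged(hash_a, hash_b, max_distance=5):
    """Flag two screens as near-duplicates (hypothetical threshold)."""
    return hamming_distance(hash_a, hash_b) <= max_distance


# Toy 2x2 "screenshots": a vetted screen, a slightly altered copy, and an
# unrelated screen. All values are made up.
vetted = average_hash([[10, 200], [220, 30]])
altered = average_hash([[12, 190], [210, 40]])
unrelated = average_hash([[200, 10], [30, 220]])
print(hamming_distance(vetted, altered), hamming_distance(vetted, unrelated))
```

Because the hash only records whether each pixel is above or below the mean, small brightness or compression changes tend to leave it unchanged, which is what makes it usable for spotting cloned interfaces.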

    A supervised method for finding discriminant variables in the analysis of complex problems: case studies in Android security and source printer attribution

    Advisors: Ricardo Dahab, Anderson de Rezende Rocha. Dissertation (master's), Universidade Estadual de Campinas, Instituto de Computação. Abstract: Solving a problem where many components interact and affect results simultaneously requires models that are not always tractable by traditional analytic methods. Although in many cases the result can be predicted with excellent accuracy through machine learning algorithms, interpreting the phenomenon requires understanding which variables matter most and how much each contributes to the results.
This dissertation presents the application of a method in which the discriminant variables are identified through an iterative ranking process. In each iteration, a classifier is trained and validated, the variables that contribute least to the result are discarded, and the impact of this reduction on the classification metrics is evaluated. Classification uses the Random Forest algorithm, and the discarding decision relies on its feature importance property. To validate the method, two studies on complex systems of different natures were carried out, giving rise to the articles presented here. The first article analyses the relations between malware and the operating system resources it requests within an ecosystem of Android applications. For this study, data structured according to an ontology defined in the article (OntoPermEco) were captured from 4,570 applications (2,150 malware, 2,420 benign). The complex model produced a graph of about 55,000 nodes and 120,000 edges, which was transformed using the Bag of Graphs technique into a feature vector of 8,950 elements for each application. Using only the data available in the application's manifest, the model achieved 88% accuracy and 91% precision in predicting whether an application behaves maliciously, and the proposed method identified 24 features relevant to the classification and identification of malware families, corresponding to only 70 nodes of the entire ecosystem graph. The second article is about identifying the regions in a printed document that contain information relevant to attributing the laser printer that printed it. The discriminant variable determination method achieved average accuracy and precision of 95.6% and 93.9%, respectively, in source printer attribution using a dataset of 1,200 documents printed on ten printers. Feature vectors were obtained by applying the Convolutional Texture Gradient Filter (CTGF) texture descriptor to images scanned at 600 DPI.
After the assignment of the source printer to each document, eight of the ten printers allowed the identification of discriminant variables uniquely associated with each of them, making it possible to visualize in the document image the regions of interest for expert analysis. The work in both articles achieved the objective of reducing a complex system to an interpretable, streamlined model, demonstrating the effectiveness of the proposed method in the analysis of two problems in different areas (application security and digital forensics) with complex models and entirely different representation structures. Degree: Master's in Computer Science (Mestre em Ciência da Computação).
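
The iterative ranking by elimination described in this abstract can be sketched as a short loop. In the dissertation, both the score and the importances come from a Random Forest retrained at each round. To stay self-contained, the sketch below instead takes two caller-supplied functions, `importance_fn` and `score_fn`, and the toy weights and permission choice are hypothetical stand-ins, not results from the work.

```python
def rank_by_elimination(features, importance_fn, score_fn,
                        drop_per_round=1, min_features=1):
    """Iteratively drop the least important features.

    importance_fn(active) -> dict mapping each active feature to a number
    score_fn(active)      -> quality metric of a model trained on `active`
    Returns a list of (active_features, score) pairs, one per round, so the
    impact of each reduction on the metric can be inspected afterwards.
    """
    active = list(features)
    history = []
    while True:
        history.append((tuple(active), score_fn(active)))
        if len(active) <= min_features:
            break
        importances = importance_fn(active)
        ranked = sorted(active, key=lambda f: importances[f])
        n_drop = min(drop_per_round, len(active) - min_features)
        dropped = set(ranked[:n_drop])
        active = [f for f in active if f not in dropped]
    return history


# Toy stand-ins (assumptions): fixed weights play the role of the Random
# Forest feature importances, and their sum plays the role of the score.
WEIGHTS = {"SEND_SMS": 0.5, "INTERNET": 0.3, "VIBRATE": 0.05, "WAKE_LOCK": 0.15}


def toy_importance(active):
    return {f: WEIGHTS[f] for f in active}


def toy_score(active):
    return sum(WEIGHTS[f] for f in active)


for kept, score in rank_by_elimination(list(WEIGHTS), toy_importance, toy_score):
    print(len(kept), kept, round(score, 2))
```

Reading the history from the last round backwards gives the ranking: the features that survive longest are the most discriminant, and the round where the score starts to fall sharply suggests where to stop reducing.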

    Selecting Root Exploit Features Using Flying Animal-Inspired Decision

    Malware is software that carries out malicious activities on a computer system, including mobile devices. Among all types of malware, root exploits cause the most damage because they can run stealthily. A root exploit compromises the core of the operating system, the kernel, to bypass Android's security mechanisms. Once it resides in the kernel, it can install other types of malware on the device. To detect root exploits, it is important to investigate their features so that machine learning can predict them accurately. This study proposes three flying-animal-inspired methods, (1) bat, (2) firefly, and (3) bee, to search automatically for exclusive features, and then uses the selected features to improve machine learning prediction. Furthermore, a boosting method (AdaBoost) strengthens the multilayer perceptron (MLP) classifier. In the evaluation, bee search gave the best result: 91.48 percent accuracy, an 82.2 percent true positive rate, and a 0.1 percent false positive rate.
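
For readers comparing the three figures reported in this abstract, the sketch below shows how accuracy, true positive rate, and false positive rate are conventionally derived from confusion-matrix counts. The counts in the example are hypothetical and are not taken from the study, and the function name is an assumption.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute standard detection metrics from confusion-matrix counts.

    tp: malware correctly flagged      fn: malware missed
    tn: benign correctly passed        fp: benign wrongly flagged
    Assumes at least one malware sample and one benign sample.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "true_positive_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical counts: 100 malware samples and 100 benign samples.
metrics = detection_metrics(tp=80, fp=5, tn=95, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A detector can pair high accuracy and a very low false positive rate with a noticeably lower true positive rate, as in the reported result, because the three metrics are computed over different subsets of the samples.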

    IoT Health Devices: Exploring Security Risks in the Connected Landscape

    The concept of the Internet of Things (IoT) spans decades, and the same can be said for its inclusion in healthcare. The IoT is an attractive target in medicine; it offers considerable potential for expanding care. However, the application of the IoT in healthcare is fraught with an array of challenges and brings numerous vulnerabilities. Because patient-specific data becomes accessible, these vulnerabilities translate into wider attack surfaces and deeper potential damage, both to consumers and to their confidence in health systems. Further, when IoT health devices (IoTHDs) are developed, a diverse range of attacks becomes possible. To understand the risks in this new landscape, it is important to understand the architecture of IoTHDs, their operations, and the social dynamics that may govern their interactions. This paper aims to document and map IoTHDs, lay the groundwork for better understanding security risks in emerging IoTHD modalities through a multi-layer approach, and suggest means for improved governance and interaction. We also discuss technological innovations expected to set the stage for novel exploits leading into the middle and latter parts of the 21st century.

    Signal processing for malware analysis

    This project is an experimental analysis of Android malware through images. The analysis classifies malware into families or differentiates between goodware and malware. Two approaches were considered, sharing a common starting point: the transformation of Android applications into PNG images. After this conversion, the first approach subtracted each image in the testing set from the images in the training set, in order to establish which family an unknown malware sample belongs to or to distinguish between goodware and malware. Although its accuracy was higher than the one defined in the requirements, this approach was time-consuming, so we considered a second approach to reduce the time while obtaining the same or better accuracy. The second approach extracted features from all the images and then used a machine learning classifier to obtain a precise differentiation. With this second approach, the total time for 100,000 samples was less than 4 hours and the accuracy was 83.04%, which fulfills the specified requirements. The analysis used two heterogeneous datasets. The Malgenome dataset, which contains Android malware from 49 families, was used to perform the measurements and the different tests. The M0droid dataset, which contains goodware and malware Android applications, was used to corroborate the previous analysis and establish a final classification. Degree programme: Ingeniería Informática (Computer Engineering).
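
The first approach in this abstract, comparing an unknown sample's image against training images by subtraction, can be sketched as follows. The abstract does not specify how application bytes are mapped to pixels or how differences are aggregated, so the one-byte-per-pixel mapping, the fixed `width`, the sum of absolute differences, and all function names are assumptions. PNG encoding is omitted, and the images are kept as plain lists of rows.

```python
def bytes_to_grayscale(data, width=4):
    """Lay raw bytes out as rows of 0-255 pixels, zero-padding the last row."""
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def image_difference(img_a, img_b, width=4):
    """Sum of absolute pixel differences; missing rows count as black."""
    blank = [0] * width
    total = 0
    for r in range(max(len(img_a), len(img_b))):
        row_a = img_a[r] if r < len(img_a) else blank
        row_b = img_b[r] if r < len(img_b) else blank
        total += sum(abs(a - b) for a, b in zip(row_a, row_b))
    return total


def nearest_label(unknown, training):
    """Return the label of the training image closest to `unknown`.

    `training` is a list of (label, image) pairs.
    """
    return min(training, key=lambda item: image_difference(unknown, item[1]))[0]


# Made-up byte strings standing in for application files.
training = [
    ("family_a", bytes_to_grayscale(b"\x10\x10\x10\x10")),
    ("family_b", bytes_to_grayscale(b"\xf0\xf0\xf0\xf0")),
]
unknown = bytes_to_grayscale(b"\x12\x11\x10\x0f")
print(nearest_label(unknown, training))
```

Every unknown sample is compared against every training image, so the cost grows with the product of the two set sizes. This is consistent with the abstract's remark that the subtraction approach was time-consuming and motivated the feature-based classifier.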

    Malware detection using static analysis in android: A review of FeCO (features, classification, and obfuscation)

    Android is a free, open-source operating system (OS), which allows an in-depth understanding of its architecture. Many manufacturers therefore use this OS to produce mobile devices (smartphones, smartwatches, and smart glasses) under different brands, including Google Pixel, Motorola, Samsung, and Sony. Notably, the adoption of this OS has led to a rapid increase in the number of Android users. However, unethical authors tend to develop malware for these devices for wealth, fame, or private purposes. Although practitioners conduct intrusion detection analyses, such as static analysis, few review articles discuss the research efforts on this type of analysis. This study therefore discusses the articles published from 2009 to 2019 and analyses the steps in static analysis (reverse engineering, features, and classification) with a taxonomy. It then highlights the open research issues in static analysis. Overall, this study serves as guidance for novice security practitioners and expert researchers proposing novel research to detect malware through static analysis.