5 research outputs found

    Key features for the characterization of Android malware families

    In recent years, mobile devices such as smartphones, tablets and wearables have become the new paradigm of user–computer interaction. The growing adoption of such devices also brings a growing number of potential security risks. The spread of mobile malware, particularly on popular and open platforms such as Android, has become a major concern. This paper focuses on malicious Android apps, addressing the problem of selecting the key features that support the characterization of such malware. Accurate detection and characterization of this software remains an open challenge, mainly because of its ever-changing nature and the open distribution channels of Android apps. Maximum relevance minimum redundancy (mRMR) and evolutionary algorithms guided by information correlation measures have been applied for feature selection on the well-known Android Malware Genome (Malgenome) dataset, to identify the most informative features for characterizing representative families of existing Android malware. This research has been partially supported through the project of the Spanish Ministry of Economy and Competitiveness RTC-2014-3059-4. The authors would also like to thank the BIO/BU01/15 and the Spanish Ministry of Science and Innovation PID 560300-2009-11.
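As a rough illustration of the maximum relevance minimum redundancy idea named in this abstract, the sketch below greedily picks features whose mutual information with the family label is high and whose mean mutual information with already-selected features is low. It is not the authors' implementation: the difference-form score, the function names, and the toy feature table are all assumptions made for illustration, and the evolutionary search mentioned in the abstract is not covered.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr(features, labels, k):
    """Greedy mRMR: pick up to k feature names by relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected = []
    while len(selected) < k and len(selected) < len(features):

        def score(name):
            # Relevance to the label, penalized by overlap with chosen features.
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        candidates = [name for name in features if name not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Hypothetical toy data: each list position is one app, values are binary features.
toy_features = {
    "SEND_SMS":      [0, 0, 0, 0, 1, 1, 1, 1],
    "SEND_SMS_copy": [0, 0, 0, 0, 1, 1, 1, 1],  # fully redundant with SEND_SMS
    "READ_CONTACTS": [0, 0, 1, 1, 0, 0, 1, 1],  # independent of SEND_SMS
    "INTERNET":      [1, 1, 1, 1, 1, 1, 1, 1],  # constant, carries no information
}
toy_families = ["A", "A", "B", "B", "C", "C", "D", "D"]
chosen = mrmr(toy_features, toy_families, k=2)
```

On this toy table the redundant copy and the constant feature are passed over in favour of the feature that adds new information about the family label.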

    A Type-Based Blocking Technique for Efficient Entity Resolution over Large-Scale Data

    In data integration, entity resolution (ER) is an important technique for improving data quality. Existing research typically assumes that the target dataset contains only string-type data and uses a single similarity metric. For larger, high-dimensional datasets, redundant information needs to be verified using traditional blocking or windowing techniques. In this work, we propose a novel ER method using a hybrid approach that combines type-based multiblocks, a varying window size, and more flexible similarity metrics. In our new ER workflow, we reduce the search space for entity pairs by constraining on redundant attributes and matching likelihood. We develop a reference implementation of the proposed approach and validate its performance using a real-life dataset from an Internet of Things project. We evaluate the data processing system using five standard metrics: effectiveness, efficiency, accuracy, recall, and precision. Experimental results indicate that the proposed approach could be a promising alternative for entity resolution and could feasibly be applied to real-world data cleaning for large datasets.
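The general shape of blocking combined with a sliding window and type-aware similarity can be sketched as follows. This is only a minimal sketch under stated assumptions, not the paper's reference implementation: the window is fixed rather than varying, the per-type metrics (relative numeric closeness and a `difflib` string ratio) are stand-ins, and the record fields, function names, and threshold are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Pick a similarity metric by value type: numeric closeness or string ratio."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        scale = max(abs(a), abs(b), 1e-9)
        return 1.0 - min(abs(a - b) / scale, 1.0)
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


def candidate_pairs(records, block_key, sort_key, window):
    """Group records into blocks, sort each block, and pair neighbours in a window."""
    blocks = {}
    for idx, rec in enumerate(records):
        blocks.setdefault(rec[block_key], []).append(idx)
    pairs = set()
    for members in blocks.values():
        members.sort(key=lambda i: str(records[i][sort_key]).lower())
        for pos, i in enumerate(members):
            # Compare each record only with the next (window - 1) records.
            for j in members[pos + 1 : pos + window]:
                pairs.add((min(i, j), max(i, j)))
    return pairs


def resolve(records, fields, block_key, sort_key, window=3, threshold=0.8):
    """Return index pairs whose mean per-field similarity reaches the threshold."""
    matches = []
    for i, j in sorted(candidate_pairs(records, block_key, sort_key, window)):
        score = sum(similarity(records[i][f], records[j][f]) for f in fields) / len(fields)
        if score >= threshold:
            matches.append((i, j))
    return matches


# Hypothetical IoT sensor records; field names are illustrative only.
sensors = [
    {"kind": "temperature", "name": "Boiler Room A", "height_m": 2.0},
    {"kind": "temperature", "name": "boiler room a", "height_m": 2.05},
    {"kind": "humidity", "name": "Boiler Room A", "height_m": 2.0},
    {"kind": "temperature", "name": "Roof Station", "height_m": 30.0},
]
duplicates = resolve(sensors, fields=["name", "height_m"], block_key="kind", sort_key="name")
```

Blocking on `kind` means the humidity sensor is never compared with the temperature sensors, which is where the reduction in search space comes from; the window then limits comparisons inside each block.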

    Proposed Framework to Improving Performance of Familial Classification in Android Malware

    Because of recent developments in hardware and software technologies for mobile phones, people depend on their smartphones more than ever before. Today, people conduct a variety of business, health, and financial transactions on their mobile devices. This trend has caused an influx of mobile applications that require users' sensitive information. As these applications have increased, so too has the number of malicious applications that may compromise users' sensitive information. Among all smartphone platforms, Android receives the most attention from security practitioners and researchers because of its large number of malicious applications. For the past twelve years, malicious Android applications have been clustered into groups for better identification. Characterizing malware families can improve the detection process and help in understanding malware patterns. However, detecting new malware families remains a challenge for the research community. In this research, a framework is proposed to improve the performance of familial classification in Android malware. The framework is named the Reverse Engineering Framework (RevEng). Within RevEng, applications' permissions were selected and then fed into machine learning algorithms. We created a reduced set of permissions using the Extremely Randomized Trees algorithm, which achieved high accuracy and a shorter execution time. Furthermore, we followed two approaches based on the extracted information. The first approach used a binary representation of the permissions. The second approach used feature importance: each selected permission was represented by its weight value rather than by the binary value used in the first approach. We compared the results of our two approaches with other relevant works. Our approaches achieved better results in both accuracy and time performance with a reduced number of permissions.
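The difference between the two representations described in this abstract can be sketched as below. This is a minimal sketch, not the RevEng code: the importance scores are hypothetical placeholders for what an Extremely Randomized Trees model would produce, and the function names and the choice of three permissions are assumptions for illustration.

```python
def select_permissions(importances, top_k):
    """Keep the top_k permissions ranked by importance score (highest first)."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:top_k]


def binary_vector(app_permissions, selected):
    """First approach: 1 if the app requests the permission, else 0."""
    return [1 if p in app_permissions else 0 for p in selected]


def weighted_vector(app_permissions, selected, importances):
    """Second approach: the permission's importance weight if requested, else 0.0."""
    return [importances[p] if p in app_permissions else 0.0 for p in selected]


# Hypothetical importance scores; in the paper these would come from
# the Extremely Randomized Trees algorithm.
importances = {
    "SEND_SMS": 0.40,
    "READ_PHONE_STATE": 0.25,
    "RECEIVE_BOOT_COMPLETED": 0.20,
    "INTERNET": 0.10,
    "VIBRATE": 0.05,
}
selected = select_permissions(importances, top_k=3)
app = {"SEND_SMS", "INTERNET", "RECEIVE_BOOT_COMPLETED"}
as_binary = binary_vector(app, selected)
as_weights = weighted_vector(app, selected, importances)
```

Both vectors have the same length and the same zero positions; the weighted form simply carries the importance of each requested permission into the classifier's input.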

    Analysis and detection of computer attacks using intelligent dimensionality reduction systems (Análisis y detección de ataques informáticos mediante sistemas inteligentes de reducción dimensional)

    Official Doctoral Programme in Marine Energy and Propulsion (Programa Oficial de Doutoramento en Enerxía e Propulsión Mariña). 5014P01 [Abstract] This research addresses the study and development of a methodology for detecting computer attacks using intelligent systems and dimensionality reduction techniques in the field of cybersecurity. The proposal divides the problem into two phases. The first consists of a dimensionality reduction of the original input space, projecting the data onto a lower-dimensional output space using linear and/or non-linear transformations that allow a better visualization of the internal structure of the dataset. In the second phase, a human expert contributes knowledge by labeling the samples based on the projections obtained and on experience with the problem. This novel proposal gives the end user a simple tool that provides intuitive and easily interpretable results, making it possible to face new threats to which the user has not previously been exposed, and it obtained highly satisfactory results in all the real cases in which it was applied. The developed system has been validated on three different real case studies, each building on the knowledge gained in the previous one. In the first case, a well-known Android malware dataset is analyzed, and the various malware families are characterized using classical dimensionality reduction techniques. The second case works on the same dataset but applies more advanced and emerging dimensionality reduction and visualization techniques, achieving a significant improvement in the results. The last case takes advantage of the knowledge gained from the two previous works and applies it to intrusion detection in computer systems using network data, in which attacks of various kinds occur during normal network operation.