167 research outputs found

    Image malware detection using deep learning

    Get PDF
    We are currently living in an area where artificial intelligence is making out every day to day life much easier to manage. Some researchers are continuously developing the codes of artificial intelligence to utilize the benefits of the human being. And there is the process called data mining, which is used in many domains, including finance, engineering, biomedicine, and cyber security. The utilization of data mining, artificial intelligence algorithms like deep learning is so vast that we can't even name them all. This technology has almost touched every industry and cyber security is the most beneficial. The process of enhancing cyber security with the help of deep learning methods has come out of the theory books and many organizations are utilizing them rather than using a traditional piece of software to defend against online threats. Especially in the field of recognizing and classifying codes or malware. And this is essential, because, with the advent of cloud computing and the Internet of Things, expand potential malware infection sites from PCs to any electronic device. This makes our day to day life very unsafe. In this post, first, we will describe in brief how deep learning can be the most useful and promising techniques to detect malware. Besides this we will go through a deep neural network,ResNet for malware dynamic behavior classification jobs

    Machine Learning and other Computational-Intelligence Techniques for Security Applications

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    Aprimorando a segurança do Android através de detecção de malware e geração automática de políticas

    Get PDF
    Orientadores: Paulo Lício de Geus, André Ricardo Abed GrégioTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Dispositivos móveis têm evoluído constantemente, recebendo novas funcionalidades e se tornando cada vez mais ubíquos. Assim, eles se tornaram alvos lucrativos para criminosos. Como Android é a plataforma líder em dispositivos móveis, ele se tornou o alvo principal de desenvolvedores de malware. Além disso, a quantidade de apps maliciosas encontradas por empresas de segurança que têm esse sistema operacional como alvo cresceu rapidamente nos últimos anos. Esta tese aborda o problema da segurança de tais dispositivos por dois lados: (i) analisando e identificando apps maliciosas e (ii) desenvolvendo uma política de segurança que pode restringir a superfície de ataque disponível para código nativo. Para tanto, foi desenvolvido um sistema para analisar apps dinamicamente, monitorando chamadas de API e chamadas de sistema. Destes traços de comportamento extraiu-se atributos, que são utilizados por um algoritmo de aprendizado de máquina para classificar apps como maliciosas ou benignas. Um dos problemas principais de sistemas de análise dinâmica é que eles possuem muitas diferenças em relação a dispositivos reais, e exemplares de malware podem usar essas características para identificar se estão sendo analisados, impedindo assim que as ações maliciosas sejam observadas. Para identificar apps maliciosas de Android que evadem análises, desenvolveu-se uma técnica que compara o comportamento de uma app em um dispositivo real e em um emulador. Identificou-se as ações que foram executadas apenas no sistema real e se a divergência foi causada por caminhos de código diferentes serem explorados ou por algum erro não relacionado. Por fim, realizou-se uma análise em larga escala de apps que utilizam código nativo, a fim de se identificar como este é usado por apps legítimas e também para se criar uma política de segurança que restrinja as ações de malware que usam este tipo de códigoAbstract: Mobile devices have been constantly evolving, receiving new functionalities and becoming increasingly ubiquitous. Thus, they became lucrative targets for miscreants. Since Android is the leading platform for mobile devices, it became the most popular choice for malware developers. Moreover, the amount of malicious apps, found by security companies, that target this platform rapidly increased in the last few years. This thesis approaches the security problem of such devices in two ways: (i) by analyzing and identifying malicious apps, and (ii) by developing a sandboxing policy that can restrict the attack surface available to native code. A system was developed to dynamically analyze apps, monitoring API calls and system calls. From these behavior traces attributes were extracted, which are used by a machine learning algorithm to classify apps as malicious or benign. One of the main problems of dynamic analysis systems is that they have many differences compared to real devices, and malware can leverage these characteristics to identify whether they are being analyzed or not, thus being able to prevent the malicious actions from being observed. To identify Android malware that evades analyses, a technique was developed to compare the behavior of an app on a real device and on an emulator. Actions that were only executed in the bare metal system were identified, recognizing whether the divergence was caused by different code paths being explored or by some unrelated error. Finally, a large-scale analysis of apps that use native code was performed, in order to identify how native code is used by benign apps and also to generate a sandboxing policy to restrict malware that use such codeDoutoradoCiência da ComputaçãoDoutor em Ciência da Computação23038.007604/2014-69, 12269/13-1CAPE

    Cyber Security and Critical Infrastructures

    Get PDF
    This book contains the manuscripts that were accepted for publication in the MDPI Special Topic "Cyber Security and Critical Infrastructure" after a rigorous peer-review process. Authors from academia, government and industry contributed their innovative solutions, consistent with the interdisciplinary nature of cybersecurity. The book contains 16 articles: an editorial explaining current challenges, innovative solutions, real-world experiences including critical infrastructure, 15 original papers that present state-of-the-art innovative solutions to attacks on critical systems, and a review of cloud, edge computing, and fog's security and privacy issues

    Effiziente und erklärbare Erkennung von mobiler Schadsoftware mittels maschineller Lernmethoden

    Get PDF
    In recent years, mobile devices shipped with Google’s Android operating system have become ubiquitous. Due to their popularity and the high concentration of sensitive user data on these devices, however, they have also become a profitable target of malware authors. As a result, thousands of new malware instances targeting Android are found almost every day. Unfortunately, common signature-based methods often fail to detect these applications, as these methods can- not keep pace with the rapid development of new malware. Consequently, there is an urgent need for new malware detection methods to tackle this growing threat. In this thesis, we address the problem by combining concepts of static analysis and machine learning, such that mobile malware can be detected directly on the mobile device with low run-time overhead. To this end, we first discuss our analysis results of a sophisticated malware that uses an ultrasonic side channel to spy on unwitting smartphone users. Based on the insights we gain throughout this thesis, we gradually develop a method that allows detecting Android malware in general. The resulting method performs a broad static analysis, gathering a large number of features associated with an application. These features are embedded in a joint vector space, where typical patterns indicative of malware can be automatically identified and used for explaining the decisions of our method. In addition to an evaluation of its overall detection and run-time performance, we also examine the interpretability of the underlying detection model and strengthen the classifier against realistic evasion attacks. In a large set of experiments, we show that the method clearly outperforms several related approaches, including popular anti-virus scanners. In most experiments, our approach detects more than 90% of all malicious samples in the dataset at a low false positive rate of only 1%. Furthermore, even on older devices, it offers a good run-time performance, and can output a decision along with a proper explanation within a few seconds, despite the use of machine learning techniques directly on the mobile device. Overall, we find that the application of machine learning techniques is a promising research direction to improve the security of mobile devices. While these techniques alone cannot defeat the threat of mobile malware, they at least raise the bar for malicious actors significantly, especially if combined with existing techniques.Die Verbreitung von Smartphones, insbesondere mit dem Android-Betriebssystem, hat in den vergangenen Jahren stark zugenommen. Aufgrund ihrer hohen Popularität haben sich diese Geräte jedoch zugleich auch zu einem lukrativen Ziel für Entwickler von Schadsoftware entwickelt, weshalb mittlerweile täglich neue Schadprogramme für Android gefunden werden. Obwohl verschiedene Lösungen existieren, die Schadprogramme auch auf mobilen Endgeräten identifizieren sollen, bieten diese in der Praxis häufig keinen ausreichenden Schutz. Dies liegt vor allem daran, dass diese Verfahren zumeist signaturbasiert arbeiten und somit schädliche Programme erst zuverlässig identifizieren können, sobald entsprechende Erkennungssignaturen vorhanden sind. Jedoch wird es für Antiviren-Hersteller immer schwieriger, die zur Erkennung notwendigen Signaturen rechtzeitig bereitzustellen. Daher ist die Entwicklung von neuen Verfahren nötig, um der wachsenden Bedrohung durch mobile Schadsoftware besser begegnen zu können. In dieser Dissertation wird ein Verfahren vorgestellt und eingehend untersucht, das Techniken der statischen Code-Analyse mit Methoden des maschinellen Lernens kombiniert, um so eine zuverlässige Erkennung von mobiler Schadsoftware direkt auf dem Mobilgerät zu ermöglichen. Die Methode analysiert hierfür mobile Anwendungen zunächst statisch und extrahiert dabei spezielle Merkmale, die eine Abbildung einer Applikation in einen hochdimensionalen Vektorraum ermöglichen. In diesem Vektorraum sind schließlich maschinelle Lernmethoden in der Lage, automatisch Muster zur Erkennung von Schadprogrammen zu finden. Die gefundenen Muster können dabei nicht nur zur Erkennung, sondern darüber hinaus auch zur Erklärung einer getroffenenen Entscheidung dienen. Im Rahmen einer ausführlichen Evaluation wird nicht nur die Erkennungsleistung und die Laufzeit der vorgestellten Methode untersucht, sondern darüber hinaus das gelernte Erkennungsmodell im Detail analysiert. Hierbei wird auch die Robustheit des Modells gegenüber gezielten Angriffe untersucht und verbessert. In einer Reihe von Experimenten kann gezeigt werden, dass mit dem vorgeschlagenen Verfahren bessere Ergebnisse erzielt werden können als mit vergleichbaren Methoden, sogar einschließlich einiger populärer Antivirenprogramme. In den meisten Experimenten kann die Methode Schadprogramme zuverlässig erkennen und erreicht Erkennungsraten von über 90% bei einer geringen Falsch-Positiv-Rate von 1%

    Information security and assurance : Proceedings international conference, ISA 2012, Shanghai China, April 2012

    Full text link

    Ensemble deep learning: A review

    Get PDF
    Ensemble learning combines several individual models to obtain better generalization performance. Currently, deep learning models with multilayer processing architecture is showing better performance as compared to the shallow or traditional classification models. Deep ensemble learning models combine the advantages of both the deep learning models as well as the ensemble learning such that the final model has better generalization performance. This paper reviews the state-of-art deep ensemble models and hence serves as an extensive summary for the researchers. The ensemble models are broadly categorised into ensemble models like bagging, boosting and stacking, negative correlation based deep ensemble models, explicit/implicit ensembles, homogeneous /heterogeneous ensemble, decision fusion strategies, unsupervised, semi-supervised, reinforcement learning and online/incremental, multilabel based deep ensemble models. Application of deep ensemble models in different domains is also briefly discussed. Finally, we conclude this paper with some future recommendations and research directions

    Efficient, Scalable, and Accurate Program Fingerprinting in Binary Code

    Get PDF
    Why was this binary written? Which compiler was used? Which free software packages did the developer use? Which sections of the code were borrowed? Who wrote the binary? These questions are of paramount importance to security analysts and reverse engineers, and binary fingerprinting approaches may provide valuable insights that can help answer them. This thesis advances the state of the art by addressing some of the most fundamental problems in program fingerprinting for binary code, notably, reusable binary code discovery, fingerprinting free open source software packages, and authorship attribution. First, to tackle the problem of discovering reusable binary code, we employ a technique for identifying reused functions by matching traces of a novel representation of binary code known as the semantic integrated graph. This graph enhances the control flow graph, the register flow graph, and the function call graph, key concepts from classical program analysis, and merges them with other structural information to create a joint data structure. Second, we approach the problem of fingerprinting free open source software (FOSS) packages by proposing a novel resilient and efficient system that incorporates three components. The first extracts the syntactical features of functions by considering opcode frequencies and performing a hidden Markov model statistical test. The second applies a neighborhood hash graph kernel to random walks derived from control flow graphs, with the goal of extracting the semantics of the functions. The third applies the z-score to normalized instructions to extract the behavior of the instructions in a function. Then, the components are integrated using a Bayesian network model which synthesizes the results to determine the FOSS function, making it possible to detect user-related functions. Third, with these elements now in place, we present a framework capable of decoupling binary program functionality from the coding habits of authors. To capture coding habits, the framework leverages a set of features that are based on collections of functionalityindependent choices made by authors during coding. Finally, it is well known that techniques such as refactoring and code transformations can significantly alter the structure of code, even for simple programs. Applying such techniques or changing the compiler and compilation settings can significantly affect the accuracy of available binary analysis tools, which severely limits their practicability, especially when applied to malware. To address these issues, we design a technique that extracts the semantics of binary code in terms of both data and control flow. The proposed technique allows more robust binary analysis because the extracted semantics of the binary code is generally immune from code transformation, refactoring, and varying the compilers or compilation settings. Specifically, it employs data-flow analysis to extract the semantic flow of the registers as well as the semantic components of the control flow graph, which are then synthesized into a novel representation called the semantic flow graph (SFG). We evaluate the framework on large-scale datasets extracted from selected open source C++ projects on GitHub, Google Code Jam events, Planet Source Code contests, and students’ programming projects and found that it outperforms existing methods in several respects. First, it is able to detect the reused functions. Second, it can identify FOSS packages in real-world projects and reused binary functions with high precision. Third, it decouples authorship from functionality so that it can be applied to real malware binaries to automatically generate evidence of similar coding habits. Fourth, compared to existing research contributions, it successfully attributes a larger number of authors with a significantly higher accuracy. Finally, the new framework is more robust than previous methods in the sense that there is no significant drop in accuracy when the code is subjected to refactoring techniques, code transformation methods, and different compilers
    • …
    corecore