162 research outputs found

    Malware Detection Using Dynamic Analysis

    Get PDF
    In this research, we explore the field of dynamic analysis which has shown promis- ing results in the field of malware detection. Here, we extract dynamic software birth- marks during malware execution and apply machine learning based detection tech- niques to the resulting feature set. Specifically, we consider Hidden Markov Models and Profile Hidden Markov Models. To determine the effectiveness of this dynamic analysis approach, we compare our detection results to the results obtained by using static analysis. We show that in some cases, significantly stronger results can be obtained using our dynamic approach

    Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs

    Full text link
    Binary code analysis allows analyzing binary code without having access to the corresponding source code. A binary, after disassembly, is expressed in an assembly language. This inspires us to approach binary analysis by leveraging ideas and techniques from Natural Language Processing (NLP), a rich area focused on processing text of various natural languages. We notice that binary code analysis and NLP share a lot of analogical topics, such as semantics extraction, summarization, and classification. This work utilizes these ideas to address two important code similarity comparison problems. (I) Given a pair of basic blocks for different instruction set architectures (ISAs), determining whether their semantics is similar or not; and (II) given a piece of code of interest, determining if it is contained in another piece of assembly code for a different ISA. The solutions to these two problems have many applications, such as cross-architecture vulnerability discovery and code plagiarism detection. We implement a prototype system INNEREYE and perform a comprehensive evaluation. A comparison between our approach and existing approaches to Problem I shows that our system outperforms them in terms of accuracy, efficiency and scalability. And the case studies utilizing the system demonstrate that our solution to Problem II is effective. Moreover, this research showcases how to apply ideas and techniques from NLP to large-scale binary code analysis.Comment: Accepted by Network and Distributed Systems Security (NDSS) Symposium 201

    JSBiRTH: Dynamic javascript birthmark based on the run-time heap

    Get PDF
    JavaScript is currently the dominating client-side scripting language in the web community. However, the source code of JavaScript can be easily copied through a browser. The intellectual property right of the developers lacks protection. In this paper, we consider using dynamic software birthmark for JavaScript. Instead of using control flow trace (which can be corrupted by code obfuscation) and API (which may not work if the software does not have many API calls), we exploit the run-time heap, which reflects substantially the dynamic behavior of a program, to extract birthmarks. We introduce JSBiRTH, a novel software birthmark system for JavaScript based on the comparison of run-time heaps. We evaluated our system using 20 JavaScript programs with most of them being large-scale. Our system gave no false positive or false negative. Moreover, it is robust against code obfuscation attack. We also show that our system is effective in detecting partial code theft. © 2011 IEEE.published_or_final_versionThe 35th IEEE Annual Computer Software and Applications Conference (COMPSAC 2011), Munich, Germany, 18-22 July 2011. In Proceedings of 35th COMPSAC, 2011, p. 407-41

    Heap graph based software theft detection

    Get PDF
    published_or_final_versio

    An Approach Ahead Product Counterfeiting Identification for BIRTHMARKS in Light of DYKIS

    Get PDF
    Programming skin pigmentation will be an exceptional trademark of a project. Thus, thinking about the birthmarks between those plaintiff What's more respondent projects gives a compelling methodology for programming counterfeiting identification. However, programming skin pigmentation era appearances two principle challenges: the non attendance of source book What's more different code confusion systems that endeavour should shroud the aspects of a system. We recommend another sort for product skin pigmentation known as progressive magic direction book grouping (DYKIS) that might a chance to be concentrated from an executable without the have for source book. Those counterfeiting identification calculation In view of our new birthmarks will be versatile to both powerless confusion strategies for example, compiler optimizations and solid confusion systems executed clinched alongside instruments for example, such that sand mark, allatori What's more upx. We recommended an instrument known as DYKIS-PD (DYKIS counterfeiting identification tool) Furthermore require on direct examinations ahead vast number about double projects

    Software similarity and classification

    Full text link
    This thesis analyses software programs in the context of their similarity to other software programs. Applications proposed and implemented include detecting malicious software and discovering security vulnerabilities

    Exploring ICMetrics to detect abnormal program behaviour on embedded devices

    Get PDF
    Execution of unknown or malicious software on an embedded system may trigger harmful system behaviour targeted at stealing sensitive data and/or causing damage to the system. It is thus considered a potential and significant threat to the security of embedded systems. Generally, the resource constrained nature of Commercial off-the-shelf (COTS) embedded devices, such as embedded medical equipment, does not allow computationally expensive protection solutions to be deployed on these devices, rendering them vulnerable. A Self-Organising Map (SOM) based and Fuzzy C-means based approaches are proposed in this paper for detecting abnormal program behaviour to boost embedded system security. The presented technique extracts features derived from processor's Program Counter (PC) and Cycles per Instruction (CPI), and then utilises the features to identify abnormal behaviour using the SOM. Results achieved in our experiment show that the proposed SOM based and Fuzzy C-means based methods can identify unknown program behaviours not included in the training set with 90.9% and 98.7% accuracy

    A Method for Detecting Abnormal Program Behavior on Embedded Devices

    Get PDF
    A potential threat to embedded systems is the execution of unknown or malicious software capable of triggering harmful system behavior, aimed at theft of sensitive data or causing damage to the system. Commercial off-the-shelf embedded devices, such as embedded medical equipment, are more vulnerable as these type of products cannot be amended conventionally or have limited resources to implement protection mechanisms. In this paper, we present a self-organizing map (SOM)-based approach to enhance embedded system security by detecting abnormal program behavior. The proposed method extracts features derived from processor's program counter and cycles per instruction, and then utilises the features to identify abnormal behavior using the SOM. Results achieved in our experiment show that the proposed method can identify unknown program behaviors not included in the training set with over 98.4% accuracy

    a framework for automated similarity analysis of malware

    Get PDF
    Malware, a category of software including viruses, worms, and other malicious programs, is developed by hackers to damage, disrupt, or perform other harmful actions on data, computer systems and networks. Malware analysis, as an indispensable part of the work of IT security specialists, aims to gain an in-depth understanding of malware code. Manual analysis of malware is a very costly and time-consuming process. As more malware variants are evolved by hackers who occasionally use a copy-paste-modify programming style to accelerate the generation of large number of malware, the effort spent in analyzing similar pieces of malicious code has dramatically grown. One approach to remedy this situation is to automatically perform similarity analysis on malware samples and identify the functions they share in order to minimize duplicated effort in analyzing similar codes of malware variants. In this thesis, we present a framework to match cloned functions in a large chunk of malware samples. Firstly, the instructions of the functions to be analyzed are extracted from the disassembled malware binary code and then normalized. We propose a new similarity metric and use it to determine the pair-wise similarity among malware samples based on the calculated similarity of their functions. The developed tool also includes an API class recognizer designed to determine probable malicious operations that can be performed by malware functions. Furthermore, it allows us to visualize the relationship among functions inside malware codes and locate similar functions importing the same API class. We evaluate this framework on three malware datasets including metamorphic viruses created by malware generation tools, real-life malware variants in the wild, and two well-known botnet trojans. The obtained experimental results confirm that the proposed framework is effective in detecting similar malware code
    corecore