56 research outputs found

    Malware Detection Using Dynamic Analysis

    Get PDF
    In this research, we explore the field of dynamic analysis which has shown promis- ing results in the field of malware detection. Here, we extract dynamic software birth- marks during malware execution and apply machine learning based detection tech- niques to the resulting feature set. Specifically, we consider Hidden Markov Models and Profile Hidden Markov Models. To determine the effectiveness of this dynamic analysis approach, we compare our detection results to the results obtained by using static analysis. We show that in some cases, significantly stronger results can be obtained using our dynamic approach

    Heap graph based software theft detection

    Get PDF
    published_or_final_versio

    JSBiRTH: Dynamic javascript birthmark based on the run-time heap

    Get PDF
    JavaScript is currently the dominating client-side scripting language in the web community. However, the source code of JavaScript can be easily copied through a browser. The intellectual property right of the developers lacks protection. In this paper, we consider using dynamic software birthmark for JavaScript. Instead of using control flow trace (which can be corrupted by code obfuscation) and API (which may not work if the software does not have many API calls), we exploit the run-time heap, which reflects substantially the dynamic behavior of a program, to extract birthmarks. We introduce JSBiRTH, a novel software birthmark system for JavaScript based on the comparison of run-time heaps. We evaluated our system using 20 JavaScript programs with most of them being large-scale. Our system gave no false positive or false negative. Moreover, it is robust against code obfuscation attack. We also show that our system is effective in detecting partial code theft. © 2011 IEEE.published_or_final_versionThe 35th IEEE Annual Computer Software and Applications Conference (COMPSAC 2011), Munich, Germany, 18-22 July 2011. In Proceedings of 35th COMPSAC, 2011, p. 407-41

    Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs

    Full text link
    Binary code analysis allows analyzing binary code without having access to the corresponding source code. A binary, after disassembly, is expressed in an assembly language. This inspires us to approach binary analysis by leveraging ideas and techniques from Natural Language Processing (NLP), a rich area focused on processing text of various natural languages. We notice that binary code analysis and NLP share a lot of analogical topics, such as semantics extraction, summarization, and classification. This work utilizes these ideas to address two important code similarity comparison problems. (I) Given a pair of basic blocks for different instruction set architectures (ISAs), determining whether their semantics is similar or not; and (II) given a piece of code of interest, determining if it is contained in another piece of assembly code for a different ISA. The solutions to these two problems have many applications, such as cross-architecture vulnerability discovery and code plagiarism detection. We implement a prototype system INNEREYE and perform a comprehensive evaluation. A comparison between our approach and existing approaches to Problem I shows that our system outperforms them in terms of accuracy, efficiency and scalability. And the case studies utilizing the system demonstrate that our solution to Problem II is effective. Moreover, this research showcases how to apply ideas and techniques from NLP to large-scale binary code analysis.Comment: Accepted by Network and Distributed Systems Security (NDSS) Symposium 201

    Graphs Resemblance based Software Birthmarks through Data Mining for Piracy Control

    Get PDF
    The emergence of software artifacts greatly emphasizes the need for protecting intellectual property rights (IPR) hampered by software piracy requiring effective measures for software piracy control. Software birthmarking targets to counter ownership theft of software by identifying similarity of their origins. A novice birthmarking approach has been proposed in this paper that is based on hybrid of text-mining and graph-mining techniques. The code elements of a program and their relations with other elements have been identified through their properties (i.e code constructs) and transformed into Graph Manipulation Language (GML). The software birthmarks generated by exploiting the graph theoretic properties (through clustering coefficient) are used for the classifications of similarity or dissimilarity of two programs. The proposed technique has been evaluated over metrics of credibility, resilience, method theft, modified code detection and self-copy detection for programs asserting the effectiveness of proposed approach against software ownership theft. The comparative analysis of proposed approach with contemporary ones shows better results for having properties and relations of program nodes and for employing dynamic techniques of graph mining without adding any overhead (such as increased program size and processing cost)

    Detecting code theft via a static instruction trace birthmark for Java methods

    Full text link
    Abstract—A software birthmark is an inherent program characteristic that can identify a program. In this paper, we propose a static instruction trace birthmark to detect code theft of Java methods. Because the static instruction traces can reflect the algorithmic structure of a program, our birthmark can be used to detect algorithm theft which existing static birthmarks cannot handle. Because the static instruction traces are extracted by static analyses, they can be applied to library programs which previous dynamic birthmarks could not. We evaluate the proposed birthmark with respect to two criteria: credibility and resilience. Experimental result shows that our birthmark is more resilient than and at least as credible as the existing Java birthmarks. I

    Software similarity and classification

    Full text link
    This thesis analyses software programs in the context of their similarity to other software programs. Applications proposed and implemented include detecting malicious software and discovering security vulnerabilities

    Exploring ICMetrics to detect abnormal program behaviour on embedded devices

    Get PDF
    Execution of unknown or malicious software on an embedded system may trigger harmful system behaviour targeted at stealing sensitive data and/or causing damage to the system. It is thus considered a potential and significant threat to the security of embedded systems. Generally, the resource constrained nature of Commercial off-the-shelf (COTS) embedded devices, such as embedded medical equipment, does not allow computationally expensive protection solutions to be deployed on these devices, rendering them vulnerable. A Self-Organising Map (SOM) based and Fuzzy C-means based approaches are proposed in this paper for detecting abnormal program behaviour to boost embedded system security. The presented technique extracts features derived from processor's Program Counter (PC) and Cycles per Instruction (CPI), and then utilises the features to identify abnormal behaviour using the SOM. Results achieved in our experiment show that the proposed SOM based and Fuzzy C-means based methods can identify unknown program behaviours not included in the training set with 90.9% and 98.7% accuracy

    Exploiting JavaScript Birthmarking Techniques for Code Theft Detection

    Get PDF
    Este relatório visa a análise de técnicas de birthmarking para detetar o roubo de código. Um birthmark de um software, como o próprio nome indica, é um conjunto de características únicas que permitem identificar esse mesmo software. Para a deteção do roubo de código são extraídos birthmarks de dois programas, o original e um suspeito, e são comparados um com o outro, permitindo assim detetar o roubo caso sejam muito semelhantes ou iguais. Como hoje em dia a internet é cada vez mais utilizada e o código JavaScript é usado para grande parte das aplicações web, o roubo de código nesta área é um grande problema da atualidade. Tendo isto em conta, a solução final tem como objetivo esta mesma linguagem.São analisadas, cronologicamente e tendo em conta a relevância para o tema, algumas das técnicas de birthmarking existentes. As técnicas são analisadas individualmente e no final é feito um resumo e comparação de todas as técnicas. Como a maior parte das técnicas existentes não foram pensadas para JavaScript, a sua aplicabilidade à linguagem é também analisada e são tiradas conclusões acerca de bons candidatos à solução final. O objetivo final é construir uma ferramenta que, usando uma técnica de birthmarking, determine se dois programas JavaScript foram copiados, de modo a suportar alegações de roubo de código.The purpose of this dissertation is to analyse birthmarking techniques in order to detect code theft. A birthmark of a software is, as the name suggests, the set of unique characteristics that allow to identify that software. In order to detect the theft, the birthmarks of two programs are extracted, the suspect and the original, and compared to each other to check if they are too similar or identical. Nowadays the web applications are growing and JavaScript code is the most used in this field, therefore the theft in this area is a current problem. Because of that, the theft detection of programs developed in that language is the focus of this dissertation.Some techniques of birthmarking are analysed in chronological order and accordingly to the relevance for the theme. Each technique is analysed individually and in the end a comparison between them is made. Given that most of the techniques were not created for JavaScript, their applicability to the language is analysed. With those analysis, some conclusions about the best candidates to the final solution are drawn.The final goal is to develop a tool, that uses a birthmarking technique, to determine if two JavaScript programs were copied, in order to support code theft allegations
    corecore