
    Assembly Code Clone Detection for Malware Binaries

    Malware, such as a virus or a trojan horse, refers to software designed specifically to gain unauthorized access to a computer system and perform malicious activities. To analyze a piece of malware, one may employ a reverse engineering approach to perform an in-depth analysis of its assembly code. Yet, the reverse engineering process is tedious and time-consuming. One way to speed up the analysis is to compare the disassembled malware with previously analyzed malware, identify similar functions in the assembly code, and transfer the comments from the previously analyzed software to the new malware. The challenge is how to efficiently identify similar code fragments (i.e., clones) in a large repository of assembly code. In this thesis, an assembly code clone detection system is presented. Its performance is evaluated in terms of accuracy, efficiency, scalability, and feasibility of finding clones in assembly code disassembled from both Microsoft Windows 7 DLL files and real-life malware binary files. Experimental results suggest that the proposed clone detection algorithm is effective. This system can serve as the basis for future development of assembly code clone detection.
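    The abstract does not spell out the detection algorithm, but a minimal sketch of one common assembly clone detection approach, instruction normalization followed by n-gram fingerprinting and Jaccard similarity, illustrates the core idea; the function names and regexes below are assumptions, not the thesis's actual implementation.

```python
# Minimal sketch of one common assembly clone detection approach
# (instruction normalization + n-gram fingerprinting); the thesis's
# actual algorithm may differ.
import re

def normalize(instruction: str) -> str:
    """Mask registers and immediates so clones that differ only in
    register allocation or constants still match."""
    instruction = re.sub(r'\b(e?[abcd]x|e?[sd]i|e?[sb]p|r\d+)\b', 'REG', instruction)
    instruction = re.sub(r'\b0x[0-9a-fA-F]+\b|\b\d+\b', 'IMM', instruction)
    return instruction

def fingerprint(asm_lines, n=4):
    """Set of n-grams over the normalized instruction sequence."""
    norm = [normalize(l.strip()) for l in asm_lines if l.strip()]
    return {tuple(norm[i:i + n]) for i in range(len(norm) - n + 1)}

def similarity(region_a, region_b, n=4):
    """Jaccard similarity between two code regions; values near 1.0
    indicate likely clones."""
    fa, fb = fingerprint(region_a, n), fingerprint(region_b, n)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)

# Example: same logic, different registers/constants -> similarity 1.0.
a = ["mov eax, 4", "add eax, ebx", "ret"]
b = ["mov ecx, 8", "add ecx, edx", "ret"]
print(similarity(a, b, n=2))
```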

    Assembly to Open Source Code Matching for Reverse Engineering and Malware Analysis

    The process of software reverse engineering and malware analysis often comprises a combination of static and dynamic analyses. The successful outcome of each step is tightly coupled with the functionalities of the tools and the skills of the reverse engineer. Even though automated tools are available for dynamic analysis, the static analysis process is a tedious and time-consuming task, as it requires manual work and strong expertise in assembly coding. In order to enhance and accelerate the reverse engineering process, we introduce a new dimension known as clone-based analysis. Recently, binary clone matching has been studied with a focus on detecting assembly (binary) clones. An alternative approach in clone analysis, which is studied in the present research, is concerned with assembly to source code matching. There are two major advantages in considering this extra dimension. The first is to avoid dealing with low-level assembly code in situations where the corresponding high-level code is available. The other is to avoid reverse engineering parts of the software that have been analyzed before. Clone-based analysis can significantly reduce the required time and improve the accuracy of static analysis. In this research, we present a framework for assembly to open-source code matching. Two types of analyses are provided by the framework, namely online and offline. The online analysis process triggers queries to online source code repositories based on features extracted from the functions at the assembly level. The result is the matched set of references to the open-source project files with similar features. The offline analysis assigns functionality tags and provides in-depth information regarding the potential functionality of a portion of the assembly file. It reports on function stack frames, prototypes, arguments, variables, return values, and low-level system calls. In addition, the offline analysis relies on a built-in dictionary of common user-level and kernel-level API functions that are used by malware to interact with the operating system. These functions are called for performing tasks such as file I/O, network communications, registry modification, and service manipulation. The offline analysis process has been expanded through an incremental learning mechanism, which results in improved detection of crypto-related functions in the disassembly. The other developed extension is a customized local code repository, which performs automated source code parsing, feature extraction, and dataset generation for code matching. We apply the framework in several reverse engineering and malware analysis scenarios, and we show that the underlying tools and techniques are effective in providing additional insights into the functionality, inner workings, and components of the target binaries.
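    As an illustration of the feature-based matching the framework describes, here is a minimal sketch that extracts coarse features (called API names, string constants) from a disassembled function and ranks candidate source files by feature overlap; the regexes, names, and scoring scheme are assumptions for the sketch, not the framework's actual implementation.

```python
# Hypothetical sketch: extract coarse features from a disassembled
# function and rank candidate source files by overlap. Not the
# framework's actual feature set or matching logic.
import re

API_CALL = re.compile(r'\bcall\s+(?:ds:)?(\w+)')   # e.g. "call CreateFileA"
STRING_REF = re.compile(r'"([^"]{4,})"')           # inline string literals

def extract_features(disassembly: str) -> set:
    """Union of called API names and referenced string constants."""
    return set(API_CALL.findall(disassembly)) | set(STRING_REF.findall(disassembly))

def rank_candidates(asm_features: set, source_index: dict) -> list:
    """source_index maps a source file path to the feature set produced
    by a source-code parser (identifiers, literals). Best match first."""
    scored = [(len(asm_features & feats) / max(len(feats), 1), path)
              for path, feats in source_index.items()]
    return sorted(scored, reverse=True)
```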

    Registry composition in ambient networks

    Ambient Networks (AN) is a new networking concept for beyond-3G systems. It is a product of the European Union's Sixth Framework Program (FP6). Network composition is a core concept of ANs. It allows dynamic, scalable, and uniform cooperation between heterogeneous networks. ANs can host various registries. These registries may be of different types (e.g., centralized, distributed), store heterogeneous types of information (e.g., raw data vs. aggregated data), and rely on different interfaces to access the stored information (i.e., protocols or programming interfaces). When ANs compose, the hosted registries need to compose as well. Registry composition is a sub-process of network composition. It provides seamless and autonomous access to the content of all of the registries in the composed network. This thesis proposes a new architecture for registry composition in ANs. The overall architecture is made up of four components: interface interworking, data interworking, negotiation, and signaling. Interface interworking enables dynamic intercommunication between registries with heterogeneous interfaces. Data interworking involves dynamically overcoming data heterogeneity (e.g., format and granularity). Interface and data interworking go beyond static interworking using gateways, as is done today. The negotiation component allows the negotiation of the composition agreement. Signaling coordinates and regulates the negotiation and the execution of the composition agreement. Requirements are derived and related work is reviewed. We propose a new functional entity and a new procedure to orchestrate the composition process. We also propose a new architecture for interface interworking, based on a peer-to-peer overlay network, and have built a proof-of-concept prototype. The interface interworking component is used as the basis of our new architecture for data interworking, which reuses mechanisms and algorithms from the federated database field. The thesis also proposes a new architecture for online negotiation. The architecture includes a template for composition agreement proposals and a negotiation protocol that was validated using SPIN. A new signaling framework is also proposed. It is based on the IETF Next Steps in Signaling (NSIS) framework and was validated using OPNET. Most of these contributions are now part of the AN concept, as defined by the European Union's Sixth Framework Program.
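    To make the negotiation component more concrete, the following is a minimal sketch of a bounded propose/counter/accept loop over a composition agreement; the Proposal fields and the relaxation strategy are invented for illustration and are not the thesis's actual agreement template or protocol.

```python
# Illustrative propose/counter/accept negotiation loop for a registry
# composition agreement; all field names are hypothetical.
from dataclasses import dataclass

@dataclass
class Proposal:
    data_format: str       # e.g. "raw" vs "aggregated"
    access_protocol: str   # interface used to reach the registry
    sync_interval_s: int   # how often composed registries reconcile

def negotiate(initiator_pref: Proposal, responder_accepts) -> Proposal | None:
    """Initiator relaxes its sync interval until the responder accepts,
    within a bounded number of rounds (as one would model-check in SPIN);
    returns None if no agreement is reached."""
    offer = initiator_pref
    for _ in range(5):                      # bounded rounds
        if responder_accepts(offer):
            return offer                    # agreement reached
        offer = Proposal(offer.data_format, offer.access_protocol,
                         offer.sync_interval_s * 2)
    return None                             # composition fails

# Example responder policy: accepts any offer syncing at most every 60 s.
agreement = negotiate(Proposal("aggregated", "LDAP", 10),
                      lambda p: p.sync_interval_s >= 60)
print(agreement)   # -> accepted offer with sync_interval_s=80
```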

    Dynamic communication across supply chain services

    This thesis deals with the design of communication protocol solutions across a Supply Chain Management System. These solutions are capable of operating in multi-agent environments and allow customers to order services online. The work, carried out as part of two Australian Research Council (ARC) grants, is divided into four main sections. The first section deals with a dynamic communication protocol, which aims at agent-to-agent interoperability in an open environment such as the Internet. In the second section, we propose a protocol correctness system that enables the detection of deadlock errors in communication protocols, together with a comparison of the proposed validation techniques against those currently in use. Next, the problem of routing and scheduling in the transport industry is tackled, resulting in the development of an autonomous route scheduling system, MIDAS (Mobile Intelligent Distributed Application Software). The MIDAS server uses wireless technology to communicate with different parts of the system, which is investigated in the final section of the thesis. The MIDAS system was tested on devices with a GSM-enabled network connection, with results indicating that it takes less than thirty seconds for information to be processed and transmitted. Future studies on this topic could extend the proposed systems using SOAP (Simple Object Access Protocol). While undertaking my PhD, I wrote the following five papers, which were published in various journals and conferences:
    1. Towards the Right Communication Protocol for Web Services, International Journal for Web Services Research (IJWSR), June 2005
    2. MIDAS - An Integrated E-Commerce Solution for the Australian Transport Industries, International Journal on Web Engineering and Technology (IJWET), 1(3), 353-373, October 2004
    3. MIDAS's Routing and Scheduling Approach for the Australian Transport Industries, International OTM (OntheMove) Workshops, November 2003
    4. An XML-based Conversational Protocol for Web Services, 18th ACM International Symposium on Applied Computing (SAC), 1179-1184, May 2003
    5. Towards Robust and Scalable Infrastructure for Web Service, IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), December 200
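    The protocol correctness system in the second section centers on deadlock detection. Below is a minimal sketch of the classic technique: explore the joint reachability graph of two communicating state machines with message channels, and flag any non-final global state with no enabled transition. The encoding is an assumption for illustration, not the thesis's exact system.

```python
# Deadlock detection by reachability analysis over two communicating
# finite-state machines with one-slot channels (a classic technique).
from collections import deque

def find_deadlocks(m1, m2, init1, init2, finals):
    """m1/m2 map a local state to a list of (op, msg, next_state),
    where op is '!' (send) or '?' (receive). Returns global states
    where neither peer can move and the protocol has not terminated."""
    start = (init1, init2, None, None)   # (state1, state2, chan 1->2, chan 2->1)
    seen, queue, deadlocks = {start}, deque([start]), []
    while queue:
        s1, s2, c12, c21 = queue.popleft()
        successors = []
        for op, msg, nxt in m1.get(s1, []):
            if op == '!' and c12 is None:          # send if channel empty
                successors.append((nxt, s2, msg, c21))
            elif op == '?' and c21 == msg:         # receive if message waiting
                successors.append((nxt, s2, c12, None))
        for op, msg, nxt in m2.get(s2, []):
            if op == '!' and c21 is None:
                successors.append((s1, nxt, c12, msg))
            elif op == '?' and c12 == msg:
                successors.append((s1, nxt, None, c21))
        if not successors and (s1, s2) not in finals:
            deadlocks.append((s1, s2, c12, c21))   # stuck, not terminated
        for g in successors:
            if g not in seen:
                seen.add(g)
                queue.append(g)
    return deadlocks

# Example: both peers wait to receive first -> immediate deadlock.
m1 = {'s0': [('?', 'ack', 's1')]}
m2 = {'t0': [('?', 'req', 't1')]}
print(find_deadlocks(m1, m2, 's0', 't0', finals={('s1', 't1')}))
```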

    Combining SOA and BPM Technologies for Cross-System Process Automation

    This paper summarizes the results of an industry case study that introduced a cross-system business process automation solution based on a combination of SOA and BPM standard technologies (i.e., BPMN, BPEL, WSDL). Besides discussing major weaknesses of the existing custom-built solution and comparing them against experiences with the developed prototype, the paper presents a course of action for transforming the current solution into the proposed one. This includes a general approach, consisting of four distinct steps, as well as specific action items to be performed for every step. The discussion also covers language and tool support and the challenges arising from the transformation.
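    As a loose illustration of what a BPEL-style cross-system process boils down to, the sketch below chains two web-service invocations with data flowing between them; the endpoints, operation names, and payload fields are made up for the example and do not come from the case study.

```python
# Loose illustration of a cross-system process: a sequence of
# web-service invocations, which in BPEL would be <invoke> activities.
# All endpoints and fields are hypothetical.
import urllib.request, json

def run_order_process(order: dict) -> dict:
    """Sequential orchestration: check stock in system A, then book
    shipping in system B, passing data between the two."""
    def invoke(url, payload):
        req = urllib.request.Request(url, json.dumps(payload).encode(),
                                     {"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    stock = invoke("http://inventory.example/checkStock", order)    # system A
    shipping = invoke("http://logistics.example/bookShipping",      # system B
                      {"order": order, "available": stock["inStock"]})
    return {"order": order["id"], "shipment": shipping["trackingId"]}
```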

    Cognitive Maps


    A Personal Research Agent for Semantic Knowledge Management of Scientific Literature

    The unprecedented rate of scientific publications is a major threat to the productivity of knowledge workers, who rely on scrutinizing the latest scientific discoveries for their daily tasks. Online digital libraries, academic publishing databases, and open access repositories grant access to a plethora of information that can overwhelm a researcher who is looking to obtain fine-grained knowledge relevant to the task at hand. This overload of information has encouraged researchers from various disciplines to look for new approaches to extracting, organizing, and managing knowledge from the immense amount of available literature in ever-growing repositories. In this dissertation, we introduce a Personal Research Agent that can help scientists in discovering, reading, and learning from scientific documents, primarily in the computer science domain. We demonstrate how a confluence of techniques from the Natural Language Processing and Semantic Web domains can construct a semantically rich knowledge base, based on an inter-connected graph of scholarly artifacts, effectively transforming scientific literature from written content in isolation into a queryable web of knowledge suitable for machine interpretation. The challenges of creating an intelligent research agent are manifold: the agent's knowledge base, analogous to its 'brain', must contain accurate information about the knowledge 'stored' in documents, and it also needs to know about its end-users' tasks and background knowledge. In our work, we present a methodology to extract the rhetorical structure (e.g., claims and contributions) of scholarly documents. We enhance our approach with entity linking techniques that allow us to connect the documents with the Linked Open Data (LOD) cloud, in order to enrich them with additional information from the web of open data. Furthermore, we devise a novel approach for the automatic profiling of scholarly users, thereby enabling the agent to personalize its services based on a user's background knowledge and interests. We demonstrate how we can automatically create a semantic vector-based representation of the documents and user profiles and utilize them to efficiently detect similar entities in the knowledge base. Finally, as part of our contributions, we present a complete architecture providing an end-to-end workflow for the agent to exploit the opportunities of linking a formal model of scholarly users and scientific publications.
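    The similarity detection over vector representations can be illustrated with a minimal sketch using bag-of-words vectors and cosine similarity; the dissertation's actual vector construction (e.g., LOD-enriched semantic features) may well differ, and all names below are hypothetical.

```python
# Minimal sketch of matching a user profile against documents via
# vector representations and cosine similarity; a stand-in for the
# dissertation's richer semantic vectors.
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Bag-of-words term frequencies."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

profile = vectorize("entity linking semantic web user profiling")
docs = {"paper1": vectorize("rhetorical structure of scholarly documents"),
        "paper2": vectorize("entity linking against the linked open data cloud")}
best = max(docs, key=lambda d: cosine(profile, docs[d]))   # -> "paper2"
print(best)
```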

    Enhancing Trust –A Unified Meta-Model for Software Security Vulnerability Analysis

    Over the last decade, a globalization of the software industry has taken place, facilitating the sharing and reuse of code across existing project boundaries. At the same time, such global reuse introduces new challenges to the Software Engineering community, with not only code being shared across systems but also any vulnerabilities it is exposed to. Hence, vulnerabilities found in APIs no longer affect only individual projects but might spread across projects and even across global software ecosystem borders. Tracing such vulnerabilities on a global scale becomes an inherently difficult task, with many of the resources required for the analysis not only growing at unprecedented rates but also being spread across heterogeneous sources. Software developers struggle to identify and locate the data required to take full advantage of these resources. The Semantic Web and its supporting technology stack have been widely promoted to model, integrate, and support interoperability among heterogeneous data sources. This dissertation introduces four major contributions to address these challenges: (1) It provides a literature review of the use of software vulnerability databases (SVDBs) in the Software Engineering community. (2) Based on findings from this literature review, we present SEVONT, a Semantic Web based modeling approach to support a formal and semi-automated approach for unifying vulnerability information resources. SEVONT introduces a multi-layer knowledge model which not only provides a unified knowledge representation, but also captures software vulnerability information at different abstraction levels to allow for seamless integration, analysis, and reuse of the modeled knowledge. The modeling approach takes advantage of Formal Concept Analysis (FCA) to guide knowledge engineers in identifying reusable knowledge concepts and modeling them. (3) A Security Vulnerability Analysis Framework (SV-AF) is introduced, which is an instantiation of the SEVONT knowledge model to support evidence-based vulnerability detection. The framework integrates vulnerability ontologies (and data) with existing Software Engineering ontologies, allowing for the use of Semantic Web reasoning services to trace and assess the impact of security vulnerabilities across project boundaries. (4) Several case studies are presented to illustrate the applicability and flexibility of our modeling approach, demonstrating that it can not only unify heterogeneous vulnerability data sources but also enable new types of vulnerability analysis.
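    The kind of cross-ontology query that SV-AF enables might look like the following rdflib/SPARQL sketch, which joins a project's declared dependencies with known-vulnerable library releases; the IRIs, properties, and file names are invented for illustration and are not SEVONT's actual vocabulary.

```python
# Hypothetical sketch of a cross-ontology vulnerability trace:
# join Software Engineering instance data (project dependencies) with
# SVDB-derived instance data (advisories). All IRIs/files are made up.
from rdflib import Graph

g = Graph()
g.parse("project_dependencies.ttl")   # hypothetical SE ontology data
g.parse("vulnerabilities.ttl")        # hypothetical vulnerability data

VULNERABLE_DEPS = """
PREFIX se:   <http://example.org/se#>
PREFIX vuln: <http://example.org/vuln#>
SELECT ?project ?library ?cve WHERE {
    ?project se:dependsOn ?release .
    ?release se:ofLibrary ?library .
    ?advisory vuln:affectsRelease ?release ;
              vuln:cveId ?cve .
}
"""
# Each result row links a project to a CVE it inherits via a dependency.
for project, library, cve in g.query(VULNERABLE_DEPS):
    print(f"{project} is exposed to {cve} via {library}")
```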

    A note on organizational learning and knowledge sharing in the context of communities of practice

    Please cite this publication as: Antonova, A. & Gourova, E. (2006). A note on organizational learning and knowledge sharing in the context of communities of practice. Proceedings of International Workshop in Learning Networks for Lifelong Competence Development, TENCompetence Conference. September 12th, Sofia, Bulgaria: TENCompetence. Retrieved June 30th, 2006, from http://dspace.learningnetworks.org
    The knowledge management (KM) literature emphasizes the impact of human factors on the successful implementation of KM within the organization. Isolated initiatives for promoting a learning organization and team collaboration, without taking into consideration knowledge-sharing limitations and constraints, can defeat further development of a KM culture. As an effective instrument for knowledge sharing, communities of practice (CoP) appear to overcome these constraints and foster human collaboration. This work has been sponsored by the EU project TENCompetence.