78,684 research outputs found

    An Abstract Method Linearization for Detecting Source Code Plagiarism in Object-Oriented Environment

    Full text link
    Despite the fact that plagiarizing source code is a trivial task for most CS students, detecting such unethical behavior requires a considerable amount of effort. Thus, several plagiarism detection systems were developed to handle such issue. This paper extends Karnalim's work, a low-level approach for detecting Java source code plagiarism, by incorporating abstract method linearization. Such extension is incorporated to enhance the accuracy of low-level approach in term of detecting plagiarism in object-oriented environment. According to our evaluation, which was conducted based on 23 design-pattern source code pairs, our extended low-level approach is more effective than state-of-the-art and Karnalim's approach. On the one hand, when compared to state-of-the-art approach, our approach can generate less coincidental similarities and provide more accurate result. On the other hand, when compared to Karnalim's approach, our approach, at some extent, can generate higher similarity when simple abstract method invocation is incorporated.Comment: The 8th International Conference on Software Engineering and Service Scienc

    Structured Review of the Evidence for Effects of Code Duplication on Software Quality

    Get PDF
    This report presents the detailed steps and results of a structured review of code clone literature. The aim of the review is to investigate the evidence for the claim that code duplication has a negative effect on code changeability. This report contains only the details of the review for which there is not enough place to include them in the companion paper published at a conference (Hordijk, Ponisio et al. 2009 - Harmfulness of Code Duplication - A Structured Review of the Evidence)

    Structured Review of Code Clone Literature

    Get PDF
    This report presents the results of a structured review of code clone literature. The aim of the review is to assemble a conceptual model of clone-related concepts which helps us to reason about clones. This conceptual model unifies clone concepts from a wide range of literature, so that findings about clones can be compared with each other

    The System Kato: Detecting Cases of Plagiarism for Answer-Set Programs

    Full text link
    Plagiarism detection is a growing need among educational institutions and solutions for different purposes exist. An important field in this direction is detecting cases of source-code plagiarism. In this paper, we present the tool Kato for supporting the detection of this kind of plagiarism in the area of answer-set programming (ASP). Currently, the tool is implemented for DLV programs but it is designed to handle other logic-programming dialects as well. We review the basic features of Kato, introduce its theoretical underpinnings, and discuss an application of Kato for plagiarism detection in the context of courses on logic programming at the Vienna University of Technology

    MRFalign: Protein Homology Detection through Alignment of Markov Random Fields

    Full text link
    Sequence-based protein homology detection has been extensively studied and so far the most sensitive method is based upon comparison of protein sequence profiles, which are derived from multiple sequence alignment (MSA) of sequence homologs in a protein family. A sequence profile is usually represented as a position-specific scoring matrix (PSSM) or an HMM (Hidden Markov Model) and accordingly PSSM-PSSM or HMM-HMM comparison is used for homolog detection. This paper presents a new homology detection method MRFalign, consisting of three key components: 1) a Markov Random Fields (MRF) representation of a protein family; 2) a scoring function measuring similarity of two MRFs; and 3) an efficient ADMM (Alternating Direction Method of Multipliers) algorithm aligning two MRFs. Compared to HMM that can only model very short-range residue correlation, MRFs can model long-range residue interaction pattern and thus, encode information for the global 3D structure of a protein family. Consequently, MRF-MRF comparison for remote homology detection shall be much more sensitive than HMM-HMM or PSSM-PSSM comparison. Experiments confirm that MRFalign outperforms several popular HMM or PSSM-based methods in terms of both alignment accuracy and remote homology detection and that MRFalign works particularly well for mainly beta proteins. For example, tested on the benchmark SCOP40 (8353 proteins) for homology detection, PSSM-PSSM and HMM-HMM succeed on 48% and 52% of proteins, respectively, at superfamily level, and on 15% and 27% of proteins, respectively, at fold level. In contrast, MRFalign succeeds on 57.3% and 42.5% of proteins at superfamily and fold level, respectively. This study implies that long-range residue interaction patterns are very helpful for sequence-based homology detection. The software is available for download at http://raptorx.uchicago.edu/download/.Comment: Accepted by both RECOMB 2014 and PLOS Computational Biolog

    Mal-Netminer: Malware Classification Approach based on Social Network Analysis of System Call Graph

    Get PDF
    As the security landscape evolves over time, where thousands of species of malicious codes are seen every day, antivirus vendors strive to detect and classify malware families for efficient and effective responses against malware campaigns. To enrich this effort, and by capitalizing on ideas from the social network analysis domain, we build a tool that can help classify malware families using features driven from the graph structure of their system calls. To achieve that, we first construct a system call graph that consists of system calls found in the execution of the individual malware families. To explore distinguishing features of various malware species, we study social network properties as applied to the call graph, including the degree distribution, degree centrality, average distance, clustering coefficient, network density, and component ratio. We utilize features driven from those properties to build a classifier for malware families. Our experimental results show that influence-based graph metrics such as the degree centrality are effective for classifying malware, whereas the general structural metrics of malware are less effective for classifying malware. Our experiments demonstrate that the proposed system performs well in detecting and classifying malware families within each malware class with accuracy greater than 96%.Comment: Mathematical Problems in Engineering, Vol 201
    • โ€ฆ
    corecore