2,241 research outputs found

    Structured Review of the Evidence for Effects of Code Duplication on Software Quality

    This report presents the detailed steps and results of a structured review of code clone literature. The aim of the review is to investigate the evidence for the claim that code duplication has a negative effect on code changeability. This report contains only those details of the review for which there was not enough space in the companion paper published at a conference (Hordijk, Ponisio et al. 2009 - Harmfulness of Code Duplication - A Structured Review of the Evidence).

    apk2vec: Semi-supervised multi-view representation learning for profiling Android applications

    Building behavior profiles of Android applications (apps) with holistic, rich, multi-view information (e.g., incorporating several semantic views of an app such as API sequences, system calls, etc.) would help serve downstream analytics tasks such as app categorization, recommendation and malware analysis significantly better. Towards this goal, we design a semi-supervised Representation Learning (RL) framework named apk2vec to automatically generate a compact representation (aka profile/embedding) for a given app. More specifically, apk2vec has the following three unique characteristics which make it an excellent choice for large-scale app profiling: (1) it encompasses information from multiple semantic views such as API sequences, permissions, etc., (2) being a semi-supervised embedding technique, it can make use of labels associated with apps (e.g., malware family or app category labels) to build high quality app profiles, and (3) it combines RL and feature hashing which allows it to efficiently build profiles of apps that stream over time (i.e., online learning). The resulting semi-supervised multi-view hash embeddings of apps could then be used for a wide variety of downstream tasks such as the ones mentioned above. Our extensive evaluations with more than 42,000 apps demonstrate that apk2vec's app profiles could significantly outperform state-of-the-art techniques in four app analytics tasks namely, malware detection, familial clustering, app clone detection and app recommendation. Comment: International Conference on Data Mining, 201
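    The feature hashing mentioned in point (3) can be illustrated with a minimal sketch of the generic hashing trick. This is not apk2vec's actual method: the function names, the view names, the vector size and the use of MD5 are all assumptions chosen for illustration. Each feature string is hashed to a fixed slot, so new features arriving in a stream never change the vector length.

```python
import hashlib


def hash_features(tokens, dim=16):
    """Map a bag of string features to a fixed-length vector (hashing trick).

    Hypothetical helper for illustration; not taken from apk2vec.
    """
    vec = [0.0] * dim
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h % dim                                # slot chosen by the hash
        sign = 1.0 if (h >> 63) & 1 == 0 else -1.0     # signed update from the top bit
        vec[index] += sign
    return vec


def multi_view_profile(views, dim=16):
    """Concatenate one hashed vector per view, in sorted view-name order.

    `views` maps an assumed view name (e.g. "api", "permission") to its features.
    """
    profile = []
    for name in sorted(views):
        prefixed = [f"{name}:{t}" for t in views[name]]
        profile.extend(hash_features(prefixed, dim))
    return profile


if __name__ == "__main__":
    app = {
        "api": ["getDeviceId", "sendTextMessage"],
        "permission": ["READ_SMS", "INTERNET"],
    }
    print(multi_view_profile(app, dim=8))
```

    The profile length depends only on `dim` and the number of views, which is what makes this kind of encoding convenient for apps that arrive over time.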

    On Using UML Diagrams to Identify and Assess Software Design Smells

    Deficiencies in software design or architecture can severely impede and slow down software development and maintenance. Bad smells and anti-patterns can indicate poor software design and suggest refactoring the affected source code fragment. In recent years, multiple techniques and tools have been proposed to assist software engineers in identifying smells and to guide them through the corresponding refactoring steps. However, these detection tools cover only a modest number of smells so far and also tend to produce false positives, which represent conscious constructs with symptoms similar or identical to actual bad smells (e.g., design patterns). These and other issues in the detection process call for a code or design review in order to identify (missed) design smells and/or re-assess detected smell candidates. UML diagrams are the quasi-standard for documenting software design and are often available in software projects. In this position paper, we investigate whether (and to what extent) UML diagrams can be used for identifying and assessing design smells. Based on a description of difficulties in the smell detection process, we discuss the importance of design reviews. We then investigate to what extent design documentation in terms of UML2 diagrams allows for representing and identifying software design smells. In particular, 14 kinds of design smells and their representability in UML class and sequence diagrams are analyzed. In addition, we discuss further challenges for UML-based identification and assessment of bad smells.

    Primary Structure and Catalytic Mechanism of the Epoxide Hydrolase from Agrobacterium radiobacter AD1

    The epoxide hydrolase gene from Agrobacterium radiobacter AD1, a bacterium that is able to grow on epichlorohydrin as the sole carbon source, was cloned by means of the polymerase chain reaction with two degenerate primers based on the N-terminal and C-terminal sequences of the enzyme. The epoxide hydrolase gene coded for a protein of 294 amino acids with a molecular mass of 34 kDa. An identical epoxide hydrolase gene was cloned from chromosomal DNA of the closely related strain A. radiobacter CFZ11. The recombinant epoxide hydrolase was expressed up to 40% of the total cellular protein content in Escherichia coli BL21(DE3) and the purified enzyme had a kcat of 21 s⁻¹ with epichlorohydrin. Amino acid sequence similarity of the epoxide hydrolase with eukaryotic epoxide hydrolases, haloalkane dehalogenase from Xanthobacter autotrophicus GJ10, and bromoperoxidase A2 from Streptomyces aureofaciens indicated that it belonged to the α/β-hydrolase fold family. This conclusion was supported by secondary structure predictions and analysis of the secondary structure with circular dichroism spectroscopy. The catalytic triad residues of epoxide hydrolase are proposed to be Asp107, His275, and Asp246. Replacement of these residues by Ala/Glu, Arg/Gln, and Ala, respectively, resulted in a dramatic loss of activity for epichlorohydrin. The reaction mechanism of epoxide hydrolase proceeds via a covalently bound ester intermediate, as was shown by single turnover experiments with the His275 → Arg mutant of epoxide hydrolase in which the ester intermediate could be trapped.

    Does BLEU Score Work for Code Migration?

    Statistical machine translation (SMT) is a fast-growing sub-field of computational linguistics. Until now, the most popular automatic metric to measure the quality of SMT has been the BiLingual Evaluation Understudy (BLEU) score. Lately, SMT along with the BLEU metric has been applied to a Software Engineering task named code migration. (In)validating the use of BLEU score could advance the research and development of SMT-based code migration tools. Unfortunately, there is no study to confirm or refute the use of BLEU score for source code. In this paper, we conducted an empirical study on BLEU score to (in)validate its suitability for the code migration task, given its inability to reflect the semantics of source code. In our work, we use human judgment as the ground truth to measure the semantic correctness of the migrated code. Our empirical study demonstrates that BLEU does not reflect translation quality due to its weak correlation with the semantic correctness of translated code. We provided counter-examples to show that BLEU is ineffective in comparing the translation quality between SMT-based models. Due to BLEU's ineffectiveness for the code migration task, we propose an alternative metric, RUBY, which considers lexical, syntactical, and semantic representations of source code. We verified that RUBY achieves a higher correlation coefficient with the semantic correctness of migrated code, 0.775 in comparison with 0.583 for BLEU score. We also confirmed the effectiveness of RUBY in reflecting the changes in translation quality of SMT-based translation models. With its advantages, RUBY can be used to evaluate SMT-based code migration models. Comment: 12 pages, 5 figures, ICPC '19 Proceedings of the 27th International Conference on Program Comprehensio
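    The comparison of a metric against human judgment can be sketched as a correlation computation. The abstract does not state which correlation coefficient was used, so Pearson's r is an assumption here, and the scores below are made-up placeholder values rather than data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values for illustration only (not results from the study).
    human_scores = [0.2, 0.5, 0.6, 0.9]
    metric_scores = [0.4, 0.3, 0.7, 0.8]
    print(round(pearson(metric_scores, human_scores), 3))
```

    A metric whose scores track the human ratings more closely yields a coefficient nearer to 1, which is the sense in which the abstract ranks RUBY above BLEU.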

    A novel approach for Software Clone detection using Data Mining in Software

    Code clones are similar program structures that recur in variant forms in software systems. Many techniques have been proposed to detect similar code fragments in software. Software maintenance is generally helped by their identification and subsequent unification. When patterns of simple clones recur, this indicates the presence of interesting higher-level similarities, called structural clones. Compared to simple clones, structural clones show a bigger picture of similarities. Structural clones, which are logical groups of simple clones, alleviate the problem of a huge number of clones. Detection of structural clones is essential for understanding the design of a system for better maintenance and for reengineering for reuse. In this paper, a technique to detect some useful types of structural clones is proposed. The novelty of the present approach comprises the formulation of the structural clone concept and the application of data mining techniques. An approach for implementing the proposed technique is described.

    Structured Review of Code Clone Literature

    This report presents the results of a structured review of code clone literature. The aim of the review is to assemble a conceptual model of clone-related concepts which helps us to reason about clones. This conceptual model unifies clone concepts from a wide range of literature, so that findings about clones can be compared with each other