6,886 research outputs found

    The Impact of Programming Languages in Code Cloning

    Get PDF
    Code cloning is a duplication of source code fragments that frequently occurs in large software systems. Although different studies exist that evidence cloning benefits, several others expose its harmfulness, specifically upon inconsistent clone management. One important cause for the creation of software clones is the inherent abstraction capabilities and terseness of the programming language being used. This paper focuses on the features of two different programming languages, namely Java and Scala, and studies how different language constructs can induce or reduce code cloning. This study was further developed using our tool Kamino which provided clone detection and concrete values

    Towards Semantic Clone Detection, Benchmarking, and Evaluation

    Get PDF
    Developers copy and paste their code to speed up the development process. Sometimes, they copy code from other systems or look up code online to solve a complex problem. Developers reuse copied code with or without modifications. The resulting similar or identical code fragments are called code clones. Sometimes clones are unintentionally written when a developer implements the same or similar functionality. Even when the resulting code fragments are not textually similar but implement the same functionality they are still considered to be clones and are classified as semantic clones. Semantic clones are defined as code fragments that perform the exact same computation and are implemented using different syntax. Software cloning research indicates that code clones exist in all software systems; on average, 5% to 20% of software code is cloned. Due to the potential impact of clones, whether positive or negative, it is essential to locate, track, and manage clones in the source code. Considerable research has been conducted on all types of code clones, including clone detection, analysis, management, and evaluation. Despite the great interest in code clones, there has been considerably less work conducted on semantic clones. As described in this thesis, I advance the state-of-the-art in semantic clone research in several ways. First, I conducted an empirical study to investigate the status of code cloning in and across open-source game systems and the effectiveness of different normalization, filtering, and transformation techniques for detecting semantic clones. Second, I developed an approach to detect clones across .NET programming languages using an intermediate language. Third, I developed a technique using an intermediate language and an ontology to detect semantic clones. Fourth, I mined Stack Overflow answers to build a semantic code clone benchmark that represents real semantic code clones in four programming languages, C, C#, Java, and Python. Fifth, I defined a comprehensive taxonomy that identifies semantic clone types. Finally, I implemented an injection framework that uses the benchmark to compare and evaluate semantic code clone detectors by automatically measuring recall

    Evaluating Maintainability Prejudices with a Large-Scale Study of Open-Source Projects

    Full text link
    Exaggeration or context changes can render maintainability experience into prejudice. For example, JavaScript is often seen as least elegant language and hence of lowest maintainability. Such prejudice should not guide decisions without prior empirical validation. We formulated 10 hypotheses about maintainability based on prejudices and test them in a large set of open-source projects (6,897 GitHub repositories, 402 million lines, 5 programming languages). We operationalize maintainability with five static analysis metrics. We found that JavaScript code is not worse than other code, Java code shows higher maintainability than C# code and C code has longer methods than other code. The quality of interface documentation is better in Java code than in other code. Code developed by teams is not of higher and large code bases not of lower maintainability. Projects with high maintainability are not more popular or more often forked. Overall, most hypotheses are not supported by open-source data.Comment: 20 page

    Kevoree Modeling Framework (KMF): Efficient modeling techniques for runtime use

    Get PDF
    The creation of Domain Specific Languages(DSL) counts as one of the main goals in the field of Model-Driven Software Engineering (MDSE). The main purpose of these DSLs is to facilitate the manipulation of domain specific concepts, by providing developers with specific tools for their domain of expertise. A natural approach to create DSLs is to reuse existing modeling standards and tools. In this area, the Eclipse Modeling Framework (EMF) has rapidly become the defacto standard in the MDSE for building Domain Specific Languages (DSL) and tools based on generative techniques. However, the use of EMF generated tools in domains like Internet of Things (IoT), Cloud Computing or Models@Runtime reaches several limitations. In this paper, we identify several properties the generated tools must comply with to be usable in other domains than desktop-based software systems. We then challenge EMF on these properties and describe our approach to overcome the limitations. Our approach, implemented in the Kevoree Modeling Framework (KMF), is finally evaluated according to the identified properties and compared to EMF.Comment: ISBN 978-2-87971-131-7; N° TR-SnT-2014-11 (2014

    Stack Overflow in Github: Any Snippets There?

    Full text link
    When programmers look for how to achieve certain programming tasks, Stack Overflow is a popular destination in search engine results. Over the years, Stack Overflow has accumulated an impressive knowledge base of snippets of code that are amply documented. We are interested in studying how programmers use these snippets of code in their projects. Can we find Stack Overflow snippets in real projects? When snippets are used, is this copy literal or does it suffer adaptations? And are these adaptations specializations required by the idiosyncrasies of the target artifact, or are they motivated by specific requirements of the programmer? The large-scale study presented on this paper analyzes 909k non-fork Python projects hosted on Github, which contain 290M function definitions, and 1.9M Python snippets captured in Stack Overflow. Results are presented as quantitative analysis of block-level code cloning intra and inter Stack Overflow and GitHub, and as an analysis of programming behaviors through the qualitative analysis of our findings.Comment: 14th International Conference on Mining Software Repositories, 11 page

    Pirate plunder: game-based computational thinking using scratch blocks

    Get PDF
    Policy makers worldwide argue that children should be taught how technology works, and that the ‘computational thinking’ skills developed through programming are useful in a wider context. This is causing an increased focus on computer science in primary and secondary education. Block-based programming tools, like Scratch, have become ubiquitous in primary education (5 to 11-years-old) throughout the UK. However, Scratch users often struggle to detect and correct ‘code smells’ (bad programming practices) such as duplicated blocks and large scripts, which can lead to programs that are difficult to understand. These ‘smells’ are caused by a lack of abstraction and decomposition in programs; skills that play a key role in computational thinking. In Scratch, repeats (loops), custom blocks (procedures) and clones (instances) can be used to correct these smells. Yet, custom blocks and clones are rarely taught to children under 11-years-old. We describe the design of a novel educational block-based programming game, Pirate Plunder, which aims to teach these skills to children aged 9-11. Players use Scratch blocks to navigate around a grid, collect items and interact with obstacles. Blocks are explained in ‘tutorials’; the player then completes a series of ‘challenges’ before attempting the next tutorial. A set of Scratch blocks, including repeats, custom blocks and clones, are introduced in a linear difficulty progression. There are two versions of Pirate Plunder; one that uses a debugging-first approach, where the player is given a program that is incomplete or incorrect, and one where each level begins with an empty program. The game design has been developed through iterative playtesting. The observations made during this process have influenced key design decisions such as Scratch integration, difficulty progression and reward system. In future, we will evaluate Pirate Plunder against a traditional Scratch curriculum and compare the debugging-first and non-debugging versions in a series of studies

    Structured Review of the Evidence for Effects of Code Duplication on Software Quality

    Get PDF
    This report presents the detailed steps and results of a structured review of code clone literature. The aim of the review is to investigate the evidence for the claim that code duplication has a negative effect on code changeability. This report contains only the details of the review for which there is not enough place to include them in the companion paper published at a conference (Hordijk, Ponisio et al. 2009 - Harmfulness of Code Duplication - A Structured Review of the Evidence)
    • 

    corecore