1,056 research outputs found

Fast and Highly Accurate Token-Based Code Clone Detection

Author: Murakami Hiroaki (村上寛明)
Publication venue: 'Springer Publishing Company'

Structured Review of the Evidence for Effects of Code Duplication on Software Quality

Author: Hordijk Wiebe
Ponisio María Laura
Wieringa Roel
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2009

This report presents the detailed steps and results of a structured review of code clone literature. The aim of the review is to investigate the evidence for the claim that code duplication has a negative effect on code changeability. This report contains only those details of the review that could not be included, for lack of space, in the companion paper published at a conference (Hordijk, Ponisio et al. 2009 - Harmfulness of Code Duplication - A Structured Review of the Evidence).

University of Twente Research Information

Structured Review of Code Clone Literature

Author: Hordijk Wiebe
Ponisio María Laura
Wieringa Roel
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2008

This report presents the results of a structured review of code clone literature. The aim of the review is to assemble a conceptual model of clone-related concepts that helps us reason about clones. This conceptual model unifies clone concepts from a wide range of literature, so that findings about clones can be compared with each other.

University of Twente Research Information

SOAP3-dp: Fast, Accurate and Sensitive GPU-based Short Read Aligner

Author: Chang Yu
Chi-Man Liu
David W Cheung
Edward Wu
Haoxiang Lin
Hing-Fung Ting
Jianqiao Zhu
Lap-Kei Lee
Ruibang Luo
Ruiqiang Li
Shaoliang Peng
Siu-Ming Yiu
Tak-Wah Lam
Thomas Wong
Wenjuan Zhu
Xiaoqian Zhu
Yingrui Li
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

To cope with the exponentially increasing throughput of Next-Generation Sequencing (NGS), most existing short-read aligners can be configured to favor speed at the expense of accuracy and sensitivity. SOAP3-dp, by leveraging the computational power of both CPU and GPU with optimized algorithms, delivers high speed and sensitivity simultaneously. Compared with widely adopted aligners including BWA, Bowtie2, SeqAlto and GEM, and GPU-based aligners including BarraCUDA and CUSHAW, SOAP3-dp is two to tens of times faster, while maintaining the highest sensitivity and lowest false discovery rate (FDR) on Illumina reads of different lengths. Unlike its predecessor SOAP3, which does not allow gapped alignment, SOAP3-dp by default tolerates alignment similarity as low as 60 percent. Evaluation on real human genome data demonstrates SOAP3-dp's power to enable more authentic variants and longer indels to be discovered. Fosmid sequencing shows a 9.1 percent FDR on newly discovered deletions. SOAP3-dp natively supports the BAM file format and provides the same scoring scheme as BWA, which enables it to be integrated into existing analysis pipelines. SOAP3-dp has been deployed on Amazon-EC2, NIH-Biowulf and Tianhe-1A.

Comment: 21 pages, 6 figures, submitted to PLoS ONE; additional files available at "https://www.dropbox.com/sh/bhclhxpoiubh371/O5CO_CkXQE". Comments most welcome.

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

HKU Scholars Hub

FigShare

Revealing Missing Bug-Fixes in Code Clones in Large-Scale Code Bases

Author: Juergens Elmar
Poehlmann Martin
Publication venue: European Association of Software Science and Technology
Publication date: 29/07/2013

When a bug is fixed in duplicated code, it is often necessary to modify all duplicates (so-called clones) accordingly. In practice, however, fixes are often incomplete, which causes the bug to remain in one or more of the clones. This paper presents an approach that detects such incomplete bug-fixes in cloned code by analyzing a system's version history to reveal those commits that fix problems. The approach then performs incremental clone detection to reveal those clones that became inconsistent as a result of such a fix. We present results from a case study that analyzed incomplete bug-fixes in six industrial and open-source systems to demonstrate the feasibility and effectiveness of our approach. We identified likely incomplete bug-fixes in all analyzed systems.

Electronic Communications of the EASST (European Association of Software Science and Technology)

A novel approach for Software Clone detection using Data Mining in Software

Author: G. Anil Kumar
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 26/02/2016

Code clones are similar program structures that recur in variant forms in software systems. Many techniques have been proposed to detect similar code fragments in software. Software maintenance is generally helped by the identification and subsequent unification of such fragments. When patterns of simple clones recur, this indicates the presence of interesting higher-level similarities, called structural clones. Compared to simple clones, structural clones show a bigger picture of the similarities. Because structural clones form logical groups of simple clones, they alleviate the problem of dealing with a huge number of clones. Detecting structural clones is essential for understanding the design of a system, both for better maintenance and for reengineering for reuse. This paper proposes a technique for detecting some useful types of structural clones. The novelty of the approach lies in the formulation of the structural clone concept and the application of data mining techniques. An approach to implementing the proposed technique is also described.

International Journal on Recent and Innovation Trends in Computing and Communication

Utilizing Code Clones to Detect Customer-Specific Differences in the Productization Process (Koodikloonien hyödyntäminen asiakaskohtaisten erojen havaitsemiseksi tuotteistusprosessissa)

Author: Salmivaara Erika
Publication date: 30/10/2017

The topic of this thesis was inspired by two case studies. The case studies are applications that are conceptually, but not technically, products. Their code bases contain customer-specific branches. The development strategy in both cases has been to fork an existing branch and customize it to the needs of the new client. Code reuse and forking can be an efficient or even necessary development strategy under time pressure. However, code duplication may make the code base harder to maintain, which in turn increases maintenance costs. Finding similar code fragments is the subject of code clone detection research. Code clones are code fragments that are either identical or similar. The similarity can be categorized into four types. Type I clones are exact matches that differ only in layout, whitespace or comments. In addition to type I changes, type II clones can differ in identifier names and types or literal values. Furthermore, type III clones can have statements added, deleted or modified within the code fragments under comparison. Type IV clones are functionally similar clones. There are different kinds of techniques and tools for both detecting and visualizing clones. Different techniques find different sets of clone types. Code clone visualizations present both an overview of the cloning situation and the details at the source code level. The branches of the same product in the case studies can be considered clones of each other. They are expected to resemble type III clones: they originate from the same code base, but each has statements added, deleted and modified in corresponding files relative to the other branches. Identifying these changes helps form an overall picture of how much the branches truly differ. The transformation from developing customer-specific software to product software is called productization. In order to productize, the differences between the branches must be determined. Each customization needs to be considered in the productization process to avoid reducing the value of the product. We defined a process for using code clone visualizations to explore differences between customer-specific branches. The conclusion of this thesis is that using code clones clearly expedites the productization process. The visualizations help locate the differences much faster than manual comparison. Code clone detection is applied to fade out the uninteresting differences between the branches. Hence, the method helps navigate to the truly interesting customizations that require manual inspection. The method also provides a general view of the cloning situation, which eases the task of estimating the workload. The process is applicable in situations where the diverged code bases are expected to resemble each other structurally, yet contain so many changes that a manual comparison of the branches with file comparison tools would be too time-consuming.
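The type I and type II definitions in the abstract above can be illustrated with a small token-normalization sketch. This is not the thesis's tooling: the function names `normalize` and `clone_type` are hypothetical, the fragments are assumed to be Python source, and only the standard-library `tokenize` module is used. Type III and type IV detection are out of scope here.

```python
import io
import keyword
import tokenize


def normalize(source: str, abstract_names: bool = False) -> list[str]:
    """Tokenize a fragment, discarding comments, blank lines and indent width.

    With abstract_names=True, identifiers and literals are replaced by
    placeholders, so fragments that differ only in those compare equal.
    """
    ignored = {tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER}
    structural = {tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in ignored:
            continue
        if tok.type in structural:
            # Keep block structure, but not the exact whitespace used.
            tokens.append(tokenize.tok_name[tok.type])
        elif (abstract_names and tok.type == tokenize.NAME
              and not keyword.iskeyword(tok.string)):
            tokens.append("ID")
        elif abstract_names and tok.type in (tokenize.NUMBER, tokenize.STRING):
            tokens.append("LIT")
        else:
            tokens.append(tok.string)
    return tokens


def clone_type(a: str, b: str) -> str:
    """Classify two fragments as type I, type II, or neither."""
    if normalize(a) == normalize(b):
        return "type I"
    if normalize(a, abstract_names=True) == normalize(b, abstract_names=True):
        return "type II"
    return "no type I/II match"


original = "def area(w, h):\n    return w * h  # rectangle\n"
relayout = "def area(w, h):\n\n        return w * h\n"
renamed = "def size(x, y):\n    return x * y\n"

print(clone_type(original, relayout))
print(clone_type(original, renamed))
```

Comparing whole token lists keeps the sketch short. A practical detector would compare sliding windows of tokens and tolerate inserted or deleted statements to reach type III clones.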

Aaltodoc Publication Archive

Enhancing source-based clone detection using intermediate representation

Author: Gehan M K Selim
King Chun Foo
Ying Zou
Publication date: 01/01/2010

Detecting software clones in large-scale projects helps improve the maintainability of large code bases. The source code representation (e.g., Java or C files) of a software system has traditionally been used for clone detection. In this paper, we propose a technique that transforms the source code to an intermediate representation, and then reuses established source-based clone detection techniques to detect clones in the intermediate representation. The clones are mapped back to the source code and are used to augment the results reported by source-based clone detection. We demonstrate the performance of our new technique using systems from the Bellon clone evaluation benchmark. The results show that our technique can detect Type 3 clones. On the Bellon corpus, our technique achieves higher recall with a minimal drop in precision. When complete clone groups are examined, our technique has higher precision than the standalone string-based and token-based clone detectors.

CiteSeerX

An Extended Stable Marriage Problem Algorithm for Clone Detection

Author: AlHakami Hosam
Chen Feng
Janicke Helge
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 01/01/2014

Code cloning negatively affects industrial software and threatens intellectual property. This paper presents a novel approach to detecting cloned software by using a bijective matching technique. The proposed approach focuses on increasing the range of similarity measures and thus enhancing the precision of the detection. This is achieved by extending a well-known stable-marriage problem (SMP) and demonstrating how matches between code fragments of different files can be expressed. A prototype of the proposed approach is provided using a proper scenario, which shows a noticeable improvement in several features of clone detection such as scalability and accuracy.

Comment: 20 pages, 10 figures, 6 tables
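The idea of expressing matches between code fragments of two files as a stable-marriage problem can be sketched with the classic Gale-Shapley algorithm. This is the textbook algorithm, not the paper's extension: the function names `similarity` and `stable_match` are hypothetical, and the similarity measure (`difflib.SequenceMatcher.ratio`) is an assumption chosen only because it ships with Python.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Textual similarity in [0, 1]; a stand-in for a real clone metric."""
    return SequenceMatcher(None, a, b).ratio()


def stable_match(left: list[str], right: list[str]) -> dict[int, int]:
    """Gale-Shapley matching where fragments of `left` propose to `right`.

    Returns a mapping from left index to right index. Fragments that
    exhaust their preference list stay unmatched.
    """
    score = [[similarity(l, r) for r in right] for l in left]
    # Each left fragment ranks the right fragments by descending similarity.
    prefs = [sorted(range(len(right)), key=lambda j: -score[i][j])
             for i in range(len(left))]
    next_choice = [0] * len(left)
    holder: dict[int, int] = {}  # right index -> left index it currently holds
    free = list(range(len(left)))
    while free:
        i = free.pop()
        if next_choice[i] >= len(right):
            continue  # no candidates left for this fragment
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        if j not in holder:
            holder[j] = i
        elif score[i][j] > score[holder[j]][j]:
            free.append(holder[j])  # the previous partner becomes free again
            holder[j] = i
        else:
            free.append(i)  # rejected; will try its next choice
    return {i: j for j, i in holder.items()}


file_a = ["int total = a + b;", "print(total);"]
file_b = ["print(sum);", "int sum = x + y;", "return 0;"]
print(stable_match(file_a, file_b))
```

The result is a one-to-one pairing in which no two fragments would both prefer each other over their assigned partners. A practical detector would additionally apply a similarity threshold before reporting a pair as a clone.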

arXiv.org e-Print Archive

CiteSeerX