Toward a Taxonomy of Clones in Source Code: A Case Study

Abstract

Code cloning, that is, the gratuitous duplication of source code within a software system, is an endemic problem in large, industrial systems [9, 7]. While there has been much research into techniques for clone detection and analysis, there has been relatively little empirical study characterizing how, where, and why clones occur in industrial software systems. In this paper, we present a preliminary categorization scheme for code clones, and we discuss how we have applied this taxonomy in a case study performed on the file system subsystem of the Linux operating system. Our case study yielded many surprising results, including that cloning is rampant both within particular file system implementations and across different ones, and that as many as 13% of the 4407 functions that are more than six lines long were involved in a clone-pair relationship.
