An Extended Stable Marriage Problem Algorithm for Clone Detection
Code cloning negatively affects industrial software and threatens
intellectual property. This paper presents a novel approach to detecting cloned
software by using a bijective matching technique. The proposed approach focuses
on increasing the range of similarity measures and thus enhancing the precision
of the detection. This is achieved by extending a well-known stable-marriage
problem (SMP) and demonstrating how matches between code fragments of different
files can be expressed. A prototype of the proposed approach is provided using
a proper scenario, which shows a noticeable improvement in several features of
clone detection such as scalability and accuracy. Comment: 20 pages, 10 figures, 6 tables
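As a rough illustration of how a stable-marriage matching can pair code fragments from two files, the sketch below runs the classic Gale-Shapley procedure over a similarity score. It is not the paper's extended algorithm: the `similarity` measure (based on `difflib`), the function names, and the assumption that both files contribute the same number of fragments are all choices made for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Assumed similarity measure: character-level ratio from difflib."""
    return SequenceMatcher(None, a, b).ratio()


def stable_match(left, right):
    """Classic Gale-Shapley: left fragments propose to right fragments.

    Returns a dict mapping each left index to its matched right index.
    Assumes len(left) == len(right), so the matching is a bijection.
    """
    n = len(left)
    if n != len(right):
        raise ValueError("this sketch needs equally sized fragment lists")

    score = [[similarity(l, r) for r in right] for l in left]
    # Each left fragment ranks right fragments by decreasing similarity.
    prefs = [sorted(range(n), key=lambda j: -score[i][j]) for i in range(n)]

    next_choice = [0] * n           # next position in prefs[i] to propose to
    partner_of_right = [None] * n   # current left partner of each right fragment
    free = list(range(n))

    while free:
        i = free.pop()
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        current = partner_of_right[j]
        if current is None:
            partner_of_right[j] = i
        elif score[i][j] > score[current][j]:
            # The right fragment prefers the new proposer; release the old one.
            partner_of_right[j] = i
            free.append(current)
        else:
            free.append(i)

    return {partner_of_right[j]: j for j in range(n)}


file_a = ["int a = b + c;", "return x;", "for (i = 0; i < n; i++)"]
file_b = ["return y;", "for (j = 0; j < n; j++)", "int a = b + d;"]
matches = stable_match(file_a, file_b)
```

Here each fragment of `file_a` ends up paired with its closest counterpart in `file_b`, and no two fragments would both prefer each other over their assigned partners.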
Structured Review of Code Clone Literature
This report presents the results of a structured review of code clone literature. The aim of the review is to assemble a conceptual model of clone-related concepts which helps us to reason about clones. This conceptual model unifies clone concepts from a wide range of literature, so that findings about clones can be compared with each other
A comparative analysis of web-based GIS applications using usability metrics
With the rapid expansion of the internet, Web-based Geographic Information System (WGIS) applications have gained popularity, even though their interfaces are difficult to learn and understand because special functions are needed to manipulate the maps. Hence, it is essential to evaluate the usability of WGIS applications. Usability is an important factor in ensuring the development of quality, usable software products. However, the literature contains a number of standards and models, each of which describes usability in terms of a different set of attributes. These models are vague and difficult to understand. Therefore, the primary purpose of this study is to compare five common usability models (Shackel, Nielsen, ISO 9241 P-11, ISO 9126-1 and QUIM) to identify the usability metrics most frequently used in those models. A questionnaire and an automated usability evaluation using the Loop11 tool were used to evaluate the usability metrics for three case studies of commonly used WGIS applications: Google Maps, Yahoo Maps, and MapQuest. Finally, the case studies were compared and analysed based on the identified usability metrics. The comparative study identified four usability metrics (Effectiveness, Efficiency, Satisfaction and Learnability). These metrics were found to be consistent, comprehensive, unambiguous, and appropriate for evaluating the usability of WGIS applications. In addition, there was a positive correlation between these usability metrics. The comparative analysis indicates that, for the three case studies, Effectiveness, Satisfaction and Learnability were higher and Efficiency was lower with the Loop11 tool than with the questionnaire method. In addition, Yahoo Maps and MapQuest scored lower than Google Maps on the usability metrics under both methods. Therefore, Google Maps is more usable than Yahoo Maps and MapQuest
Detecting differences across multiple instances of code clones
Clone detectors find similar code fragments (i.e., instances of code clones) and report large numbers of them for industrial systems. To maintain or manage code clones, developers often have to investigate differences of multiple cloned code fragments. However, existing program differencing techniques compare only two code fragments at a time. Developers then have to manually combine several pairwise differencing results. In this paper, we present an approach to automatically detecting differences across multiple clone instances. We have implemented our approach as an Eclipse plugin and evaluated its accuracy with three Java software systems. Our evaluation shows that our algorithm has precision over 97.66% and recall over 95.63% in three open source Java projects. We also conducted a user study of 18 developers to evaluate the usefulness of our approach for eight clone-related refactoring tasks. Our study shows that our approach can significantly improve developers' performance in refactoring decisions, refactoring details, and task completion time on clone-related refactoring tasks. Automatically detecting differences across multiple clone instances also opens opportunities for building practical applications of code clones in software maintenance, such as auto-generation of application skeleton, intelligent simultaneous code editing
Comparison and Evaluation of Clone Detection Tools
Many techniques for detecting duplicated source code (software clones) have been proposed in the past. However, it is not yet clear how these techniques compare in terms of recall and precision as well as space and time requirements. This paper presents an experiment that evaluates six clone detectors based on eight large C and Java programs (altogether almost 850 KLOC). Their clone candidates were evaluated by one of the authors as an independent third party. The selected techniques cover the whole spectrum of the state-of-the-art in clone detection. The techniques work on text, lexical and syntactic information, software metrics, and program dependency graphs
Dealing with clones in software : a practical approach from detection towards management
Despite the fact that duplicated fragments of code, also called code clones, are considered one of the prominent code smells that may exist in software, cloning is widely practiced in industrial development. The larger the system, the more people are involved in its development and the more parts are developed by different teams, which increases the possibility of cloned code in the system. While there are particular benefits of code cloning in software development, research shows that it might be a source of various troubles in evolving software. Therefore, investigating and understanding clones in a software system is important to manage the clones efficiently. However, when the system is fairly large, it is challenging to identify and manage those clones properly. Among the various types of clones that may exist in software, research shows that detection of near-miss clones, where there might be minor to significant differences (e.g., renaming of identifiers and additions/deletions/modifications of statements) among the cloned fragments, is costly in terms of time and memory. Thus, there is a great demand for state-of-the-art technologies in dealing with clones in software.
Over the years, several tools have been developed to detect and visualize exact and similar clones. However, the tools are usually standalone and do not integrate well with a software developer's workflow. In this thesis, first, a study is presented on the effectiveness of a fingerprint-based data similarity measurement technique named 'simhash' in detecting clones in large-scale code bases. Based on the positive outcome of the study, a time-efficient detection approach is proposed to find exact and near-miss clones in software, especially in large-scale software systems. The novel detection approach has been made available as a highly configurable and fully fledged standalone clone detection tool named 'SimCad', which can be configured for detection of clones in both source code and non-source code based data. Second, we show a robust use of the clone detection approach studied earlier by assembling its detection service as a portable library named 'SimLib'. This library can provide tightly coupled (integrated) clone detection functionality to other applications, as opposed to the loosely coupled service provided by a typical standalone tool. Because it is highly configurable and easily extensible, this library allows the user to customize its clone detection process for detecting clones in data having diverse characteristics. We performed a user study to get feedback on the installation and use of the 'SimLib' API (Application Programming Interface) and to uncover its potential use as a third-party clone detection library. Third, we investigated what tools and techniques are currently in use to detect and manage clones and to understand their evolution. The goal was to find how those tools and techniques can be made available in a developer's own software development platform for convenient identification, tracking and management of clones in the software.
Based on that, we developed a clone-aware software development platform named 'SimEclipse' to promote the practical use of code clone research and to provide better support for clone management in software. Finally, we evaluated 'SimEclipse' by conducting a user study on its effectiveness, usability and information management. We believe that both researchers and developers would benefit from using these tools in different aspects of code clone research and in managing cloned code in software systems
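To make the fingerprint idea behind 'simhash' concrete, here is a minimal sketch of a generic simhash over code tokens, with Hamming distance as the closeness measure. It is not SimCad's or SimLib's implementation: the MD5-based token hash, the 64-bit width, the whitespace tokenization, and the function names are assumptions chosen for illustration.

```python
import hashlib


def simhash(tokens, bits=64):
    """Generic simhash: each token votes +1 or -1 on every bit position."""
    weights = [0] * bits
    for tok in tokens:
        # Assumed token hash: first 8 bytes of MD5, read as a 64-bit integer.
        h = int.from_bytes(hashlib.md5(tok.encode("utf-8")).digest()[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    # A fingerprint bit is set when the positive votes outnumber the negative.
    return sum(1 << i for i in range(bits) if weights[i] > 0)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


original = "int total = 0 ; for ( i = 0 ; i < n ; i ++ ) total += a [ i ] ;".split()
renamed = "int sum = 0 ; for ( i = 0 ; i < n ; i ++ ) sum += a [ i ] ;".split()

# Fragments whose fingerprints differ in few bits are near-miss candidates.
distance = hamming(simhash(original), simhash(renamed))
```

Because a fingerprint is a single integer, candidate clones can be found by comparing or indexing fingerprints instead of comparing full fragment texts, which is what makes the technique attractive for large code bases. The distance threshold that separates clones from non-clones is a tuning choice and is not specified here.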
Survey of Research on Software Clones
This report summarizes my overview talk on software clone detection
research. It first discusses the notion of software redundancy, cloning, duplication,
and similarity. Then, it describes various categorizations of clone types, empirical
studies on the root causes for cloning, current opinions and wisdom of consequences
of cloning, empirical studies on the evolution of clones, ways to remove, to avoid,
and to detect them, empirical evaluations of existing automatic clone detector performance
(such as recall, precision, time and space consumption) and their fitness
for a particular purpose, benchmarks for clone detector evaluations, presentation
issues, and last but not least application of clone detection in other related fields.
After each summary of a subarea, I am listing open research questions
A Novel Approach for Code Clone Detection Using Hybrid Technique
Code clones have been studied for a long time, and there is strong evidence that they are a major source of software faults. The copying of code has been studied within software engineering mostly in the area of clone analysis. Software clones are regions of source code which are highly similar; these regions of similarity are called clones, clone classes, or clone pairs. In this paper, a hybrid approach that combines a metric-based technique with a text-based technique is proposed for the detection and reporting of clones. The proposed work is divided into two stages: selection of potential clones, and comparison of the potential clones using textual comparison. The proposed technique detects exact clones on the basis of a metric match and then a text match
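The two stages described above can be sketched as follows. This is only an illustration under stated assumptions: the three metrics (non-blank line count, token count, branch-keyword count), the whitespace normalization, and the function names are invented for the example and are not taken from the paper.

```python
import re
from collections import defaultdict
from itertools import combinations


def metrics(fragment):
    """Stage 1 metrics (assumed): non-blank lines, tokens, branch keywords."""
    lines = [ln for ln in fragment.splitlines() if ln.strip()]
    tokens = re.findall(r"\w+|[^\w\s]", fragment)
    branches = sum(tokens.count(k) for k in ("if", "for", "while"))
    return (len(lines), len(tokens), branches)


def normalize(fragment):
    """Collapse all whitespace so layout differences do not matter."""
    return " ".join(fragment.split())


def exact_clones(fragments):
    """Return pairs of fragment names that pass the metric and text match."""
    # Stage 1: fragments with identical metric vectors are potential clones.
    buckets = defaultdict(list)
    for name, text in fragments.items():
        buckets[metrics(text)].append(name)

    # Stage 2: textual comparison only inside each bucket.
    pairs = []
    for names in buckets.values():
        for a, b in combinations(names, 2):
            if normalize(fragments[a]) == normalize(fragments[b]):
                pairs.append((a, b))
    return pairs


fragments = {
    "f1": "if (x) {\n  y = 1;\n}",
    "f2": "if (x) {\n    y = 1;\n}",
    "f3": "if (z) {\n  y = 2;\n}",
}
clones = exact_clones(fragments)
```

All three fragments share the same metric vector, so the metric stage keeps them as candidates; the text stage then reports only `f1` and `f2`, which differ in indentation alone.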
Clone Detection via Structural Abstraction
This paper describes the design, implementation, and
application of a new algorithm to detect cloned code. It
operates on the abstract syntax trees formed by many compilers
as an intermediate representation. It extends prior
work by identifying clones even when arbitrary subtrees
have been changed. On a 440,000-line code corpus, 20-50% of
the clones it detected were missed by previous methods.
The method also identifies cloning in declarations, so
it is somewhat more general than conventional procedural
abstraction
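A much simpler stand-in for tree-based detection can show the general idea of comparing syntax trees rather than text. The sketch below is not the paper's algorithm: it abstracts only identifiers and constants (not arbitrary subtrees), it works on Python source through the standard `ast` module purely for convenience, and the names `_Abstract` and `shape` are hypothetical.

```python
import ast


class _Abstract(ast.NodeTransformer):
    """Replace every identifier and constant with a fixed placeholder."""

    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_Constant(self, node):
        return ast.copy_location(ast.Constant(value=0), node)


def shape(source):
    """Return a string describing the tree structure with leaves abstracted."""
    tree = _Abstract().visit(ast.parse(source))
    return ast.dump(tree)


# Same structure, different names and constants: treated as clones.
same = shape("x = a + 1") == shape("y = b + 2")
# Different operator: the trees differ, so this pair is not reported.
different = shape("x = a + 1") != shape("x = a * 1")
```

Two fragments are reported as structural clones when their abstracted trees are identical. Tolerating changes in whole subtrees, as the abstract describes, would need a more general comparison than this equality check.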