Structured Review of the Evidence for Effects of Code Duplication on Software Quality
This report presents the detailed steps and results of a structured review of the code clone literature. The aim of the review is to investigate the evidence for the claim that code duplication has a negative effect on code changeability. The report contains only those details of the review that could not be included, for lack of space, in the companion paper published at a conference (Hordijk, Ponisio et al. 2009 - Harmfulness of Code Duplication - A Structured Review of the Evidence).
An Extended Stable Marriage Problem Algorithm for Clone Detection
Code cloning negatively affects industrial software and threatens
intellectual property. This paper presents a novel approach to detecting cloned
software by using a bijective matching technique. The proposed approach focuses
on increasing the range of similarity measures and thus enhancing the precision
of the detection. This is achieved by extending a well-known stable-marriage
problem (SMP) and demonstrating how matches between code fragments of different
files can be expressed. A prototype of the proposed approach is demonstrated on
a suitable scenario and shows a noticeable improvement in several features of
clone detection, such as scalability and accuracy.
Comment: 20 pages, 10 figures, 6 tables
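As a rough illustration of how matching code fragments between two files can be cast as a stable marriage problem, the sketch below runs the classic one-to-one Gale-Shapley algorithm on preference lists derived from similarity scores. It is not the extended algorithm the paper proposes. The fragment names, the similarity values and the helper `prefs_from_similarity` are assumptions made for the example.

```python
def stable_match(proposer_prefs, acceptor_prefs):
    """Classic one-to-one Gale-Shapley matching.

    Each argument maps an item to the other side's items, ordered from most
    to least preferred. Both sides must have equal size and complete lists.
    """
    # rank[a][p] = position of proposer p in acceptor a's list (lower is better)
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of next acceptor to try
    engaged = {}                                  # acceptor -> proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(a)
        if current is None:
            engaged[a] = p                # acceptor was unmatched
        elif rank[a][p] < rank[a][current]:
            engaged[a] = p                # acceptor prefers the new proposer
            free.append(current)
        else:
            free.append(p)                # rejected; p tries its next choice later
    return {p: a for a, p in engaged.items()}


def prefs_from_similarity(sim, rows, cols):
    """Build preference lists from a similarity table sim[(row, col)]."""
    row_prefs = {r: sorted(cols, key=lambda c: -sim[(r, c)]) for r in rows}
    col_prefs = {c: sorted(rows, key=lambda r: -sim[(r, c)]) for c in cols}
    return row_prefs, col_prefs


# Hypothetical similarity scores between fragments of file A and file B.
sim = {
    ("a1", "b1"): 0.9, ("a1", "b2"): 0.4,
    ("a2", "b1"): 0.8, ("a2", "b2"): 0.3,
}
a_prefs, b_prefs = prefs_from_similarity(sim, ["a1", "a2"], ["b1", "b2"])
matching = stable_match(a_prefs, b_prefs)
```

Here both `a1` and `a2` prefer `b1`, but `b1` prefers `a1`, so `a2` is matched with `b2`. The result is bijective, which is the property the abstract refers to.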
Stable Marriage Problem Based Adaptation for Clone Detection and Service Selection
Current software engineering tasks such as clone detection and service selection need
better detection and selection processes. Clone detection is the
process of finding duplicated code across a system for purposes such as removing
repeated portions during maintenance of a legacy system. Service selection is the process of
finding the appropriate web service that meets the consumer’s request. Both problems can
be converted into a matching problem.
Matching forms an essential part of software engineering activities. In this
research, a well-known mathematical problem, the Stable Marriage Problem (SMP), and its
variations are investigated to serve the matching processes of software engineering.
We aim to provide a competitive matching algorithm that detects cloned
software accurately and ensures high scalability, precision and recall. We also aim to apply
the matching algorithm to incoming requests and service profiles, treating each web service as
an intelligent independent object, so that services can accept or decline requests
(equal opportunity). This contrasts with the current search-based state of service selection, in which
a service cannot interact as an independent candidate.
To meet these aims, the traditional SMP algorithm has been extended to
support many-to-many cardinality. This adaptation is achieved by defining a selective
strategy, which is the main engine of the new adaptations. Two adaptations, Dual-Proposed
and Dual-Multi-Allocation, have been proposed for both service selection and clone
detection. The proposed SMP-based approach shows very competitive results compared
to existing software clone detection approaches, especially in identifying type 3 clones (copies with further
modifications such as updated, added and deleted statements). It performs the
detection with relatively high precision and recall compared to the CloneDR tool
and shows good scalability on a medium-sized program. For service selection, the proposed
approach has several advantages, such as service protection and service quality. The services
gain equal opportunity against the incoming requests. Therefore, intelligent service
interaction is achieved, and both stability and satisfaction of the candidates are ensured.
This dissertation makes several contributions. First, a new extended SMP algorithm
that introduces a selective strategy to accommodate many-to-many matching problems
and improve overall features. Second, a new SMP-based clone detection approach that detects
cloned software accurately and ensures high precision and recall. Finally, a new SMP-based
service selection approach that allows equal opportunity between services and requests,
which improves service protection and service quality.
Case studies are carried out as experiments with the proposed approach. They show
that the new adaptations can be applied effectively to clone detection and service selection
processes with several features (e.g. accuracy). It can be concluded that the match-based
approach is feasible and promising in the software engineering domain.
Royal Embassy of Saudi Arabia
Detecting Test Clones with Static Analysis
Large-scale software systems often have correspondingly complicated test suites, which are difficult for developers to construct and maintain. As systems evolve, engineers must update their test suites along with changes in the source code. Tests created by duplicating and modifying previously existing tests (clones) can complicate this task.
Several testing technologies have been proposed to mitigate cloning in tests, including parametrized unit tests and test theories. However, detecting opportunities to improve existing test suites is labour-intensive.
This thesis presents a novel technique for detecting similar tests based on type hierarchies and method calls in test code. Using this technique, we can track variable history and detect test clones based on test assertion similarity.
The thesis further includes results from our empirical study of 10 benchmark systems using this technique, which suggest that test clone detection with this technique will aid test de-duplication efforts in industrial systems.
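To make the idea of comparing tests by the calls their assertions involve more concrete, here is a minimal sketch that flags pairs of tests whose call sets overlap strongly. It uses plain Jaccard similarity as a stand-in and is not the static analysis described in the thesis. The test names, call sets, threshold and function names are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def clone_candidates(tests, threshold=0.6):
    """Return (test, test, score) triples whose call sets reach `threshold`."""
    names = sorted(tests)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = jaccard(tests[first], tests[second])
            if score >= threshold:
                pairs.append((first, second, score))
    return pairs


# Hypothetical input: each test mapped to the methods its assertions involve.
tests = {
    "testAddOne": {"List.add", "List.size", "assertEquals"},
    "testAddTwo": {"List.add", "List.size", "assertEquals"},
    "testClear": {"List.clear", "List.isEmpty", "assertTrue"},
}
candidates = clone_candidates(tests)
```

With this input only `testAddOne` and `testAddTwo` are reported, since they share every call, while `testClear` shares none. A pair flagged this way would be a candidate for merging into a parametrized test.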
Exploiting similarity patterns in web applications for enhanced genericity and maintainability
Ph.D., Doctor of Philosophy
Dependence Communities in Source Code
Dependence between components in natural systems is a well studied phenomenon in the form of biological and social networks. The concept of community structure arises from the analysis of social networks and has successfully been applied to complex networks in other fields such as biology, physics and computing.
We provide empirical evidence that dependence between statements in source code gives rise to community structure. This leads to the introduction of the concept of dependence communities in software and we provide evidence that they reflect the semantic concerns of a program.
Current slice-based cohesion and coupling metrics are undefined for procedures that do not have clearly defined output variables, and the definition of an output variable varies from study to study. We solve these problems by introducing corresponding new, more efficient forms of slice-based metrics in terms of maximal slices. We show that there is a strong correlation between these new metrics and the old metrics computed using output variables.
We conduct an investigation into dependence clusters, which are closely related to dependence communities. We undertake an empirical study using definitions of dependence clusters from previous studies and show that, while programs do contain large dependence clusters, over 75% of these are not ‘true’ dependence clusters.
We bring together the main elements of the thesis in a study of software quality, investigating their interrelated nature. We show that procedures that are members of multiple communities have low cohesion, programs with higher coupling have larger dependence communities, programs with large dependence clusters also have large dependence communities, and programs with high modularity have low coupling.
Dependence communities and maximal-slice-based metrics have many potential applications, including program comprehension, maintenance, debugging, refactoring, testing and software protection.
Are Decomposition Slices Clones?
When computing program slices on all variables in a system, we observed that many of these slices are the same. This leads to the question: are we looking at software clones? We discuss the genesis of this phenomenon and present some of the observations that led to the question. The answer to our query is not immediately clear. We end by presenting arguments both pro and con. Supporting the affirmative, we observed that some slice-clones are evidently the result of the usual genesis of software clones: failure to note appropriate abstractions. Also, slice-clones assist in program comprehension by coalescing into one program fragment the computations on many different variables. Opposing the proposition, we note that slice-clones do not arise due to programmer intent or the copying of existing idioms.
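The observation that many variables share the same slice can be sketched as a simple grouping step, assuming the slices have already been computed by some slicer. The variable names and statement line numbers below are hypothetical, and the function name `group_identical_slices` is ours, not the paper's.

```python
from collections import defaultdict


def group_identical_slices(slices):
    """Group variables whose slices contain exactly the same statements."""
    groups = defaultdict(list)
    for variable, statements in slices.items():
        # frozenset makes the statement set hashable so it can be a dict key
        groups[frozenset(statements)].append(variable)
    # Keep only slices shared by more than one variable.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Hypothetical slices: variable -> line numbers of the statements in its slice.
slices = {
    "total": {1, 2, 5, 7},
    "count": {1, 2, 5, 7},
    "name": {3, 4},
}
shared = group_identical_slices(slices)
```

Here `total` and `count` end up in one group because their slices are identical, while `name` is dropped. Each surviving group corresponds to what the abstract calls a slice-clone.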