Leveraging Historical Associations between Requirements and Source Code to Identify Impacted Classes
As new requirements are introduced and implemented in a software system,
developers must identify the set of source code classes which need to be
changed. Therefore, past effort has focused on predicting the set of classes
impacted by a requirement. In this paper, we introduce and evaluate a new type
of information based on the intuition that the set of requirements which are
associated with historical changes to a specific class are likely to exhibit
semantic similarity to new requirements which impact that class. This new
Requirements to Requirements Set (R2RS) family of metrics captures the semantic
similarity between a new requirement and the set of existing requirements
previously associated with a class. The aim of this paper is to present and
evaluate the usefulness of R2RS metrics in predicting the set of classes
impacted by a requirement. We consider 18 different R2RS metrics by combining
six natural language processing techniques to measure the semantic similarity
among texts (e.g., VSM) and three distribution scores to compute overall
similarity (e.g., average among similarity scores). We evaluate whether R2RS is
useful for predicting impacted classes, both in combination with and compared
against four other families of metrics that are based upon temporal locality of
changes, direct similarity to code, complexity metrics, and code smells. Our evaluation
features five classifiers and 78 releases belonging to four large open-source
projects, which result in over 700,000 candidate impacted classes. Experimental
results show that leveraging R2RS information increases the accuracy of
predicting impacted classes, in practical terms, by an average of more than 60%
across the various classifiers and projects.
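
To make the R2RS idea concrete, the following is a minimal sketch of one possible metric from the family: a VSM-style similarity (TF-IDF weighting with cosine similarity) between a new requirement and each requirement previously associated with a class, collapsed into one score by a distribution function. It is an illustration under stated assumptions, not the authors' implementation. The tokenizer, the smoothed IDF formula, the function names (`tokenize`, `tfidf_vectors`, `cosine`, `r2rs`), and the sample requirement texts are all hypothetical. The abstract names only the average as a distribution score, so the `"max"` option is an assumed alternative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (a deliberately crude tokenizer)."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumption) so terms found in every document keep a non-zero weight.
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either vector is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def r2rs(new_requirement, historical_requirements, aggregate="avg"):
    """Score a class by the similarity of a new requirement to its historical requirements.

    aggregate="avg" follows the example in the abstract; "max" is an assumed alternative.
    """
    if not historical_requirements:
        return 0.0  # a class with no requirement history carries no R2RS signal
    vectors = tfidf_vectors([new_requirement] + list(historical_requirements))
    sims = [cosine(vectors[0], hist) for hist in vectors[1:]]
    if aggregate == "avg":
        return sum(sims) / len(sims)
    if aggregate == "max":
        return max(sims)
    raise ValueError(f"unknown aggregate: {aggregate!r}")


# Hypothetical requirement histories for two classes.
new_req = "Add password reset by email"
auth_history = ["Allow users to reset password", "Send email on login failure"]
report_history = ["Export monthly report as PDF"]

auth_score = r2rs(new_req, auth_history, aggregate="avg")
report_score = r2rs(new_req, report_history, aggregate="avg")
print(auth_score, report_score)
```

In this sketch, a class whose past requirements share vocabulary with the new requirement receives a higher score than a class with an unrelated history. In the setting the abstract describes, such a score would be one feature among others supplied to a classifier rather than a prediction on its own.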