26,258 research outputs found
Connecting Software Metrics across Versions to Predict Defects
Accurate software defect prediction could help software practitioners
allocate test resources to defect-prone modules effectively and efficiently. In
the last decades, much effort has been devoted to build accurate defect
prediction models, including developing quality defect predictors and modeling
techniques. However, current widely used defect predictors such as code metrics
and process metrics could not well describe how software modules change over
the project evolution, which we believe is important for defect prediction. In
order to deal with this problem, in this paper, we propose to use the
Historical Version Sequence of Metrics (HVSM) in continuous software versions
as defect predictors. Furthermore, we leverage Recurrent Neural Network (RNN),
a popular modeling technique, to take HVSM as the input to build software
prediction models. The experimental results show that, in most cases, the
proposed HVSM-based RNN model has a significantly better effort-aware ranking
effectiveness than the commonly used baseline models
Poster: Bridging effort-Aware prediction and strong classification: A just-in-Time software defect prediction study
Context: Most research into software defect prediction ignores the differing amount of effort entailed in searching for defects between software components. The result is sub-optimal solutions in terms of allocating testing resources. Recently effort-aware (EA) defect prediction has sought to redress this deficiency. However, there is a gap between previous classification research and EA prediction.
Objective: We seek to transfer strong defect classification capability to efficient effort-aware software defect prediction.
Method: We study the relationship between classification performance and the cost-effectiveness curve experimentally (using six open-source software data sets).
Results: We observe extremely skewed distributions of change size which contributes to the lack of relationship between classification performance and the ability to find efficient test orderings for defect detection. Trimming allows all effort-aware approaches bridging high classification capability to efficient effort-aware performance.
Conclusion: Effort distributions dominate effort-aware models. Trimming is a practical method to handle this problem
Revisiting supervised and unsupervised models for effort-aware just-in-time defect prediction
Data available at https://doi.org/10.5281/zenodo.1432582</p
Further thoughts on precision
Background: There has been much discussion amongst automated software defect prediction researchers regarding use of the precision and false positive rate classifier performance metrics. Aim: To demonstrate and explain why failing to report precision when using data with highly imbalanced class distributions may provide an overly optimistic view of classifier performance. Method: Well documented examples of how dependent class distribution affects the suitability of performance measures. Conclusions: When using data where the minority class represents less than around 5 to 10 percent of data points in total, failing to report precision may be a critical mistake. Furthermore, deriving the precision values omitted from studies can reveal valuable insight into true classifier performancePeer reviewedFinal Accepted Versio
Is One Hyperparameter Optimizer Enough?
Hyperparameter tuning is the black art of automatically finding a good
combination of control parameters for a data miner. While widely applied in
empirical Software Engineering, there has not been much discussion on which
hyperparameter tuner is best for software analytics. To address this gap in the
literature, this paper applied a range of hyperparameter optimizers (grid
search, random search, differential evolution, and Bayesian optimization) to
defect prediction problem. Surprisingly, no hyperparameter optimizer was
observed to be `best' and, for one of the two evaluation measures studied here
(F-measure), hyperparameter optimization, in 50\% cases, was no better than
using default configurations.
We conclude that hyperparameter optimization is more nuanced than previously
believed. While such optimization can certainly lead to large improvements in
the performance of classifiers used in software analytics, it remains to be
seen which specific optimizers should be applied to a new dataset.Comment: 7 pages, 2 columns, accepted for SWAN1
Transfer Learning for Improving Model Predictions in Highly Configurable Software
Modern software systems are built to be used in dynamic environments using
configuration capabilities to adapt to changes and external uncertainties. In a
self-adaptation context, we are often interested in reasoning about the
performance of the systems under different configurations. Usually, we learn a
black-box model based on real measurements to predict the performance of the
system given a specific configuration. However, as modern systems become more
complex, there are many configuration parameters that may interact and we end
up learning an exponentially large configuration space. Naturally, this does
not scale when relying on real measurements in the actual changing environment.
We propose a different solution: Instead of taking the measurements from the
real system, we learn the model using samples from other sources, such as
simulators that approximate performance of the real system at low cost. We
define a cost model that transform the traditional view of model learning into
a multi-objective problem that not only takes into account model accuracy but
also measurements effort as well. We evaluate our cost-aware transfer learning
solution using real-world configurable software including (i) a robotic system,
(ii) 3 different stream processing applications, and (iii) a NoSQL database
system. The experimental results demonstrate that our approach can achieve (a)
a high prediction accuracy, as well as (b) a high model reliability.Comment: To be published in the proceedings of the 12th International
Symposium on Software Engineering for Adaptive and Self-Managing Systems
(SEAMS'17
- ā¦