Is "Better Data" Better than "Better Data Miners"? (On the Benefits of Tuning SMOTE for Defect Prediction)
We report and fix an important systematic error in prior studies that ranked
classifiers for software analytics. Those studies did not (a) assess
classifiers on multiple criteria and they did not (b) study how variations in
the data affect the results. Hence, this paper applies (a) multi-criteria tests
while (b) fixing the weaker regions of the training data (using SMOTUNED, which
is a self-tuning version of SMOTE). This approach leads to dramatically large
improvements in software defect prediction. When applied in a 5*5
cross-validation study for 3,681 JAVA classes (containing over a million lines
of code) from open source systems, SMOTUNED increased AUC and recall by 60% and
20% respectively. These improvements are independent of the classifier used to
predict quality. The same pattern of improvement was observed when SMOTE and
SMOTUNED were compared against the most recent class imbalance technique. In
conclusion, for software analytic tasks like
defect prediction, (1) data pre-processing can be more important than
classifier choice, (2) ranking studies are incomplete without such
pre-processing, and (3) SMOTUNED is a promising candidate for pre-processing.
Comment: 10 pages + 2 references. Accepted to International Conference on
Software Engineering (ICSE), 201
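The abstract above builds on SMOTE, which rebalances training data by synthesizing new minority-class rows between existing ones. A minimal sketch of that oversampling step follows, assuming plain numeric rows. The function names and the parameters `k`, `r`, and `n_new` are illustrative choices of the kind a tuner might search over; they are not taken from the paper, and the tuning loop itself is not shown.

```python
import random


def minkowski(a, b, r=2):
    """Minkowski distance of order r between two equal-length vectors."""
    return sum(abs(x - y) ** r for x, y in zip(a, b)) ** (1.0 / r)


def smote(minority, n_new, k=5, r=2, seed=0):
    """Create n_new synthetic minority rows by interpolating toward near neighbors.

    k (neighbor count) and r (distance order) are hypothetical knobs for
    illustration only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority rows")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the base row, excluding the row itself
        others = [row for j, row in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda row: minkowski(base, row, r))[:k]
        mate = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic
```

Synthetic rows would be added to the training folds only, so that test data stays untouched by the oversampling.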
A DEEP ENSEMBLE LEARNING METHOD FOR EFFORT-AWARE JUST-IN-TIME DEFECT PREDICTION
Nowadays, logistics for the transportation and distribution of merchandise is a key element in increasing the competitiveness of companies. However, when operators choose alternative routes outside the planned routes, logistics companies provide poor-quality service, with units that endanger the proper delivery of merchandise and negatively affect the way the supply chain works. This paper aims to develop a module for processing, analyzing, and deploying satellite information for pattern analysis, finding anomalies in operators' paths by implementing the TODS algorithm in order to support decision making. The experimental results show that the algorithm optimally detects abnormal routes using historical data as a base.
Connecting Software Metrics across Versions to Predict Defects
Accurate software defect prediction could help software practitioners
allocate test resources to defect-prone modules effectively and efficiently. In
the last decades, much effort has been devoted to building accurate defect
prediction models, including developing quality defect predictors and modeling
techniques. However, widely used defect predictors such as code metrics
and process metrics cannot adequately describe how software modules change over
the course of project evolution, which we believe is important for defect prediction. In
order to deal with this problem, in this paper, we propose to use the
Historical Version Sequence of Metrics (HVSM) in continuous software versions
as defect predictors. Furthermore, we leverage Recurrent Neural Network (RNN),
a popular modeling technique, to take HVSM as the input to build software
prediction models. The experimental results show that, in most cases, the
proposed HVSM-based RNN model has a significantly better effort-aware ranking
effectiveness than the commonly used baseline models.
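To make the idea of a per-file metric sequence across versions concrete, here is a minimal sketch of assembling such sequences from per-version snapshots. The data layout, the function name `build_hvsm`, and the zero-padding of short histories to a fixed length are assumptions made for illustration; the paper's own construction may differ.

```python
def build_hvsm(versions, metric_names, length):
    """Return {file: sequence of metric vectors}, oldest version first.

    versions: list of {file: {metric: value}} ordered oldest to newest.
    Files are taken from the newest version. Each sequence keeps the last
    `length` versions in which the file exists and is left-padded with
    zero vectors (an illustrative choice, not the paper's).
    """
    zero = [0.0] * len(metric_names)
    sequences = {}
    for name in versions[-1]:
        seq = [[float(snap[name][m]) for m in metric_names]
               for snap in versions if name in snap]
        seq = seq[-length:]
        padding = [list(zero) for _ in range(length - len(seq))]
        sequences[name] = padding + seq
    return sequences


# Hypothetical snapshots of two files over three releases.
versions = [
    {"A.java": {"loc": 10, "cc": 2}},
    {"A.java": {"loc": 12, "cc": 3}, "B.java": {"loc": 5, "cc": 1}},
    {"A.java": {"loc": 15, "cc": 3}, "B.java": {"loc": 7, "cc": 1}},
]
hvsm = build_hvsm(versions, ["loc", "cc"], length=3)
```

Each resulting sequence is the kind of input a recurrent model consumes one version at a time.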
When Less is More: On the Value of "Co-training" for Semi-Supervised Software Defect Predictors
Labeling a module defective or non-defective is an expensive task. Hence,
there are often limits on how much labeled data is available for training.
Semi-supervised classifiers use far fewer labels for training models. However,
there are numerous semi-supervised methods, including self-labeling,
co-training, maximal-margin, and graph-based methods, to name a few. Only a
handful of these methods have been tested in SE for (e.g.) predicting defects
and even there, those methods have been tested on just a handful of projects.
This paper applies a wide range of 55 semi-supervised learners to over 714
projects. We find that semi-supervised "co-training methods" work significantly
better than other approaches. Specifically, after labeling just 2.5% of the
data, they make predictions that are competitive with those using 100% of the
data.
That said, co-training needs to be used cautiously, since the specific
co-training method needs to be carefully selected based on a user's
specific goals. Also, we warn that a commonly used co-training method
("multi-view", where different learners get different sets of columns) does
not improve predictions (while adding too much to the run time costs: 11 hours
vs. 1.8 hours).
It is an open question, worthy of future work, to test if these reductions
can be seen in other areas of software analytics. To assist with exploring
other areas, all the code used is available at
https://github.com/ai-se/Semi-Supervised.
Comment: 36 pages, 10 figures, 5 table
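As a rough illustration of the general co-training idea, the sketch below lets two different learners label unlabeled rows for each other, starting from a small labeled seed. The two learners (nearest centroid and 1-nearest-neighbor), the margin-based confidence, and all names are assumptions chosen for brevity; this is not one of the 55 learners evaluated in the paper.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _pick(scores):
    """Label with the smallest distance; gap to the runner-up is the confidence."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1])
    margin = ranked[1][1] - ranked[0][1] if len(ranked) > 1 else 0.0
    return ranked[0][0], margin


def centroid_predict(labeled, row):
    """Nearest-centroid learner over (features, label) pairs."""
    groups = {}
    for x, y in labeled:
        groups.setdefault(y, []).append(x)
    scores = {}
    for y, rows in groups.items():
        center = [sum(col) / len(rows) for col in zip(*rows)]
        scores[y] = dist(center, row)
    return _pick(scores)


def nn_predict(labeled, row):
    """1-nearest-neighbor learner over (features, label) pairs."""
    scores = {}
    for x, y in labeled:
        d = dist(x, row)
        scores[y] = min(d, scores.get(y, d))
    return _pick(scores)


def co_train(labeled, unlabeled, rounds=10, per_round=1):
    """Each learner labels its most confident rows and hands them to the other."""
    pool_a, pool_b = list(labeled), list(labeled)
    remaining = list(unlabeled)
    for _ in range(rounds):
        for predict, own, other in ((centroid_predict, pool_a, pool_b),
                                    (nn_predict, pool_b, pool_a)):
            if not remaining:
                return pool_a, pool_b
            scored = sorted(((predict(own, row), row) for row in remaining),
                            key=lambda item: item[0][1], reverse=True)
            for (label, _), row in scored[:per_round]:
                other.append((row, label))
                remaining.remove(row)
    return pool_a, pool_b
```

In this sketch both learners see the same columns; a multi-view variant would instead give each learner a different subset of columns.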
500+ Times Faster Than Deep Learning (A Case Study Exploring Faster Methods for Text Mining StackOverflow)
Deep learning methods are useful for high-dimensional data and are becoming
widely used in many areas of software engineering. Deep learners utilize
extensive computational power and can take a long time to train, making it
difficult to widely validate, repeat, and improve their results. Further,
they are not the best solution in all domains. For example, recent results show
that for finding related Stack Overflow posts, a tuned SVM performs similarly
to a deep learner, but is significantly faster to train. This paper extends
that recent result by clustering the dataset, then tuning learners within
each cluster. This approach is over 500 times faster than deep learning (and
over 900 times faster if we use all the cores on a standard laptop computer).
Significantly, this faster approach generates classifiers nearly as good
(within 2% F1 score) as the much slower deep learning method. Hence we
recommend this faster method since it is much easier to reproduce and utilizes
far fewer CPU resources. More generally, we recommend that before researchers
release research results, they compare their supposedly sophisticated
methods against simpler alternatives (e.g., applying simpler learners to build
local models).
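The "cluster first, then learn locally" idea can be sketched as follows: partition the rows with k-means, keep the labeled rows of each cluster, and answer a query using only the rows in its nearest cluster. This is a minimal illustration under stated assumptions: the local learner here is a small k-nearest-neighbor vote rather than a tuned SVM, `n_neighbors` stands in for whatever would be tuned per cluster, and all names are hypothetical.

```python
import random
from collections import Counter


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, row):
    """Index of the center closest to row."""
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i], row))


def kmeans(rows, k, iters=20, seed=0):
    """Plain Lloyd's algorithm with randomly sampled initial centers."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for row in rows:
            groups[nearest(centers, row)].append(row)
        for i, group in enumerate(groups):
            if group:  # keep the old center if a cluster went empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def fit_local_models(rows, labels, k=2, seed=0):
    """Cluster the rows, then store each cluster's labeled members."""
    centers = kmeans(rows, k, seed=seed)
    members = [[] for _ in range(k)]
    for row, label in zip(rows, labels):
        members[nearest(centers, row)].append((row, label))
    return centers, members


def predict(model, row, n_neighbors=3):
    """Vote among the nearest rows inside the query's own cluster."""
    centers, members = model
    local = members[nearest(centers, row)]
    if not local:  # fall back to every row if the cluster is empty
        local = [pair for group in members for pair in group]
    closest = sorted(local, key=lambda pair: sq_dist(pair[0], row))[:n_neighbors]
    return Counter(label for _, label in closest).most_common(1)[0][0]
```

Because each cluster is fitted independently, the per-cluster work can run in parallel across cores.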
Effort-aware just-in-time defect identification in practice: A case study at Alibaba
National Research Foundation (NRF) Singapore under its AI Singapore Programm
JITO: A tool for just-in-time defect identification and localization
Australian Research Counci