Is "Better Data" Better than "Better Data Miners"? (On the Benefits of Tuning SMOTE for Defect Prediction)
We report and fix an important systematic error in prior studies that ranked classifiers for software analytics. Those studies did not (a) assess classifiers on multiple criteria, and they did not (b) study how variations in the data affect the results. Hence, this paper applies (a) multi-criteria tests while (b) fixing the weaker regions of the training data (using SMOTUNED, a self-tuning version of SMOTE). This approach leads to dramatically large increases in software defect prediction performance. When applied in a 5*5 cross-validation study of 3,681 Java classes (containing over a million lines of code) from open-source systems, SMOTUNED increased AUC and recall by 60% and 20%, respectively. These improvements are independent of the classifier used to predict quality. The same pattern of improvement was observed when SMOTE and SMOTUNED were compared against the most recent class imbalance technique. In conclusion, for software analytics tasks like defect prediction, (1) data pre-processing can be more important than classifier choice, (2) ranking studies are incomplete without such pre-processing, and (3) SMOTUNED is a promising candidate for pre-processing.
Comment: 10 pages + 2 references. Accepted to the International Conference on Software Engineering (ICSE), 201
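To make the tuning idea concrete, here is a minimal sketch of SMOTUNED-style pre-processing, assuming imbalanced-learn's SMOTE and scikit-learn. The smotuned_fit helper, the random search over k_neighbors and sampling_strategy, and the random-forest learner are illustrative stand-ins for the differential-evolution tuner and the classifiers evaluated in the paper.

# Sketch of the SMOTUNED idea: tune SMOTE's parameters on a validation
# split instead of using the defaults. The paper tunes (k, m, r) with
# differential evolution; a simple random search stands in for it here.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def smotuned_fit(X, y, n_trials=30, seed=0):
    rng = np.random.default_rng(seed)
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    best = (None, -np.inf)
    for _ in range(n_trials):
        k = int(rng.integers(1, 21))          # neighbours used by SMOTE
        ratio = float(rng.uniform(0.5, 1.0))  # minority/majority ratio after resampling
        try:
            Xs, ys = SMOTE(k_neighbors=k, sampling_strategy=ratio,
                           random_state=seed).fit_resample(X_tr, y_tr)
        except ValueError:  # e.g. k too large for the minority class, or ratio already met
            continue
        clf = RandomForestClassifier(random_state=seed).fit(Xs, ys)
        auc = roc_auc_score(y_val, clf.predict_proba(X_val)[:, 1])
        if auc > best[1]:
            best = ((k, ratio), auc)
    return best  # best (k, ratio) found and its validation AUC

A defect-prediction pipeline could call smotuned_fit on its training split and then apply the best SMOTE setting before fitting whatever classifier is under study.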
A Statistical Modeling Approach to Computer-Aided Quantification of Dental Biofilm
Biofilm is a formation of microbial material on tooth substrata. Several methods to quantify dental biofilm coverage have recently been reported in the literature, but at best they provide a semi-automated approach that requires significant input from a human grader and therefore carries the grader's bias about what constitutes foreground, background, biofilm, and tooth. Additionally, human assessment indices limit the resolution of the quantification scale; most commercial scales use five levels of quantification for biofilm coverage (0%, 25%, 50%, 75%, and 100%). On the other hand, current state-of-the-art techniques in automatic plaque quantification fail to make their way into practical applications owing to their inability to incorporate human input to handle misclassifications. This paper proposes a new interactive method for biofilm quantification in quantitative light-induced fluorescence (QLF) images of canine teeth that is independent of the perceptual bias of the grader. The method partitions a QLF image into segments of uniform texture and intensity called superpixels; every superpixel is statistically modeled as a realization of a single 2D Gaussian Markov random field (GMRF) whose parameters are estimated; the superpixel is then assigned to one of three classes (background, biofilm, tooth substratum) based on a training set. The quantification results show a high degree of consistency and precision. At the same time, the proposed method gives pathologists full control to post-process the automatic quantification by flipping misclassified superpixels to a different state (background, tooth, biofilm) with a single click, providing greater usability than simply marking the boundaries of biofilm and tooth as done by current state-of-the-art methods.
Comment: 10 pages, 7 figures. Journal of Biomedical and Health Informatics, 2014. Keywords: biomedical imaging; calibration; dentistry; estimation; image segmentation; manuals; teeth.
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6758338&isnumber=636350
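As a rough illustration of the pipeline the abstract describes (superpixels, per-superpixel statistical modeling, three-class assignment, single-click correction), here is a hedged sketch using scikit-image's SLIC and a k-nearest-neighbour classifier. The superpixel_features, classify_superpixels, and flip_superpixel helpers and the simple mean/variance features are assumptions standing in for the paper's 2D GMRF parameter estimation.

# Sketch of the superpixel pipeline: partition the image, summarise each
# superpixel with simple statistics, and assign it to background / biofilm /
# tooth with a classifier trained on labelled superpixels. SLIC plus
# mean/variance features stand in for the paper's GMRF modelling.
import numpy as np
from skimage.segmentation import slic
from sklearn.neighbors import KNeighborsClassifier

def superpixel_features(image, labels):
    """Mean and variance of intensity per superpixel (a crude stand-in
    for the GMRF parameters estimated in the paper)."""
    feats = []
    for sp in np.unique(labels):
        pixels = image[labels == sp]          # all pixels of one superpixel
        feats.append([pixels.mean(axis=0).mean(), pixels.var(axis=0).mean()])
    return np.asarray(feats)

def classify_superpixels(image, train_feats, train_classes, n_segments=400):
    labels = slic(image, n_segments=n_segments, compactness=10)
    feats = superpixel_features(image, labels)
    clf = KNeighborsClassifier(n_neighbors=3).fit(train_feats, train_classes)
    pred = clf.predict(feats)                 # 0=background, 1=biofilm, 2=tooth
    return labels, pred

def flip_superpixel(pred, labels, clicked_sp, new_class):
    """Single-click correction: reassign one misclassified superpixel."""
    pred = pred.copy()
    pred[np.flatnonzero(np.unique(labels) == clicked_sp)[0]] = new_class
    return pred

The flip_superpixel helper mirrors the interactive step: the grader clicks one superpixel and its class label alone is changed, without re-running the segmentation.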
Reliability and validity in comparative studies of software prediction models
Empirical studies on software prediction models do not converge on the question "which prediction model is best?", and the reason for this lack of convergence is poorly understood. In this simulation study, we examine a frequently used research procedure comprising three main ingredients: a single data sample, an accuracy indicator, and cross-validation. Typically, such empirical studies compare a machine learning model with a regression model; in our study, we use simulation to compare a machine learning model and a regression model under controlled conditions. The results suggest that it is the research procedure itself that is unreliable, and this unreliability may strongly contribute to the lack of convergence. Our findings thus cast some doubt on the conclusions of any study of competing software prediction models that used this research procedure as the basis of model comparison. We therefore need to develop more reliable research procedures before we can have confidence in the conclusions of comparative studies of software prediction models.
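A minimal sketch of the kind of simulation described here, assuming scikit-learn models and an illustrative linear data-generating process: repeat the single-sample, cross-validation comparison many times and count how often the ranking of the two models flips. The one_trial helper, the chosen models, and the error indicator are assumptions, not the paper's exact setup.

# Sketch of the simulation idea: generate data from a known process, then
# repeat the "single sample + accuracy indicator + cross-validation"
# procedure and watch how often the regression vs. machine-learner ranking
# changes purely because a different sample was drawn.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def one_trial(seed, n=100, p=5, noise=1.0):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    y = X @ rng.normal(size=p) + noise * rng.normal(size=n)   # linear truth
    # Mean absolute error under 10-fold cross-validation on this one sample.
    mae = lambda m: -cross_val_score(m, X, y, cv=10,
                                     scoring="neg_mean_absolute_error").mean()
    return mae(LinearRegression()) < mae(RandomForestRegressor(random_state=0))

wins = sum(one_trial(s) for s in range(50))
print(f"regression 'won' in {wins}/50 replications of the same procedure")

If the procedure were reliable, the winner would be essentially the same across replications; large variation in the win count is the kind of unreliability the study reports.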
Towards Identifying and closing Gaps in Assurance of autonomous Road vehicleS - a collection of Technical Notes Part 1
This report provides an introduction and overview of the Technical Topic Notes (TTNs) produced in the Towards Identifying and closing Gaps in Assurance of autonomous Road vehicleS (Tigars) project. These notes aim to support the development and evaluation of autonomous vehicles. Part 1 addresses: Assurance (overview and issues), Resilience and Safety Requirements, Open Systems Perspective, and Formal Verification and Static Analysis of ML Systems. Part 2 addresses: Simulation and Dynamic Testing, Defence in Depth and Diversity, Security-Informed Safety Analysis, and Standards and Guidelines.