A Hybrid Multi-Filter Wrapper Feature Selection Method for Software Defect Predictors
Software Defect Prediction (SDP) is an approach used for identifying defect-prone software modules or components. It helps software engineers optimally allocate limited resources to defective software modules or components in the testing or maintenance phases of the software development life cycle (SDLC). Nonetheless, the predictive performance of SDP models depends largely on the quality of the dataset used to train them. The high dimensionality of software metric features has been noted as a data quality problem that negatively affects the predictive performance of SDP models. Feature Selection (FS) is a well-known method for addressing the high dimensionality problem and can be divided into filter-based and wrapper-based methods. Filter-based FS has low computational cost, but the predictive performance of a classification algorithm trained on the filtered data cannot be guaranteed. In contrast, wrapper-based FS has good predictive performance but comes with high computational cost and a lack of generalizability. Therefore, this study proposes a hybrid multi-filter wrapper method for selecting relevant and irredundant features in software defect prediction. The proposed hybrid feature selection method will be developed to take advantage of filter-filter and filter-wrapper relationships to give optimal feature subsets, reduce its evaluation cycle and subsequently improve the overall predictive performance of SDP models in terms of Accuracy, Precision and Recall values.
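The filter-then-wrapper idea described in this abstract can be illustrated with a minimal sketch. The abstract does not name the filters, the rank-aggregation rule, or the wrapper classifier, so everything below is an assumption chosen for brevity: two simple filter scores (`pearson_abs`, `mean_gap`), average-rank aggregation, and a greedy forward wrapper scored by leave-one-out accuracy of a nearest-centroid classifier. All function names and the toy data are hypothetical and are not taken from the paper.

```python
import math

def pearson_abs(xs, ys):
    """Filter 1 (assumed): absolute Pearson correlation of a feature with the label."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return abs(cov / (sx * sy)) if sx and sy else 0.0

def mean_gap(xs, ys):
    """Filter 2 (assumed): gap between class means, scaled by the feature's range."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    spread = max(xs) - min(xs)
    if not pos or not neg or spread == 0:
        return 0.0
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg)) / spread

def multi_filter_rank(X, y, filters):
    """Sum each feature's rank over all filters; return indices, best first."""
    n_feat = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_feat)]
    total = [0] * n_feat
    for f in filters:
        scores = [f(col, y) for col in cols]
        order = sorted(range(n_feat), key=lambda j: -scores[j])
        for rank, j in enumerate(order):
            total[j] += rank
    return sorted(range(n_feat), key=lambda j: total[j])

def centroid_accuracy(X, y, feats):
    """Wrapper criterion: leave-one-out accuracy of a nearest-centroid classifier."""
    hits = 0
    for i in range(len(X)):
        cents = {}
        for label in (0, 1):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            cents[label] = [sum(r[j] for r in rows) / len(rows) for j in feats]
        point = [X[i][j] for j in feats]
        pred = min(cents, key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(point, cents[c])))
        hits += pred == y[i]
    return hits / len(X)

def hybrid_select(X, y, keep=3):
    """Filters prune to `keep` candidates; the wrapper then adds features greedily."""
    candidates = multi_filter_rank(X, y, [pearson_abs, mean_gap])[:keep]
    chosen, best = [], 0.0
    for j in candidates:
        acc = centroid_accuracy(X, y, chosen + [j])
        if acc > best:  # keep a feature only if it improves the wrapper score
            chosen, best = chosen + [j], acc
    return chosen, best

# Toy data: features 0 and 3 track the label, 1 is noise, 2 is constant.
X = [[1.0, 5.0, 2.0, 0.9], [1.2, 1.0, 2.0, 1.1],
     [0.9, 4.0, 2.0, 1.0], [1.1, 2.0, 2.0, 1.2],
     [3.0, 5.0, 2.0, 3.1], [3.2, 1.0, 2.0, 2.9],
     [2.9, 4.0, 2.0, 3.0], [3.1, 2.0, 2.0, 3.2]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
chosen, best = hybrid_select(X, y, keep=2)
print(chosen, best)
```

Because the filter stage discards the noisy and constant features before any classifier is trained, the wrapper only evaluates the surviving candidates, which is one plausible reading of how a hybrid method could shorten the evaluation cycle. On the toy data the wrapper also drops the second informative feature as redundant, since it does not improve the score.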
DBBRBF- Convalesce optimization for software defect prediction problem using hybrid distribution base balance instance selection and radial basis Function classifier
Software is becoming an integral part of human life, and with the rapid
development of software engineering, software is expected to be highly reliable.
Reliability can be checked through efficient software testing methods that use
historical software prediction data to support the development of a quality
software system. Machine learning plays a vital role in optimizing the
prediction of defect-prone modules in real-life software. Software defect
prediction data suffers from a class imbalance problem, with a low ratio of
defective to non-defective instances, which degrades classification
performance unless an efficient machine learning classification technique is
used. To alleviate this problem, this paper introduces a novel hybrid
instance-based classification approach that combines distribution base balance
based instance selection with a radial basis function neural network classifier
(DBBRBF) to obtain better prediction than existing research.
Class imbalanced data sets of NASA, Promise and Softlab were used for the
experimental analysis. The experimental results in terms of Accuracy,
F-measure, AUC, Recall, Precision, and Balance show the effectiveness of the
proposed approach. Finally, statistical significance tests are carried out to
understand the suitability of the proposed model.
Comment: 32 pages, 24 Tables, 8 Figures
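To make the evaluation measures listed in this abstract concrete, here is a small sketch that computes Accuracy, Precision, Recall, F-measure and Balance from a confusion matrix. The abstract does not define Balance; the code assumes the definition commonly used in the defect prediction literature, the normalized distance from the ideal point (pf = 0, pd = 1), which may differ from the paper's. AUC is omitted because it needs ranked scores rather than counts. The function name and the example counts are hypothetical.

```python
import math

def defect_metrics(tp, fp, tn, fn):
    """Confusion-matrix measures; the defective class is the positive class."""
    total = tp + fp + tn + fn
    recall = tp / (tp + fn) if tp + fn else 0.0      # probability of detection (pd)
    precision = tp / (tp + fp) if tp + fp else 0.0
    pf = fp / (fp + tn) if fp + tn else 0.0          # probability of false alarm
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    # Assumed definition: 1 minus the normalized distance to (pf=0, pd=1).
    balance = 1 - math.sqrt(pf ** 2 + (1 - recall) ** 2) / math.sqrt(2)
    return {"accuracy": (tp + tn) / total, "precision": precision,
            "recall": recall, "f_measure": f_measure, "balance": balance}

# 100 modules, 10 of them defective (hypothetical counts).
useful = defect_metrics(tp=5, fp=5, tn=85, fn=5)
trivial = defect_metrics(tp=0, fp=0, tn=90, fn=10)  # predicts "clean" for everything
print(useful)
print(trivial)
```

Both classifiers reach 0.9 accuracy on this imbalanced example, yet the trivial one has zero recall and a much lower Balance, which illustrates why imbalance-aware measures are reported alongside Accuracy.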
Cross-Lingual Adaptation for Type Inference
Deep learning-based techniques have been widely applied to program analysis
tasks in fields such as type inference, fault localization, and code
summarization. Hitherto, deep learning-based software engineering systems have
relied heavily on supervised learning approaches, which require laborious manual
effort to collect and label a prohibitively large amount of data. However, most
Turing-complete imperative languages share similar control- and data-flow
structures, which make it possible to transfer knowledge learned from one
language to another. In this paper, we propose cross-lingual adaptation of
program analysis, which allows us to leverage prior knowledge learned from the
labeled dataset of one language and transfer it to the others. Specifically, we
implemented a cross-lingual adaptation framework, PLATO, to transfer a deep
learning-based type inference procedure across weakly typed languages, e.g.,
Python to JavaScript and vice versa. PLATO incorporates a novel joint graph
kernelized attention based on abstract syntax tree and control flow graph, and
applies anchor word augmentation across different languages. Besides, by
leveraging data from strongly typed languages, PLATO improves the perplexity of
the backbone cross-programming-language model and the performance of downstream
cross-lingual transfer for type inference. Experimental results illustrate that
our framework improves transferability over the baseline method by a large
margin.