102 research outputs found

    An implementation research on software defect prediction using machine learning techniques

    Software defect prediction aims to improve the software testing process by identifying the parts of the software that are likely to contain defects. It is accomplished by applying supervised machine learning to software metrics and defect data. While the theory behind software defect prediction has been validated in previous studies, it has not been widely implemented in practice. In this thesis, a software defect prediction framework is implemented at RELEX Solutions to improve testing resource allocation and to optimize software release timing. For this purpose, code and change metrics are collected from RELEX software. The metrics are selected based on how frequently they are used in other software defect prediction studies and on their availability in metric collection tools. In addition to metric data, defect data is collected from the issue tracker. A framework for classifying the collected data is then implemented and evaluated. The framework leverages existing machine learning libraries for classification, using classifiers that have been found to perform well in similar software defect prediction experiments. The classification results are validated using commonly used classifier performance metrics, and the suitability of the predictions is also verified from a use-case point of view. Software defect prediction is found to work in practice: the implementation achieves results comparable to similar studies when measured by classifier performance metrics. When validated against the defined use cases, performance is acceptable, although it varies between data sets. It is therefore concluded that while the results are tentatively positive, further monitoring with future software versions is needed to verify the performance and reliability of the framework.
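    The classification step described above can be illustrated with a minimal sketch; the thesis does not publish its code, so the file name, the metric columns, and the choice of a random forest classifier below are assumptions for illustration only.

```python
# Minimal sketch of a defect prediction pipeline (illustrative only; the
# CSV path, label column, and classifier choice are assumptions, not the
# thesis implementation).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, matthews_corrcoef, classification_report

# Code/change metrics per file, plus a binary "defective" label mined
# from the issue tracker (hypothetical file).
data = pd.read_csv("release_metrics.csv")
X = data.drop(columns=["defective"])
y = data["defective"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

clf = RandomForestClassifier(n_estimators=300, random_state=42)
clf.fit(X_train, y_train)

proba = clf.predict_proba(X_test)[:, 1]
pred = clf.predict(X_test)

# Commonly used classifier performance metrics for defect prediction.
print("AUC:", roc_auc_score(y_test, proba))
print("MCC:", matthews_corrcoef(y_test, pred))
print(classification_report(y_test, pred))
```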

    The impact of parameter optimization of ensemble learning on defect prediction

    Machine learning algorithms have configurable parameters that practitioners generally leave at their default settings. Tuning the parameters of a machine learning algorithm, known as hyperparameter optimization (HO), is performed to find the most suitable parameter setting for classification experiments. Prior studies propose using either the default classification model or an optimal parameter configuration. This work investigates the effects of applying HO to ensemble learning algorithms in terms of defect prediction performance. Further, this paper presents a new ensemble learning algorithm, called novelEnsemble, for defect prediction data sets. The method has been tested on 27 data sets and compared with three alternatives. Welch's Heteroscedastic F Test is used to examine the differences between performance measures, and Cliff's Delta is applied to the results of the compared algorithms to quantify the magnitude of those differences. According to the experimental results: 1) ensemble methods with HO perform better than a single predictor; 2) although the error of triTraining decreases linearly, it remains at an unacceptable level; 3) novelEnsemble yields promising results, especially in terms of area under the curve (AUC) and Matthews Correlation Coefficient (MCC); 4) the effect of HO is not constant across data sets of different sizes; 5) not every ensemble learning approach has a favorable effect on HO. To demonstrate the importance of the hyperparameter selection process, the experiment is validated with suitable statistical analyses. The study reveals that, contrary to expectations, the success of HO depends not on the type of the classifiers but on the design of the ensemble learners.
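    As a rough illustration of the kind of HO experiment described (not the paper's novelEnsemble or its 27 data sets), the sketch below tunes a gradient boosting ensemble with grid search and compares it against the default configuration using AUC and MCC; the synthetic data, parameter grid, and split are assumptions.

```python
# Sketch of hyperparameter optimization (HO) applied to an ensemble learner,
# compared against its default configuration. Illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import roc_auc_score, matthews_corrcoef

# Synthetic, imbalanced stand-in for a defect prediction data set.
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

default_model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

param_grid = {"n_estimators": [100, 300],
              "learning_rate": [0.05, 0.1],
              "max_depth": [2, 3, 4]}
search = GridSearchCV(GradientBoostingClassifier(random_state=0),
                      param_grid, scoring="roc_auc", cv=5)
search.fit(X_tr, y_tr)

for name, model in [("default", default_model), ("HO", search.best_estimator_)]:
    proba = model.predict_proba(X_te)[:, 1]
    pred = model.predict(X_te)
    print(name, "AUC:", roc_auc_score(y_te, proba),
          "MCC:", matthews_corrcoef(y_te, pred))
```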

    Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack

    Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed that exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in the detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling that addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Reconfigurable Slack (FaDReS). Here, an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime input. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme, which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, the use of adaptive techniques is seen to provide several promising avenues for improving resilience. The scheme developed is demonstrated by the hardware design of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, a Motion Estimation (ME) engine, a Finite Impulse Response (FIR) filter, a Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks, in addition to MCNC benchmark circuits. A significant reduction in power consumption is achieved, ranging from 83% for low motion-activity scenes to 12.5% for high motion-activity video scenes in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32 dB. The diagnosability, reconfiguration latency, and resource overhead of each approach are analyzed. Compared to previous alternatives, PURE maintains a PSNR within 4.02 dB to 6.67 dB of the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria.
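    As a purely conceptual illustration of a deterministic, health-metric-driven fault isolation flow with a bounded number of reconfigurations (this is not the FaDReS or PURE implementation, and the helper health_ok is hypothetical), consider the following sketch.

```python
# Conceptual illustration only: divide-and-conquer isolation of a faulty
# processing element (PE) using a runtime health metric, with a number of
# probes bounded by ceil(log2(number of PEs)).
from typing import Callable, List

def isolate_faulty_pe(pe_ids: List[int],
                      health_ok: Callable[[List[int]], bool]) -> int:
    """Return the id of the single faulty PE among pe_ids.

    health_ok(active) stands in for reconfiguring the design so that only
    the `active` PEs carry the computation, then observing whether the
    health metric stays within tolerance.
    """
    candidates = pe_ids
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Probe one half; if the health metric degrades, the fault is there.
        candidates = half if not health_ok(half) else candidates[len(half):]
    return candidates[0]

# Toy usage: PE 5 is silently faulty; the probe reports degraded health
# whenever PE 5 is in the active set.
faulty = 5
print(isolate_faulty_pe(list(range(8)), lambda active: faulty not in active))
```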

    Improving Defect Prediction Models by Combining Classifiers Predicting Different Defects

    Background: The software industry spends a lot of money on finding and fixing defects. It utilises software defect prediction models to identify code that is likely to be defective. Prediction models have, however, reached a performance bottleneck; any improvement to them would likely lead to fewer defects, reducing costs for companies. Aim: In this dissertation I demonstrate that different families of classifiers find distinct subsets of defects, and I show how this finding can be used to design ensemble models which outperform other state-of-the-art software defect prediction models. Method: This dissertation is supported by published work. In the first paper I explore the quality of data, which is a prerequisite for building reliable software defect prediction models. The second and third papers explore the ability of different software defect prediction models to find distinct subsets of defects. The fourth paper explores how software defect prediction models can be improved by combining classifiers that predict different defective components into ensembles. An additional, unpublished piece of work presents a visual technique for analysing the predictions made by individual classifiers and discusses possible constraints for classifiers used in software defect prediction. Result: Software defect prediction models created by classifiers of different families predict distinct subsets of defects. Ensembles composed of classifiers belonging to different families outperform other ensemble and standalone models. Only a few highly diverse and accurate base models are needed to compose an effective ensemble, and such an ensemble consistently predicts a greater number of additional defects than the increase in incorrect predictions it introduces. Conclusion: Ensembles in software defect prediction should not combine classifier decisions by majority voting, as this discards the correct predictions of classifiers that uniquely identify defects. Some classifiers may be less successful for software defect prediction because of the complex decision boundaries of defect data. Stacking-based ensembles can outperform other ensemble and standalone techniques. I propose new avenues of research that could further improve the modelling of ensembles in software defect prediction. Data quality should be explicitly considered prior to experiments for researchers to establish reliable results.
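    A minimal sketch of the central idea, that stacking diverse classifier families can outperform majority voting, is shown below; the synthetic data set, base learners, and meta-learner are illustrative assumptions rather than the dissertation's configuration.

```python
# Sketch: a stacking ensemble of classifiers from different families versus
# hard majority voting on an imbalanced, defect-like data set. Illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import matthews_corrcoef

X, y = make_classification(n_samples=1500, n_features=25,
                           weights=[0.85, 0.15], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

# A few diverse base models drawn from different classifier families.
base = [("rf", RandomForestClassifier(n_estimators=200, random_state=1)),
        ("nb", GaussianNB()),
        ("knn", KNeighborsClassifier())]

voting = VotingClassifier(estimators=base, voting="hard").fit(X_tr, y_tr)
stacking = StackingClassifier(
    estimators=base,
    final_estimator=LogisticRegression(max_iter=1000)).fit(X_tr, y_tr)

for name, model in [("majority voting", voting), ("stacking", stacking)]:
    print(name, "MCC:", matthews_corrcoef(y_te, model.predict(X_te)))
```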

    Integrative multi-omic network strategies for unraveling complex disease biology and the identification of novel phenotype associated genes

    Identifying the genetic risk factors underlying a given disease is an essential step for informing effective drug targets, understanding disease architecture, and predicting at-risk individuals. A commonly applied approach for identifying novel disease-associated genes is the Genome Wide Association Study (GWAS), in which a large number of individuals are sequenced and genetic variants are then tested for an association with disease status. While the GWAS approach has identified countless disease-associated genes, there remain many diseases for which our genetic understanding is still incomplete. One strategy for augmenting the GWAS approach is to incorporate additional omics data in order to prioritize biologically plausible candidate genes. In this thesis work, we integrate network-based strategies with existing genetic analysis pipelines in order to identify novel Alzheimer’s disease (AD) genes. Two types of biological data inform the underlying structure of the networks: a) protein-protein interactions and b) gene expression in the human brain. Genes which interact or are co-expressed across similar conditions have been shown to have a higher probability of being functionally related. Using a set of previously known AD genes, we apply a network propagation strategy to score genes based on their proximity to the known AD genes within these networks. We then integrate the network score of each gene with its risk score from GWAS to identify novel candidates. To affirm the reproducibility of the findings, we further incorporate additional information in the form of knockout models in flies, bootstrap aggregation, and external genetic datasets. In addition to predicting novel genes, we use regional co-expression networks to further understand how the known AD genes behave within the various subdivisions of the brain. We find that the regions of the brain known to have the earliest vulnerability to AD-induced neurodegeneration also tend to be where AD genes are most highly correlated.
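    A toy sketch of the propagation-plus-GWAS idea is given below; it uses personalized PageRank as a stand-in for the propagation method, and the gene names, edges, and scores are invented for illustration.

```python
# Illustrative sketch: propagate from seed genes over a toy network, then
# combine the network score with a (hypothetical) GWAS score per gene.
import networkx as nx

# Toy protein-protein interaction / co-expression network.
G = nx.Graph([("APOE", "APP"), ("APP", "PSEN1"), ("PSEN1", "GENE_X"),
              ("GENE_X", "GENE_Y"), ("APOE", "GENE_Z")])

seeds = {"APOE", "APP", "PSEN1"}  # previously known AD genes (examples)
personalization = {g: (1.0 if g in seeds else 0.0) for g in G}

# Personalized PageRank as a simple propagation stand-in; restart mass
# stays concentrated near the seed set.
network_score = nx.pagerank(G, alpha=0.5, personalization=personalization)

# Hypothetical per-gene GWAS association scores (already scaled to [0, 1]).
gwas_score = {"APOE": 0.9, "APP": 0.7, "PSEN1": 0.8,
              "GENE_X": 0.4, "GENE_Y": 0.1, "GENE_Z": 0.3}

# Combine the two evidence sources and rank the non-seed genes.
combined = {g: network_score[g] * gwas_score.get(g, 0.0) for g in G}
candidates = sorted((g for g in G if g not in seeds),
                    key=combined.get, reverse=True)
print(candidates)
```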

    A NOVEL APPROACH FOR FAULT DETECTION IN THE AIRCRAFT EXTERIOR BODY USING IMAGE PROCESSING

    The primary objective of this thesis is to develop innovative techniques for the inspection and maintenance of aircraft structures. We aim to streamline the process by using images to detect potential defects in the aircraft body, comparing them to images of properly functioning sections of the aircraft to determine whether a given section is faulty. We achieve this by employing image processing to train a model capable of identifying faulty images. The methodology uses images of both defective and operational parts of the aircraft's exterior, which undergo a preprocessing phase that preserves valuable detail. During training, a new image of the same section of the aircraft is used to validate the model; after processing, the algorithm labels the image as faulty or normal. To facilitate our study, we rely on the Convolutional Neural Network (CNN) approach, which extracts distinguishing features from each patch produced by the frame segmentation of a CNN kernel. Furthermore, we process the images with various filters using the image processing toolbox available in Python. In our initial trials, we observed that the CNN model suffered from overfitting on the faulty class. To address this, we applied image augmentation, expanding a small dataset of 87 images into an augmented dataset of 4,000 images. After passing the data through multiple convolutional layers and training for multiple epochs, our proposed model achieved a training accuracy of 98.28%. In addition, we designed a GUI-based interface that allows users to input an image and view the result as faulty or normal. Finally, we propose that applying this research in the field of robotics would be an ideal direction for future work.
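    A minimal TensorFlow/Keras sketch of the kind of pipeline described, augmentation followed by a small CNN that labels patches as faulty or normal, is shown below; the directory layout, image size, and hyperparameters are assumptions, not the thesis configuration.

```python
# Sketch of an augmented binary CNN classifier for aircraft-skin patches.
# Paths, image size, and model depth are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SIZE = (128, 128)

# Hypothetical directories with faulty/ and normal/ subfolders.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "aircraft_patches/train", image_size=IMG_SIZE, batch_size=32,
    label_mode="binary")
val_ds = tf.keras.utils.image_dataset_from_directory(
    "aircraft_patches/val", image_size=IMG_SIZE, batch_size=32,
    label_mode="binary")

# On-the-fly augmentation stands in for offline expansion of a small data set.
augment = models.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

model = models.Sequential([
    layers.Input(shape=IMG_SIZE + (3,)),
    augment,
    layers.Rescaling(1.0 / 255),
    layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # faulty vs. normal
])

model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=20)
```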