11 research outputs found

    Identification, Analysis & Empirical Validation (IAV) of Object Oriented Design (OO) Metrics as Quality Indicators

    Metrics and measures are closely related. A measure quantifies the amount, dimension, capacity or size of some attribute of a product, while a metric is the unit used to measure that attribute. Software quality is a major concern that needs to be addressed and measured, and object-oriented (OO) systems require effective metrics to assess it. The paper identifies attributes and measures that help determine and affect quality attributes. It conducts an empirical study on the public KC1 dataset from the NASA project database, validated with statistical techniques such as correlation analysis and regression analysis. The analysis finds that the metrics SLOC, RFC, WMC and CBO are significant and can be treated as quality indicators, while DIT and NOC are not significant. These results can have a significant impact on improving software quality.
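    The correlation analysis mentioned in this abstract can be illustrated with a small sketch. It computes the Pearson correlation coefficient between one metric and a defect count per class. The function name `pearson_r` and the sample values are assumptions made for illustration; they are not taken from the paper or from the KC1 dataset.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and the two sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Hypothetical per-class values (NOT from KC1): a coupling metric and defect counts.
cbo = [2, 5, 8, 11, 14]
defects = [0, 1, 1, 3, 4]
r = pearson_r(cbo, defects)
```

    A value of `r` close to 1 or -1 suggests a strong linear relationship between the metric and the defect count; a significance test would still be needed before treating the metric as a quality indicator.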


    Software quality has become an important part of the development process. As software grows more complex and customer expectations rise, development costs are also increasing, so efficiency is needed to keep them down. One way to achieve this is software defect prediction, which identifies the software projects that need more intensive checking. Software test teams can then allocate time and budget more effectively based on the output of the model. The method in this research applies preprocessing that optimizes attribute weights using PSO, a population-based search algorithm initialized with a population of random solutions called particles. On the NASA MDP CM1 dataset, preprocessing with PSO attribute weighting raised accuracy from 85.54% to 86.37% and AUC from 0.762 to 0.827. Keywords: software defect prediction, linear regression, feature selection, optimize weight
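    The PSO weight search described in this abstract can be sketched as follows. This is a minimal textbook particle swarm, not the authors' implementation: the names `pso_minimize` and `toy_error`, the inertia and acceleration constants, and the stand-in objective are all assumptions. In the paper's setting the objective would be the error of a classifier trained on the weighted attributes.

```python
import random
from typing import Callable, List, Tuple


def pso_minimize(objective: Callable[[List[float]], float], dim: int,
                 particles: int = 20, iterations: int = 100,
                 inertia: float = 0.7, c1: float = 1.5, c2: float = 1.5,
                 seed: int = 0) -> Tuple[List[float], float, List[float]]:
    """Search for weights in [0, 1]^dim that minimize `objective`."""
    rng = random.Random(seed)
    # Random initial population: each particle is one candidate weight vector.
    pos = [[rng.random() for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [objective(p) for p in pos]
    g = min(range(particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]
    history = [gbest_score]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep every attribute weight inside [0, 1].
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            score = objective(pos[i])
            if score < pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score < gbest_score:
                    gbest, gbest_score = pos[i][:], score
        history.append(gbest_score)
    return gbest, gbest_score, history


# Hypothetical stand-in objective: squared distance to a made-up weight vector.
TARGET = [0.2, 0.8, 0.5]


def toy_error(weights: List[float]) -> float:
    return sum((w - t) ** 2 for w, t in zip(weights, TARGET))


weights, score, history = pso_minimize(toy_error, dim=3)
```

    The returned `history` records the best score after each iteration, which makes it easy to check that the search never gets worse.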

    Toward Non-security Failures as a Predictor of Security Faults and Failures

    In the search for metrics that can predict the presence of vulnerabilities early in the software life cycle, there may be some benefit to choosing metrics from the non-security realm. We analyzed non-security and security failure data reported for the year 2007 of a Cisco software system. We used non-security failure reports as input variables into a classification and regression tree (CART) model to determine the probability that a component will have at least one vulnerability. Using CART, we ranked all of the system components in descending order of their probabilities and found that 57% of the vulnerable components were in the top nine percent of the total component ranking, but with a 48% false positive rate. The results indicate that non-security failures can be used as one of the input variables for security-related prediction models.
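    The ranking step in this abstract can be sketched as below: components are sorted by predicted probability, and the share of vulnerable components that land in the top slice is measured. The function name `top_fraction_capture` and the sample data are assumptions. The second returned value is the share of flagged components that are not vulnerable, which is only one possible reading of "false positive rate"; the abstract does not state which definition the authors used.

```python
from typing import List, Tuple


def top_fraction_capture(probs: List[float], vulnerable: List[bool],
                         fraction: float) -> Tuple[float, float]:
    """Rank components by probability and inspect the top `fraction` of them.

    Returns (share of all vulnerable components found in the top slice,
             share of the top slice that is not vulnerable).
    """
    ranked = sorted(zip(probs, vulnerable), key=lambda pv: pv[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))  # size of the top slice
    top = ranked[:k]
    hits = sum(1 for _, is_vulnerable in top if is_vulnerable)
    total_vulnerable = sum(vulnerable)
    capture = hits / total_vulnerable if total_vulnerable else 0.0
    not_vulnerable_share = (k - hits) / k
    return capture, not_vulnerable_share


# Made-up probabilities and labels for five components (not the Cisco data).
probs = [0.9, 0.8, 0.1, 0.2, 0.7]
vulnerable = [True, False, False, False, True]
capture, not_vulnerable_share = top_fraction_capture(probs, vulnerable, 0.4)
```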

    Adaptive Genetic Algorithm Based Artificial Neural Network for Software Defect Prediction

    To enable efficient software defect prediction, this paper develops an evolutionary-computing-based neural network learning scheme that alleviates limitations of existing Artificial Neural Networks (ANN), such as local minima and convergence issues. Specifically, it develops an Adaptive Genetic Algorithm (A-GA) based scheme for ANN learning and weight estimation. Unlike a conventional GA, the scheme uses adaptive crossover and mutation probabilities, which alleviates the disruption of the search towards the optimal solution. Object-oriented software metrics (the CK metrics) are used for fault prediction, and the proposed Evolutionary Computing Based Hybrid Neural Network (HENN) algorithm is evaluated in terms of accuracy, precision, recall, F-measure, completeness and other measures, where it performs better than major existing schemes. The proposed scheme exhibited 97.99% prediction accuracy while ensuring optimal precision, F-measure and recall.
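    The abstract does not say how the crossover and mutation probabilities are adapted. As an illustration only, the sketch below uses the well-known fitness-based rule of Srinivas and Patnaik (1994), in which above-average individuals receive lower probabilities so that good solutions are disrupted less. The function name `adaptive_rates` and the constants `k1` to `k4` are assumptions, and the paper's A-GA may differ.

```python
from typing import Tuple


def adaptive_rates(f_parent: float, f_individual: float,
                   f_max: float, f_avg: float,
                   k1: float = 1.0, k2: float = 0.5,
                   k3: float = 1.0, k4: float = 0.5) -> Tuple[float, float]:
    """Return (crossover probability, mutation probability) for a maximization GA.

    f_parent     -- fitness of the fitter of the two parents to be crossed
    f_individual -- fitness of the individual to be mutated
    f_max, f_avg -- best and average fitness of the current population
    """
    spread = f_max - f_avg
    if spread <= 0:
        # Population has converged to one fitness value: fall back to fixed rates.
        return k3, k4
    # Above-average solutions get rates that shrink to 0 as they approach f_max.
    pc = k1 * (f_max - f_parent) / spread if f_parent >= f_avg else k3
    pm = k2 * (f_max - f_individual) / spread if f_individual >= f_avg else k4
    return pc, pm


# The best individual is protected, a below-average one is fully exposed.
best_rates = adaptive_rates(10.0, 10.0, f_max=10.0, f_avg=5.0)
weak_rates = adaptive_rates(3.0, 3.0, f_max=10.0, f_avg=5.0)
```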

    Evolutionary Computing based an Efficient and Cost Effective Software Defect Prediction System

    Early defect prediction and fault removal can play a vital role in ensuring software reliability and quality of service. In this paper, a Hybrid Evolutionary computing based Neural Network (HENN) software defect prediction model is developed. For HENN, an adaptive genetic algorithm (A-GA) is developed that alleviates key existing limitations such as local minima and convergence. Furthermore, A-GA enables adaptive crossover and mutation probability selection, which strengthens the computational efficiency of the proposed system. The proposed HENN algorithm is used for adaptive weight estimation and learning optimization in the ANN for defect prediction. In addition, a novel defect prediction and fault removal cost estimation model is derived to evaluate the cost effectiveness of the proposed system. The simulation results obtained for the PROMISE and NASA MDP datasets show that the proposed model outperforms the Levenberg-Marquardt based ANN system (LM-ANN) and other systems. The cost analysis also shows that the proposed HENN model is approximately 21.66% more cost effective than LM-ANN.

    Sampling imbalance dataset for software defect prediction using hybrid neuro-fuzzy systems with Naive Bayes classifier

    Software defect prediction (SDP) is a difficult task in software projects. The SDP process is useful for identifying and locating defects in modules. The task tends to become more costly as the size of the project modules increases and complex testing and evaluation mechanisms are added. Measuring software in a consistent and disciplined manner offers several advantages, such as accurate estimation of project costs and schedules and improved product and process quality. Detailed analysis of software metric data also gives significant clues about the locations of possible defects in program code. The main goal of this work is to introduce software defect detection and prevention methods that identify defects using machine learning approaches. The work uses imbalanced datasets from NASA's Metrics Data Program (MDP), and the software metrics of the datasets are selected using a Genetic algorithm with Ant Colony Optimization (GACO) method. A sampling process based on the semi-supervised Modified Co Forest method generates balanced labelled datasets from the imbalanced ones, which are then used for efficient software defect detection with machine learning Hybrid Neuro-Fuzzy Systems with Naive Bayes methods. The experimental results show that the proposed method is more efficient and performs better in defect prediction than the other available methods.

    A Fault-Based Model of Fault Localization Techniques

    Every day, ordinary people depend on software working properly. We take it for granted; from banking software, to railroad switching software, to flight control software, to software that controls medical devices such as pacemakers or even gas pumps, our lives are touched by software that we expect to work. It is well known that the main technique/activity used to ensure the quality of software is testing. Often it is the only quality assurance activity undertaken, making it that much more important. In a typical experiment studying fault localization techniques, a researcher will intentionally seed a fault (intentionally breaking the functionality of some source code) in the hope that the automated techniques under study will be able to identify the fault's location in the source code. These faults are picked arbitrarily; there is potential for bias in the selection of the faults. Previous researchers have established an ontology for understanding or expressing this bias called fault size. This research captures the fault size ontology in the form of a probabilistic model. The results of applying this model to measure fault size suggest that many faults generated through program mutation (the systematic replacement of source code operators to create faults) are very large and easily found. Secondary measures generated in the assessment of the model suggest a new static analysis method, called testability, for predicting the likelihood that code will contain a fault in the future. While software testing researchers are not statisticians, they nonetheless make extensive use of statistics in their experiments to assess fault localization techniques. Researchers often select their statistical techniques without justification. This is a very worrisome situation because it can lead to incorrect conclusions about the significance of research.
    This research introduces an algorithm, MeansTest, which helps automate some aspects of the selection of appropriate statistical techniques. The results of an evaluation of MeansTest suggest that MeansTest performs well relative to its peers. This research then surveys recent work in software testing using MeansTest to evaluate the significance of researchers' work. The results of the survey indicate that software testing researchers are underreporting the significance of their work.

    Semi-supervised and Active Learning Models for Software Fault Prediction

    As software continues to insinuate itself into nearly every aspect of our lives, software quality has become an extremely important issue. Software Quality Assurance (SQA) is a process that ensures the development of high-quality software. It concerns the important problem of maintaining, monitoring, and developing quality software. Accurate detection of fault-prone components in software projects is one of the most commonly practiced techniques that offer a path to high-quality products without excessive assurance expenditures. This type of quality modeling requires the availability of software modules with known fault content developed in a similar environment. However, collecting fault data at module level, particularly in new projects, is expensive and time-consuming. Semi-supervised learning and active learning offer solutions to this problem by learning from limited labeled data while utilizing inexpensive unlabeled data. In this dissertation, we investigate semi-supervised learning and active learning approaches to the software fault prediction problem. The role of the base learner in semi-supervised learning is discussed using several state-of-the-art supervised learners. Our results showed that semi-supervised learning with an appropriate base learner leads to better performance in fault proneness prediction compared to supervised learning. In addition, incorporating a pre-processing technique prior to semi-supervised learning provides a promising direction for further improving prediction performance. Active learning, which shares with semi-supervised learning the idea of utilizing unlabeled data, requires human effort for labeling fault proneness in its learning process. Empirical results showed that active learning supplemented by a dimensionality reduction technique performs better than supervised learning on release-based data sets.