18 research outputs found

    Some Approaches for Software Defect Prediction

    The main aim of this thesis is to give a general overview of the processes within software defect prediction models that use machine learning classifiers, and to analyse some of the results of the evaluation experiments conducted in the research papers covered in this work. Additionally, the algorithms used within these software defect prediction models are briefly explained, and some of the evaluation measures used to assess their prediction accuracy are listed and explained. A general overview of the processes within a handful of specific software defect prediction models is also provided.
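
As an illustration of the kind of evaluation measures such a thesis would list, here is a minimal sketch that derives four common ones from a binary confusion matrix. The abstract does not name the measures it covers, so the choice of accuracy, precision, recall and F1, as well as the names `ConfusionCounts` and `evaluation_measures`, are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Outcome counts for a binary defect classifier (hypothetical helper type)."""
    tp: int  # defective modules predicted defective
    fp: int  # clean modules predicted defective
    tn: int  # clean modules predicted clean
    fn: int  # defective modules predicted clean


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluation_measures(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall and F1 for the given counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    print(evaluation_measures(ConfusionCounts(tp=30, fp=10, tn=50, fn=10)))
```

Defect datasets are usually imbalanced (few defective modules), which is why recall and F1 tend to be reported alongside plain accuracy.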

    Software Repository Mining Analytics to Estimate Software Component Reliability

    Finding and fixing software bugs is expensive and has a significant impact on software development effort. Repositories hold hidden predictive information about software history that can be explored using analytics and machine learning techniques. Current research in Mining Software Repositories (MSR) is capable of ranking and listing faulty components at file granularity. Our goals are to predict software defects at method granularity by mining repositories, and to improve current fault localization techniques with the results from defect prediction. We have implemented a tool called Schwa, freely available on GitHub, that is capable of analyzing Git repositories. For Java we achieved method granularity, and for other languages file granularity. Metrics such as revisions, fixes and authors are tracked and used to feed the prediction model with time relevance. This thesis makes the following contributions: a method to parse and represent diffs from patches with method granularity; a model to compute defect probabilities; a framework for mining software repositories; a technique to learn the importance of tracked metrics; and a method to evaluate the gain of using defect probabilities in fault localization.

    Variationally consistent computational homogenization of chemomechanical problems with stabilized weakly periodic boundary conditions

    A variationally consistent model-based computational homogenization approach for transient chemomechanically coupled problems is developed based on the classical assumption of first-order prolongation of the displacement, chemical potential, and (ion) concentration fields within a representative volume element (RVE). The presence of the chemical potential and the concentration as primary global fields represents a mixed formulation, which has definite advantages. Nonstandard diffusion, governed by a Cahn–Hilliard type of gradient model, is considered under the restriction of miscibility. Weakly periodic boundary conditions on the pertinent fields provide the general variational setting for the uniquely solvable RVE-problem(s). These boundary conditions are introduced with a novel approach in order to control the stability of the boundary discretization, thereby circumventing the need to satisfy the LBB-condition: the penalty stabilized Lagrange multiplier formulation, which enforces stability at the cost of an additional Lagrange multiplier for each weakly periodic field (three fields for the current problem). In particular, a neat result is that the classical Neumann boundary condition is obtained when the penalty becomes very large. In the numerical examples, we investigate the following characteristics: the mesh convergence for different boundary approximations, the sensitivity to the choice of penalty parameter, and the influence of RVE-size on the macroscopic response.

    The Enriched Object Oriented Software processes for Software Fault Prediction

    Software fault prediction is a proven strategy for achieving high software reliability. Predicting fault-prone modules offers one way to support software quality engineering through improved scheduling and project control. Software quality is increasingly important, and testing-related issues are becoming crucial for software. This creates the need for a continuous assessment procedure that classifies these incrementally developed systems as faulty or fault-free. A variety of software fault prediction techniques have been proposed, but the approaches developed by various researchers may not be optimal for predicting faults. We present a fault prediction approach that uses OO metrics alongside cyclomatic complexity and nested block depth. In acceptance testing, each function specified in the design document can be tested independently; that is, a set of test cases is developed for each function, not for each workflow module or other module or component. Our experimental results demonstrate efficient fault prediction with our algorithm parameters. Our approach focuses mainly on the count of faults before testing, the expected number of faults, and our classification, which includes algorithmic and processing, control, logic and sequence, and typographical syntax errors (i.e. incorrect spelling of a variable name, regular iteration of statements, and incorrect initialization statements per module). The proposed classification approach shows optimal results when analyzing the metrics with training samples after estimation.

    bflinks: Reliable Bugfix links via bidirectional references and tuned heuristics

    Background: Data from software data repositories such as source code version archives and defect databases contains valuable information that can be used for insights (leading to subsequent improvements), in particular defect insertion circumstance analysis and defect prediction. The first step in such analyses is identifying defect-correcting changes in the version archive (bugfix commits) and linking them to corresponding entries in the defect database, thus establishing bugfix links, in order to enrich the content of the defect-correcting change with additional meta-data. Typically, identifying the bugfix commits in a version archive is done via heuristic string matching on the commit message. Research questions: Which filters could be used to obtain a set of bugfix links? How does one set the cutoff parameters of each? What effect (results loss and precision) does each filter then have? Which overall precision, results loss, and recall are achieved? Method: We analyze a comprehensive modular set of seven independent filters, including new ones that make use of reverse links. We describe and evaluate visual heuristics (based on simple diagnostic plots) for setting the cutoff parameters of six filters. We apply these to a commercial repository from the Web CMS domain and validate the results with unprecedented precision by making use of a product expert to manually verify over 2500 links. Results: The parameter selection heuristics pick a very good parameter value in five of the six cases and a reasonably good one in the sixth. As a result, the combined filtering, called bflinks, proposes a set of bugfix links that has 93% precision with only 7% results loss. Conclusion: The modular filtering approach can provide high-quality results and can be adapted to repositories with different properties.
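
The "heuristic string matching on the commit message" that the abstract describes as the typical first step can be sketched roughly as follows. This is not one of the seven bflinks filters: the regular expression, the keyword list, and the names `candidate_bug_ids` and `link_commits` are hypothetical, chosen only to show the general idea of extracting defect IDs from messages and keeping those that exist in the defect database.

```python
import re

# Hypothetical pattern: a keyword such as "fix", "fixes", "fixed", "bug" or
# "issue", optionally followed by separators, then a numeric defect ID.
BUG_REF = re.compile(r"\b(?:fix(?:es|ed)?|bug|issue)\b[\s:#-]*(\d+)", re.IGNORECASE)


def candidate_bug_ids(message: str) -> list:
    """Return the defect IDs that a commit message appears to reference."""
    return [int(number) for number in BUG_REF.findall(message)]


def link_commits(commits: dict, known_ids: set) -> dict:
    """Map commit hash -> referenced IDs that exist in the defect database.

    `commits` maps a commit hash to its message; `known_ids` is the set of
    IDs present in the defect database. Commits with no valid ID are dropped.
    """
    links = {}
    for commit_hash, message in commits.items():
        ids = [i for i in candidate_bug_ids(message) if i in known_ids]
        if ids:
            links[commit_hash] = ids
    return links


if __name__ == "__main__":
    commits = {
        "a1b2c3": "Fixes #1234: null pointer in renderer",
        "d4e5f6": "Refactor menu layout",
        "0718aa": "bug 42 and issue-7",
    }
    print(link_commits(commits, known_ids={1234, 42}))
```

A matcher this simple produces both false links and missed links, which is the motivation for adding further filters and tuning their cutoffs.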

    Time variance and defect prediction in software projects: Towards an exploitation of periods of stability and change as well as a notion of concept drift in software projects

    It is crucial for a software manager to know whether or not one can rely on a bug prediction model. A wrong prediction of the number or the location of future bugs can lead to problems in the achievement of a project's goals. In this paper we first verify the existence of variability in a bug prediction model's accuracy over time both visually and statistically. Furthermore, we explore the reasons for such a high variability over time, which includes periods of stability and variability of prediction quality, and formulate a decision procedure for evaluating prediction models before applying them. To exemplify our findings we use data from four open source projects and empirically identify various project features that influence the defect prediction quality. Specifically, we observed that a change in the number of authors editing a file and the number of defects fixed by them influence the prediction quality. Finally, we introduce an approach to estimate the accuracy of prediction models that helps a project manager decide when to rely on a prediction model. Our findings suggest that one should be aware of the periods of stability and variability of prediction quality and should use approaches such as ours to assess their models' accuracy in advance.

    Exploiting Abstract Syntax Trees to Locate Software Defects

    Context. Software defect prediction aims to reduce the large costs involved with faults in a software system. A wide range of traditional software metrics have been evaluated as potential defect indicators. These traditional metrics are derived from the source code or from the software development process. Studies have shown that no metric clearly outperforms another and that identifying defect-prone code using traditional metrics has reached a performance ceiling. Less traditional metrics have been studied, with these metrics being derived from the natural language of the source code. These newer, less traditional and finer grained metrics have shown promise within defect prediction. Aims. The aim of this dissertation is to study the relationship between short Java constructs and the faultiness of source code. To study this relationship this dissertation introduces the concept of a Java sequence and Java code snippet. Sequences are created by using the Java abstract syntax tree. The ordering of the nodes within the abstract syntax tree creates the sequences, while small subsequences of this sequence are the code snippets. The dissertation tries to find a relationship between the code snippets and faulty and non-faulty code. This dissertation also looks at the evolution of the code snippets as a system matures, to discover whether code snippets significantly associated with faulty code change over time. Methods. To achieve the aims of the dissertation, two main techniques have been developed: finding defective code, and extracting Java sequences and code snippets. Finding defective code has been split into two areas: finding the defect fix points and finding the defect insertion points. To find the defect fix points an implementation of the bug-linking algorithm has been developed, called S + e . Two algorithms were developed to extract the sequences and the code snippets. The code snippets are analysed using the binomial test to find which ones are significantly associated with faulty and non-faulty code. These techniques have been performed on five different Java datasets: ArgoUML, AspectJ and three releases of Eclipse.JDT.core. Results. There are significant associations between some code snippets and faulty code. Frequently occurring fault-prone code snippets include those associated with identifiers, method calls and variables. There are some code snippets significantly associated with faults that are always in faulty code. There are 201 code snippets that are significantly associated with faults across all five of the systems. The technique is unable to find any significant associations between code snippets and non-faulty code. The relationship between code snippets and faults seems to change as the system evolves, with more snippets becoming fault-prone as Eclipse.JDT.core evolved over the three releases analysed. Conclusions. This dissertation has introduced the concept of code snippets into software engineering and defect prediction. The use of code snippets offers a promising approach to identifying potentially defective code. Unlike previous approaches, code snippets are based on a comprehensive analysis of low level code features and potentially allow the full set of code defects to be identified. Initial research into the relationship between code snippets and faults has shown that some code constructs or features are significantly related to software faults. The significant associations between code snippets and faults have provided additional empirical evidence for some already researched bad constructs within defect prediction. The code snippets have shown that some constructs significantly associated with faults are located in all five systems, and although this set is small, finding any defect indicators that transfer successfully from one system to another is rare.
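
To make the binomial test mentioned in the abstract concrete, here is a minimal sketch of a one-sided exact test asking whether a snippet occurs in faulty code more often than a baseline rate would predict. The abstract does not say how the test was set up, so the one-sided form, the baseline proportion `p0`, the example counts, and the function name `binomial_upper_tail` are all assumptions for illustration.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p0: float) -> float:
    """Exact one-sided p-value P(X >= k) for X ~ Binomial(n, p0).

    k  : occurrences of the snippet observed in faulty code
    n  : total occurrences of the snippet
    p0 : assumed baseline probability that any occurrence is in faulty code
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Hypothetical counts: a snippet seen 10 times, 9 of them in faulty code,
    # with half of all code assumed faulty as the baseline.
    p_value = binomial_upper_tail(k=9, n=10, p0=0.5)
    print(f"p-value: {p_value:.4f}")
```

When many snippets are tested at once, some correction for multiple comparisons would normally be needed before calling any single association significant.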