
    Feature Subset Selection and Instance Filtering for Cross-project Defect Prediction - Classification and Ranking

    Defect prediction models can be a good tool for organizing a project's test resources. The models can be constructed with two main goals: 1) to classify the software parts as defective or not; or 2) to rank the parts in decreasing order of defectiveness. However, not all companies maintain an appropriate set of historical defect data. In this case, a company can build a suitable dataset from known external projects, an approach called Cross-project Defect Prediction (CPDP). CPDP models, however, present low prediction performance due to the heterogeneity of the data. Recently, Instance Filtering methods were proposed to reduce this heterogeneity by selecting the most similar instances from the training dataset. Originally, similarity is calculated over all available dataset features (or independent variables). We propose that using only the most relevant features in the similarity calculation can result in more accurately filtered datasets and better prediction performance. In this study we extend our previous work and analyse both prediction goals, Classification and Ranking. We present an empirical evaluation of 41 different methods that associate Instance Filtering methods with Feature Selection methods. We used 36 versions of 11 open source projects in the experiments. The results show similar evidence for both prediction goals. First, the prediction performance of CPDP models can be improved by combining Feature Selection with Instance Filtering. Second, no evaluated method performed best in general; indeed, the most appropriate method can vary according to the characteristics of the project being predicted.
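
    To make the filtering idea concrete, the sketch below shows one plausible way to combine feature selection with similarity-based instance filtering for CPDP. It is not the authors' implementation: the mutual-information scorer, the number of selected features, and the neighbour count are illustrative assumptions.

```python
# Minimal sketch (assumed workflow, not the paper's code): filter a cross-project
# training pool by nearest-neighbour similarity computed only over a selected
# feature subset, then train any classifier on the filtered data.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.neighbors import NearestNeighbors

def filter_training_set(X_cross, y_cross, X_target, n_features=5, n_neighbors=10):
    # 1) Keep only the most relevant features of the cross-project pool.
    selector = SelectKBest(mutual_info_classif, k=n_features).fit(X_cross, y_cross)
    Xc, Xt = selector.transform(X_cross), selector.transform(X_target)

    # 2) For each target-project instance, keep its nearest cross-project
    #    instances, measuring similarity only in the reduced feature space.
    nn = NearestNeighbors(n_neighbors=n_neighbors).fit(Xc)
    idx = np.unique(nn.kneighbors(Xt, return_distance=False))
    return X_cross[idx], y_cross[idx]

# Usage: X_f, y_f = filter_training_set(X_cross, y_cross, X_target)
# then fit a classifier (e.g. sklearn.ensemble.RandomForestClassifier) on X_f, y_f.
```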

    A transformational language for mutant description

    Mutation testing has been used to assess the quality of test case suites by analyzing their ability to distinguish the artifact under testing from a set of alternative artifacts, the so-called mutants. The mutants are generated from the artifact under testing by applying a set of mutant operators, which produce artifacts with simple syntactical differences. The mutant operators are usually based on typical errors that occur during software development and can be related to a fault model. In this paper, we propose a language, named MuDeL (MUtant DEfinition Language), for the definition of mutant operators, aiming not only at automating mutant generation but also at providing precision and formality to the operator definitions. The proposed language is based on concepts from the transformational and logical programming paradigms, as well as from context-free grammar theory. The formal framework of denotational semantics is employed to define the semantics of the MuDeL language. We also describe a system, named mudelgen, developed to support the use of this language. An executable representation of the denotational semantics of the language is used to check the correctness of the mudelgen implementation. Finally, a mutant generator module is produced, which can be incorporated into a specific mutation testing tool or environment.
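
    As a rough illustration of what a mutant operator does (this is not MuDeL syntax, just an assumed analogue in Python), the sketch below applies a simple arithmetic-operator-replacement operator to a small function and prints the resulting mutant.

```python
# Illustrative only: an "arithmetic operator replacement" mutant operator
# implemented with Python's standard ast module. The operator and the sample
# function are invented for illustration; MuDeL defines operators declaratively.
import ast

class ArithmeticOperatorReplacement(ast.NodeTransformer):
    """Replace every '+' with '-' to produce a syntactically small variant (a mutant)."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

original = "def total(price, tax):\n    return price + tax\n"
mutant_tree = ArithmeticOperatorReplacement().visit(ast.parse(original))
print(ast.unparse(ast.fix_missing_locations(mutant_tree)))
# Output:
# def total(price, tax):
#     return price - tax
```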

    Learning by Sampling: Learning Behavioral Family Models from Software Product Lines

    Family-based behavioral analysis operates on a single specification artifact, referred to as a family model, annotated with feature constraints that express behavioral variability in terms of conditional states and transitions. Family-based behavioral modeling paves the way for efficient model-based analysis of software product lines. Family-based behavioral model learning combines feature model analysis and model learning principles to efficiently unify product models into a family model and integrate the behavior of the various products into a behavioral family model. Albeit reasonably effective, exhaustive analysis of product lines is often infeasible due to the potentially exponential number of valid configurations. In this paper, we first present a family-based behavioral model learning technique, called FFSMDiff. Subsequently, we report on our experience with learning family models by employing product sampling. Using 105 products of six product lines expressed as Mealy machines, we evaluate the precision of family models learned from products selected under different settings of the T-wise product sampling criterion. We show that product sampling can lead to models as precise as those learned by exhaustive analysis and hence reduce the cost of family model learning.
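
    To illustrate the notion of a family model with feature-annotated transitions (a toy data structure, not FFSMDiff and not the paper's representation), the sketch below encodes a tiny Mealy-style family model and projects it onto one product configuration. The vending-machine behavior and feature names are invented.

```python
# Toy illustration: a family model as a set of Mealy transitions carrying
# feature constraints; projecting it onto a configuration yields the product's
# own Mealy machine. All names here are assumptions for illustration.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass(frozen=True)
class Transition:
    source: str
    input: str
    output: str
    target: str
    guard: Callable[[Dict[str, bool]], bool]  # feature constraint

FAMILY_MODEL: List[Transition] = [
    Transition("idle", "coin", "accepted", "paid", lambda f: True),
    Transition("paid", "coffee", "serve_coffee", "idle", lambda f: f["Coffee"]),
    Transition("paid", "tea", "serve_tea", "idle", lambda f: f["Tea"]),
]

def project(family: List[Transition], config: Dict[str, bool]) -> List[Transition]:
    """Keep only the transitions whose feature constraint holds in this product."""
    return [t for t in family if t.guard(config)]

# A product with Coffee but without Tea yields a smaller product Mealy machine.
coffee_only = project(FAMILY_MODEL, {"Coffee": True, "Tea": False})
```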

    Preface to the special issues devoted to CLEI 2015

    As joint invited editors, we are proud to present the April 2016 and August 2016 issues of CLEIej, which include a number of revised and reviewed versions of the best papers presented at CLEI 2015 in Arequipa, Peru, in October 2015. Authors were asked to prepare extended papers with new contributions relative to the conference versions; a total of 16 papers were accepted and are now published in these two issues.