    A study of subgroup discovery approaches for defect prediction

    Context: Although many papers have been published on software defect prediction techniques, machine learning approaches have yet to be fully explored. Objective: In this paper we suggest using a descriptive approach for defect prediction rather than the precise classification techniques that are usually adopted. This allows us to characterise defective modules with simple rules that can easily be applied by practitioners and deliver a practical (or engineering) approach rather than a highly accurate result. Method: We describe two well-known subgroup discovery algorithms, the SD algorithm and the CN2-SD algorithm, to obtain rules that identify defect-prone modules. The empirical work is performed with publicly available datasets from the Promise repository and object-oriented metrics from an Eclipse repository related to defect prediction. Subgroup discovery algorithms mitigate characteristics of datasets that hinder the applicability of classification algorithms and so remove the need for preprocessing techniques. Results: The results show that the generated rules can be used to guide testing effort in order to improve the quality of software development projects. Such rules can indicate metrics, their threshold values and relationships between metrics of defective modules. Conclusions: The induced rules are simple to use and easy to understand as they provide a description rather than a complete classification of the whole dataset. Thus this paper represents an engineering approach to defect prediction, i.e., an approach which is useful in practice, easily understandable and can be applied by practitioners. Funding: ICEBERG IAPP-2012-324356; MICINN TIN2011-28956-C02-0

    Searching for rules to detect defective modules: A subgroup discovery approach

    Data mining methods in software engineering are becoming increasingly important as they can support several aspects of the software development life-cycle such as quality. In this work, we present a data mining approach to induce rules extracted from static software metrics characterising fault-prone modules. Due to the special characteristics of defect prediction data (imbalance, inconsistency, redundancy), not all classification algorithms are capable of dealing with this task conveniently. To deal with these problems, Subgroup Discovery (SD) algorithms can be used to find groups of statistically different data given a property of interest. We propose EDER-SD (Evolutionary Decision Rules for Subgroup Discovery), an SD algorithm based on evolutionary computation that induces rules describing only fault-prone modules. Rules are a well-known model representation that can be easily understood and applied by project managers and quality engineers. Thus, rules can help them to develop software systems that can be justifiably trusted. Contrary to other approaches in SD, our algorithm has the advantage of working with continuous variables, as the conditions of the rules are defined using intervals. We describe the rules obtained by applying our algorithm to seven publicly available datasets from the PROMISE repository, showing that they are capable of characterising subgroups of fault-prone modules. We also compare our results with three other well-known SD algorithms; the EDER-SD algorithm performs well in most cases. Funding: Ministerio de Educación y Ciencia TIN2007-68084-C02-00; Ministerio de Educación y Ciencia TIN2010-21715-C02-0
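The idea of an interval-based rule that describes only fault-prone modules can be illustrated with a minimal Python sketch. It is not the EDER-SD algorithm: the `IntervalRule` class, the `loc` metric, and the toy modules are hypothetical, and weighted relative accuracy (WRAcc) is used here only as an assumed, commonly used subgroup quality measure that the abstract itself does not name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalRule:
    """Hypothetical rule: lower <= module[metric] <= upper => fault-prone."""
    metric: str
    lower: float
    upper: float

    def covers(self, module: dict) -> bool:
        # A module is covered when its metric value lies inside the interval.
        return self.lower <= module[self.metric] <= self.upper


def wracc(rule: IntervalRule, modules: list, label: str = "defective") -> float:
    """Weighted relative accuracy: p(cond) * (p(class | cond) - p(class))."""
    n = len(modules)
    covered = [m for m in modules if rule.covers(m)]
    if n == 0 or not covered:
        return 0.0
    p_cond = len(covered) / n
    p_class = sum(m[label] for m in modules) / n
    p_class_given_cond = sum(m[label] for m in covered) / len(covered)
    return p_cond * (p_class_given_cond - p_class)


# Toy modules, invented for illustration only (not taken from any dataset).
modules = [
    {"loc": 20, "defective": False},
    {"loc": 35, "defective": False},
    {"loc": 50, "defective": False},
    {"loc": 120, "defective": True},
    {"loc": 200, "defective": True},
    {"loc": 300, "defective": False},
    {"loc": 15, "defective": False},
    {"loc": 150, "defective": True},
]

rule = IntervalRule(metric="loc", lower=100, upper=250)
score = wracc(rule, modules)
print(f"WRAcc of {rule}: {score}")
```

A positive score means the covered subgroup contains a higher share of defective modules than the dataset as a whole. A search procedure, evolutionary or otherwise, would compare many candidate intervals by such a measure.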

    Subgroup Discovery for Defect Prediction


    Superfluid Helium 3: Link between Condensed Matter Physics and Particle Physics

    The discovery of the superfluid phases of Helium 3 in 1971 opened the door to one of the most fascinating systems known in condensed matter physics. Superfluidity of Helium 3, originating from pair condensation of Helium 3 atoms, turned out to be the ideal testing ground for many fundamental concepts of modern physics, such as macroscopic quantum phenomena, (gauge-)symmetries and their spontaneous breakdown, topological defects, etc. Thereby the superfluid phases of Helium 3 enriched condensed matter physics enormously. In particular, they contributed significantly, and continue to contribute, to our understanding of various other physical systems, from heavy fermion and high-Tc superconductors all the way to neutron stars, particle physics, gravity and the early universe. A simple introduction to the basic concepts and questions is presented. Comment: 11 pages, 2 figures; to be published in Acta Physica Polonica B [Proceedings of the XL Jubilee Cracow School of Theoretical Physics on "Quantum Phase Transitions in High Energy and Condensed Matter Physics"; 3-11 June, 2000, Zakopane, Poland]

    Phase Transitions of S=1 Spinor Condensates in an Optical Lattice

    We study the phase diagram of spin-one polar condensates in a two-dimensional optical lattice with magnetic anisotropy. We show that the topological binding of vorticity to nematic disclinations allows for a rich variety of phase transitions. These include Kosterlitz-Thouless-like transitions with a superfluid stiffness jump that can be experimentally tuned to take a continuous set of values, and a new cascaded Kosterlitz-Thouless transition, characterized by two divergent length scales. For higher integer spin bosons S, the thermal phase transition out of the planar polar phase is strongly affected by the parity of S. Comment: 9 pages, 7 figures; v4 - Expanded manuscript

    An Algorithmic Approach to Missing Data Problem in Modeling Human Aspects in Software Development

    Background: In our previous research, we built defect prediction models by using confirmation bias metrics. Due to confirmation bias, developers tend to perform unit tests that make their programs run rather than tests that break their code. This, in turn, leads to an increase in defect density. The performance of the prediction model built using confirmation bias metrics was as good as that of models built with static code or churn metrics. Aims: Collection of confirmation bias metrics may result in partially "missing data" due to developers' tight schedules, evaluation apprehension and lack of motivation, as well as staff turnover. In this paper, we employ the Expectation-Maximization (EM) algorithm to impute missing confirmation bias data. Method: We used four datasets from two large-scale companies. For each dataset, we generated all possible missing data configurations and then employed Roweis' EM algorithm to impute the missing data. We built defect prediction models using the imputed data and compared their performance with that of models built using complete data. Results: In all datasets, when the percentage of missing data is less than or equal to 50% on average, our proposed model built on imputed data yielded performance comparable to that of the models built on complete data. Conclusions: We may encounter the "missing data" problem in building defect prediction models. Our results show that instead of discarding missing or noisy data, in our case confirmation bias metrics, we can use effective techniques such as EM-based imputation to overcome this problem.
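EM-based imputation can be sketched in Python with NumPy under a strong simplifying assumption. The code below fits a single multivariate Gaussian to the data and fills each missing entry with its conditional mean. It is a generic textbook variant, not Roweis' EM algorithm as used in the paper. The function name `em_impute`, its parameters, and the toy matrix are all hypothetical, and every column is assumed to have at least one observed value.

```python
import numpy as np


def em_impute(X, n_iter=50, ridge=1e-6):
    """Impute NaNs by EM under an assumed multivariate Gaussian model."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    n, d = X.shape

    # Initialise with column means of the observed entries.
    mu = np.nanmean(X, axis=0)
    filled = np.where(missing, mu, X)
    sigma = np.cov(filled, rowvar=False, bias=True) + ridge * np.eye(d)

    for _ in range(n_iter):
        cond_cov_sum = np.zeros((d, d))
        # E-step: conditional mean and covariance of the missing entries.
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed in this row: fall back to the prior.
                filled[i] = mu
                cond_cov_sum += sigma
                continue
            s_oo = sigma[np.ix_(o, o)]
            s_mo = sigma[np.ix_(m, o)]
            gain = np.linalg.solve(s_oo, s_mo.T).T  # s_mo @ inv(s_oo)
            filled[i, m] = mu[m] + gain @ (X[i, o] - mu[o])
            cond_cov_sum[np.ix_(m, m)] += sigma[np.ix_(m, m)] - gain @ s_mo.T
        # M-step: re-estimate mean and covariance from the completed data.
        mu = filled.mean(axis=0)
        centered = filled - mu
        sigma = (centered.T @ centered + cond_cov_sum) / n + ridge * np.eye(d)

    return filled


# Toy matrix, invented for illustration: the second column is 2 * first + 1.
x1 = np.arange(10.0)
data = np.column_stack([x1, 2.0 * x1 + 1.0])
data[3, 1] = np.nan  # true value 7.0
data[7, 0] = np.nan  # true value 7.0
completed = em_impute(data)
print(completed[3, 1], completed[7, 0])
```

Observed entries are left untouched and only the NaN positions are overwritten. A downstream defect prediction model would then be trained on `completed` in place of the incomplete matrix.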