18 research outputs found

    Replication types: towards a shared taxonomy

    Context: The software engineering community is becoming more aware of the need for experimental replications. In spite of the importance of this topic, there is still much inconsistency in the terminology used to describe replications. Objective: Understand the perspectives of empirical researchers about various terms used to characterize replications and propose a consistent taxonomy of terms. Method: A survey followed by plenary discussion during the 2013 International Software Engineering Research Network meeting. Results: We propose a taxonomy which consolidates the disparate terminology. This taxonomy had a high level of agreement among workshop attendees. Conclusion: Consistent terminology is important for any field to progress. This work is the first step in that direction. Additional study and discussion are still necessary.

    Replications of software engineering experiments

    There are many open issues that must be addressed before the replication process can be successfully formalized in empirical software engineering research. We define replication as the deliberate repetition of the same empirical study for the purpose of determining whether the results of the first experiment can be reproduced. This definition would appear at first glance to be good. However, it needs several clarifications that have not yet been forthcoming in software engineering: What is the exact meaning of the same empirical study? Namely, how similar should an experiment be to the baseline study for it to be considered a replication? What is the exact meaning of a result being reproduced? Namely, how similar does a result have to be to the result of the baseline study for it to be considered reproduced? These and other methodological questions need to be researched and tailored for empirical software engineering.

    Replication studies considered harmful

    Context: There is growing interest in establishing software engineering as an evidence-based discipline. To that end, replication is often used to gain confidence in empirical findings, as opposed to reproduction, where the goal is showing the correctness or validity of the published results. Objective: To consider what is required for a replication study to confirm the original experiment and apply this understanding in software engineering. Method: Simulation is used to demonstrate why the prediction interval for confirmation can be surprisingly wide. This analysis is applied to three recent replications. Results: It is shown that because the prediction intervals are wide, almost all replications are confirmatory, so in that sense there is no ‘replication crisis’; however, the contributions to knowledge are negligible. Conclusion: Replicating empirical software engineering experiments, particularly if they are under-powered or under-reported, is a waste of scientific resources. By contrast, meta-analysis is strongly advocated so that all relevant experiments are combined to estimate the population effect.

    Investigation of individual factors impacting the effectiveness of requirements inspections: a replicated experiment

    This paper presents a replication of an empirical study regarding the impact of individual factors on the effectiveness of requirements inspections. Experimental replications are important for verifying results and investigating the generality of empirical studies. We utilized the lab package and procedures from the original study, with some changes and additions, to conduct the replication with 69 professional developers in three different companies in Turkey. In general the results of the replication were consistent with those of the original study. The main result from the original study, which is supported in the replication, was that inspectors whose degree is in a field related to software engineering are less effective during a requirements inspection than inspectors whose degrees are in other fields. In addition, we found that Company, Experience, and English Proficiency impacted inspection effectiveness.

    Understanding replication of experiments in software engineering: a classification

    Context: Replication plays an important role in experimental disciplines. There are still many uncertainties about how to proceed with replications of SE experiments. Should replicators reuse the baseline experiment materials? How much liaison should there be among the original and replicating experimenters, if any? What elements of the experimental configuration can be changed for the experiment to be considered a replication rather than a new experiment? Objective: To improve our understanding of SE experiment replication, in this work we propose a classification which is intended to provide experimenters with guidance about what types of replication they can perform. Method: The research approach followed is structured according to the following activities: (1) a literature review of experiment replication in SE and in other disciplines, (2) identification of typical elements that compose an experimental configuration, (3) identification of different replication purposes and (4) development of a classification of experiment replications for SE. Results: We propose a classification of replications which provides experimenters in SE with guidance about what changes they can make in a replication and, based on these, what verification purposes such a replication can serve. The proposed classification helped to accommodate opposing views within a broader framework; it can account for replications ranging from less similar to more similar to the baseline experiment. Conclusion: The aim of replication is to verify results, but different types of replication serve special verification purposes and afford different degrees of change. Each replication type helps to discover particular experimental conditions that might influence the results. The proposed classification can be used to identify changes in a replication and, based on these, understand the level of verification.

    The evolution of the laws of software evolution. A discussion based on a systematic literature review

    After more than 40 years of life, software evolution should be considered as a mature field. However, despite such a long history, many research questions still remain open, and controversial studies about the validity of the laws of software evolution are common. During the first part of these 40 years the laws themselves evolved to adapt to changes in both the research and the software industry environments. This process of adaption to new paradigms, standards, and practices stopped about 15 years ago, when the laws were revised for the last time. However, most controversial studies have been raised during this latter period. Based on a systematic and comprehensive literature review, in this paper we describe how and when the laws, and the software evolution field, evolved. We also address the current state of affairs about the validity of the laws, how they are perceived by the research community, and the developments and challenges that are likely to occur in the coming years

    Analyzing and Predicting Effort Associated with Finding and Fixing Software Faults

    Context: Software developers spend a significant amount of time fixing faults. However, not many papers have addressed the actual effort needed to fix software faults. Objective: The objective of this paper is twofold: (1) analysis of the effort needed to fix software faults and how it was affected by several factors and (2) prediction of the level of fix implementation effort based on the information provided in software change requests. Method: The work is based on data related to 1200 failures, extracted from the change tracking system of a large NASA mission. The analysis includes descriptive and inferential statistics. Predictions are made using three supervised machine learning algorithms and three sampling techniques aimed at addressing the imbalanced data problem. Results: Our results show that (1) 83% of the total fix implementation effort was associated with only 20% of failures. (2) Both safety critical failures and post-release failures required three times more effort to fix compared to non-critical and pre-release counterparts, respectively. (3) Failures with fixes spread across multiple components or across multiple types of software artifacts required more effort. The spread across artifacts was more costly than spread across components. (4) Surprisingly, some types of faults associated with later life-cycle activities did not require significant effort. (5) The level of fix implementation effort was predicted with 73% overall accuracy using the original, imbalanced data. Using oversampling techniques improved the overall accuracy up to 77%. More importantly, oversampling significantly improved the prediction of the high level effort, from 31% to around 85%. 
    Conclusions: This paper shows the importance of tying software failures to changes made to fix all associated faults, in one or more software components and/or in one or more software artifacts, and the benefit of studying how the spread of faults and other factors affect the fix implementation effort.

    Empirical validation of a usability inspection method for model-driven Web development

    Web applications should be usable in order to be accepted by users and to improve their success probability. Despite the fact that this requirement has promoted the emergence of several usability evaluation methods, there is a need for empirically validated methods that provide evidence about their effectiveness and that can be properly integrated into early stages of Web development processes. Model-driven Web development processes have grown in popularity over the last few years, and offer a suitable context in which to perform early usability evaluations due to their intrinsic traceability mechanisms. These issues have motivated us to propose a Web Usability Evaluation Process (WUEP) which can be integrated into model-driven Web development processes. This paper presents a family of experiments that we have carried out to empirically validate WUEP. The family of experiments was carried out by 64 participants, including PhD and Master's computer science students. The objective of the experiments was to evaluate the participants' effectiveness, efficiency, perceived ease of use and perceived satisfaction when using WUEP in comparison to a widely used industrial inspection method: Heuristic Evaluation (HE). The statistical analysis and meta-analysis of the data obtained separately from each experiment indicated that WUEP is more effective and efficient than HE in the detection of usability problems. The evaluators were also more satisfied when applying WUEP, and found it easier to use than HE. Although further experiments must be carried out to strengthen these results, WUEP has proved to be a promising usability inspection method for Web applications which have been developed by using model-driven development processes.
    The authors would like to thank all the participants in the experiments, along with the usability experts that supported certain tasks of the evaluation design stage, and of which the control group was composed. This research work is funded by the MULTIPLE project (TIN2009-13838) and the FPU program (AP2007-03731) from the Spanish Ministry of Science and Education.
    Fernández Martínez, A.; Abrahao Gonzales, SM.; Insfrán Pelozo, CE. (2013). Empirical validation of a usability inspection method for model-driven Web development. Journal of Systems and Software. 86(1):161-186. https://doi.org/10.1016/j.jss.2012.07.043
