11,485 research outputs found

    Understanding replication of experiments in software engineering: a classification

    Context: Replication plays an important role in experimental disciplines. There are still many uncertainties about how to proceed with replications of SE experiments. Should replicators reuse the baseline experiment materials? How much liaison should there be among the original and replicating experimenters, if any? What elements of the experimental configuration can be changed for the experiment to be considered a replication rather than a new experiment? Objective: To improve our understanding of SE experiment replication, in this work we propose a classification intended to provide experimenters with guidance about what types of replication they can perform. Method: The research approach is structured according to the following activities: (1) a literature review of experiment replication in SE and in other disciplines, (2) identification of the typical elements that compose an experimental configuration, (3) identification of different replication purposes, and (4) development of a classification of experiment replications for SE. Results: We propose a classification of replications which provides experimenters in SE with guidance about what changes they can make in a replication and, based on these, what verification purposes such a replication can serve. The proposed classification helped to accommodate opposing views within a broader framework, and it can account for replications ranging from less to more similar to the baseline experiment. Conclusion: The aim of replication is to verify results, but different types of replication serve particular verification purposes and afford different degrees of change. Each replication type helps to discover the experimental conditions that might influence the results. The proposed classification can be used to identify the changes made in a replication and, based on these, understand the level of verification it provides.
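
    As a purely hypothetical sketch of the core idea, characterizing a replication by which elements of the experimental configuration changed and mapping that to a replication type, consider the following; the element and type names are illustrative inventions, not the paper's actual taxonomy:

```python
# Hypothetical sketch: classify a replication by which configuration
# elements differ from the baseline experiment. Names are illustrative.
BASELINE = {"protocol": "A", "materials": "M1", "experimenters": "team-X",
            "site": "lab-1", "population": "students"}

def classify(replication: dict) -> str:
    changed = {k for k in BASELINE if replication.get(k) != BASELINE[k]}
    if not changed:
        return "repetition (no elements changed)"
    if changed <= {"experimenters", "site"}:
        return "close replication (protocol and materials unchanged)"
    return "differentiated replication (changed: " + ", ".join(sorted(changed)) + ")"

print(classify({**BASELINE, "site": "lab-2"}))
print(classify({**BASELINE, "materials": "M2", "population": "practitioners"}))
```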

    Heterogeneity of Research Results: A New Perspective From Which to Assess and Promote Progress in Psychological Science

    Heterogeneity emerges when multiple close or conceptual replications on the same subject produce results that vary more than expected from sampling error alone. Here we argue that unexplained heterogeneity reflects a lack of coherence between the concepts applied and the data observed, and therefore a lack of understanding of the subject matter. Typical levels of heterogeneity thus offer a useful but neglected perspective on the levels of understanding achieved in psychological science. Focusing on continuous outcome variables, we surveyed heterogeneity in 150 meta-analyses from cognitive, organizational, and social psychology and 57 multiple close replications. Heterogeneity proved to be very high in meta-analyses, with powerful moderators being conspicuously absent. Population effects in the average meta-analysis vary from small to very large for reasons that are typically not understood. In contrast, heterogeneity was moderate in close replications. A newly identified relationship between heterogeneity and effect size allowed us to make predictions about expected heterogeneity levels. We discuss important implications for the formulation and evaluation of theories in psychology. On the basis of insights from the history and philosophy of science, we argue that the reduction of heterogeneity is important for progress in psychology and its practical applications, and we suggest changes to our collective research practice toward this end.
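
    The "more than expected from sampling error" comparison is what standard heterogeneity statistics formalize. The following sketch computes Cochran's Q, I², and the DerSimonian-Laird τ² for a handful of invented effect sizes; it illustrates the statistics such a survey relies on, not the paper's own data or code:

```python
# Illustrative only: Cochran's Q, I^2, and DerSimonian-Laird tau^2 --
# the standard measures of variation beyond sampling error.
import numpy as np

y = np.array([0.30, 0.12, 0.45, 0.05, 0.38])   # per-study effect sizes (hypothetical)
v = np.array([0.02, 0.03, 0.02, 0.04, 0.03])   # per-study sampling variances (hypothetical)

w = 1.0 / v                                    # inverse-variance (fixed-effect) weights
y_fe = np.sum(w * y) / np.sum(w)               # fixed-effect pooled estimate
Q = np.sum(w * (y - y_fe) ** 2)                # Cochran's Q: observed vs expected dispersion
df = len(y) - 1
I2 = max(0.0, (Q - df) / Q) * 100              # share of variance beyond sampling error
tau2 = max(0.0, (Q - df) / (np.sum(w) - np.sum(w**2) / np.sum(w)))  # DerSimonian-Laird

print(f"Q = {Q:.2f} on {df} df, I^2 = {I2:.0f}%, tau^2 = {tau2:.4f}")
```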

    Replicability or reproducibility? On the replication crisis in computational neuroscience and sharing only relevant detail

    Replicability and reproducibility of computational models have been somewhat understudied by “the replication movement.” In this paper, we draw on methodological studies into the replicability of psychological experiments and on the mechanistic account of explanation to analyze the functions of model replications and model reproductions in computational neuroscience. We contend that model replicability, or independent researchers' ability to obtain the same output using original code and data, and model reproducibility, or independent researchers' ability to recreate a model without original code, serve different functions and fail for different reasons. This means that measures designed to improve model replicability may not enhance (and, in some cases, may actually damage) model reproducibility. We claim that although both are undesirable, low model reproducibility poses more of a threat to long-term scientific progress than low model replicability. In our opinion, low model reproducibility stems mostly from authors omitting to provide crucial information in scientific papers, and we stress that sharing all computer code and data is not a solution. Reports of computational studies should remain selective and include all and only relevant bits of code.
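
    To make the distinction concrete: a model is reproducible in the authors' sense when its published description alone (equations plus parameters) suffices to rebuild it. The toy sketch below, a textbook leaky integrate-and-fire neuron with generic parameter values unrelated to any specific study, shows the kind of compact "relevant detail" that lets a reader rewrite a model without the original code:

```python
# Toy illustration: if a paper reports the membrane equation
# dV/dt = (-(V - V_rest) + R*I) / tau and these parameters, a reader can
# reproduce the model from the description alone. Values are generic
# textbook numbers, not from any specific study.
import numpy as np

def lif_spike_times(I=1.6, tau=20.0, R=10.0, V_rest=-65.0, V_th=-50.0,
                    V_reset=-65.0, dt=0.1, t_max=200.0):
    """Leaky integrate-and-fire neuron; returns spike times in ms."""
    V, spikes = V_rest, []
    for step in range(int(t_max / dt)):
        V += dt * (-(V - V_rest) + R * I) / tau   # Euler step of the membrane equation
        if V >= V_th:                             # threshold crossing -> spike
            spikes.append(step * dt)
            V = V_reset
    return spikes

print(lif_spike_times()[:5])
```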

    Feeling the future: A meta-analysis of 90 experiments on the anomalous anticipation of random future events

    In 2011, one of the authors (DJB) published a report of nine experiments in the Journal of Personality and Social Psychology purporting to demonstrate that an individual's cognitive and affective responses can be influenced by randomly selected stimulus events that do not occur until after his or her responses have already been made and recorded, a generalized variant of the phenomenon traditionally denoted by the term precognition. To encourage replications, all materials needed to conduct them were made available on request. We here report a meta-analysis of 90 experiments from 33 laboratories in 14 countries which yielded an overall effect greater than 6 sigma, z = 6.40, p = 1.2 × 10⁻¹⁰, with an effect size (Hedges' g) of 0.09. A Bayesian analysis yielded a Bayes Factor of 5.1 × 10⁹, greatly exceeding the criterion value of 100 for “decisive evidence” in support of the experimental hypothesis. When DJB's original experiments are excluded, the combined effect size for replications by independent investigators is 0.06, z = 4.16, p = 1.1 × 10⁻⁵, and the BF value is 3,853, again exceeding the criterion for “decisive evidence.” The number of potentially unretrieved experiments required to reduce the overall effect size of the complete database to a trivial value of 0.01 is 544, and seven of eight additional statistical tests support the conclusion that the database is not significantly compromised by either selection bias or intense “p-hacking”, the selective suppression of findings or analyses that failed to yield statistical significance. P-curve analysis, a recently introduced statistical technique, estimates the true effect size of the experiments to be 0.20 for the complete database and 0.24 for the independent replications, virtually identical to the effect size of DJB's original experiments (0.22) and the closely related “presentiment” experiments (0.21). We discuss the controversial status of precognition and other anomalous effects collectively known as psi.
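
    For readers unfamiliar with how 90 experiments are combined into a single z and effect size, the sketch below shows a standard fixed-effect, inverse-variance pooling of Hedges' g values. The numbers are invented for illustration and are not the meta-analysis data:

```python
# Minimal sketch of inverse-variance pooling as used in meta-analyses;
# the g values and standard errors here are invented, not taken from
# the 90 experiments.
import numpy as np
from scipy.stats import norm

g  = np.array([0.12, 0.05, 0.10, 0.08, 0.03])  # hypothetical Hedges' g per experiment
se = np.array([0.05, 0.04, 0.06, 0.05, 0.04])  # hypothetical standard errors

w = 1.0 / se**2                                 # weight each study by its precision
g_pooled = np.sum(w * g) / np.sum(w)            # pooled effect size
se_pooled = np.sqrt(1.0 / np.sum(w))
z = g_pooled / se_pooled
p = norm.sf(z)                                  # one-tailed p, as reported in the abstract

print(f"pooled g = {g_pooled:.3f}, z = {z:.2f}, p = {p:.2e}")
```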

    Does “Evaluating Journal Quality and the Association for Information Systems Senior Scholars Journal Basket…” Support the Basket with Bibliometric Measures?

    We re-examine “Evaluating Journal Quality and the Association for Information Systems Senior Scholars Journal Basket…” by Lowry et al. (2013). They sought to use bibliometric methods to validate the Basket as the eight top quality journals that are “strictly speaking, IS journals” (Lowry et al., 2013, pp. 995, 997). They examined 21 journals out of 140 journals considered as possible IS journals. We expand the sample to 73 of the 140 journals. Our sample includes a wider range of approaches to IS, although all were suggested by IS scholars in a survey by Lowry and colleagues. We also use the same sample of 21 journals as Lowry et al., with the same methods of analysis as far as possible. With the narrow sample, we replicate Lowry et al. as closely as we can, whereas with the broader sample we employ a conceptual replication. This latter replication also employs alternative methods. For example, we consider citations (a quality measure) and centrality (a relevance measure in this context) as distinct, rather than merging them as in Lowry et al. High centrality scores from the sample of 73 journals do not necessarily indicate close connections with IS. Therefore, we determine which journals are of high quality and closely connected with the Basket and with their sample. These results support the broad purpose of Lowry et al., identifying a wider set of relevant, top quality journals than just MISQ and ISR.
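
    The citations-versus-centrality distinction can be illustrated with a toy citation graph. The sketch below uses hypothetical citation links among a few IS journals (only MISQ and ISR are named in the abstract; the links themselves are invented) and PageRank as one example of a centrality measure, not necessarily the one used in the study:

```python
# Sketch of treating citation counts (a quality proxy) and network
# centrality (a relevance proxy) as distinct measures. The citation
# links below are entirely hypothetical.
import networkx as nx

G = nx.DiGraph()                       # edge A -> B means "A cites B"
G.add_edges_from([
    ("MISQ", "ISR"), ("ISR", "MISQ"), ("JMIS", "MISQ"),
    ("JMIS", "ISR"), ("EJIS", "MISQ"), ("ISR", "JMIS"),
])

citations = dict(G.in_degree())        # crude quality proxy: times cited in the graph
centrality = nx.pagerank(G)            # relevance proxy: position in the network

for j in G.nodes:
    print(f"{j}: cited {citations[j]} times, centrality {centrality[j]:.3f}")
```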

    Using configuration management and product line software paradigms to support the experimentation process in software engineering.

    There is no empirical evidence whatsoever to support most of the beliefs on which software construction is based. We do not yet know the adequacy, limits, qualities, costs and risks of the technologies used to develop software. Experimentation helps to check and convert beliefs and opinions into facts. This research is concerned with replication. Replication is a key component for gathering empirical evidence on software development that can be used in industry to build better software more efficiently. Replication has not been easy to do in software engineering (SE) because the experimental paradigm applied to software development is still immature. Nowadays, a replication is executed mostly using a traditional replication package. But traditional replication packages do not appear to have been as effective as expected for transferring information among researchers in SE experimentation. The trouble spot appears to be the replication setup: version management problems with materials, instruments, documents, etc. make it hard to obtain enough detail about the experiment to reproduce it as exactly as possible. We address the problem of information exchange among experimenters by developing a schema to characterize replications. We will adapt configuration management and product line ideas to support the experimentation process. This will enable researchers to make systematic decisions based on explicit knowledge rather than assumptions about replications. This research will output a replication support web environment. This environment will not only archive but also manage experimental materials flexibly enough to allow both similar and differentiated replications with massive experimental data storage. The platform should be accessible to several research groups working together on the same families of experiments.
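
    As a rough sketch of how configuration management and product line ideas might combine here (an illustration of the general idea, not the environment's actual design), a replication package can be modeled as versioned core assets plus variation points that each replication binds differently:

```python
# Hypothetical sketch: a replication package as a product line. Core
# assets are versioned (configuration management); variation points are
# rebound per replication (product line derivation). Field names are
# illustrative inventions.
from dataclasses import dataclass, field

@dataclass
class Asset:
    name: str
    version: str          # every experimental asset carries a version

@dataclass
class ReplicationPackage:
    core: list[Asset]                       # assets reused by every replication
    variation_points: dict[str, str] = field(default_factory=dict)

    def derive(self, **bindings) -> "ReplicationPackage":
        """Derive a differentiated replication by rebinding variation points."""
        return ReplicationPackage(self.core, {**self.variation_points, **bindings})

baseline = ReplicationPackage(
    core=[Asset("training-slides", "1.2"), Asset("task-forms", "2.0")],
    variation_points={"subjects": "students", "site": "lab-1"},
)
print(baseline.derive(site="lab-2").variation_points)
```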

    Discovery and Communication of Important Marketing Findings: Evidence and Proposals

    My review of empirical research on scientific publication led to the following conclusions. Three criteria are useful for identifying whether findings are important: replication, validity, and usefulness. A fourth criterion, surprise, applies in some situations. Based on these criteria, important findings resulting from academic research in marketing seem to be rare. To a large extent, this rarity is due to a reward system that is built around subjective peer review. Rather than using peer review as a secret screening process, an open process will likely improve papers and inform readers. Researchers, journals, business schools, funding agencies, and professional organizations can all contribute to improving the process. For example, researchers should do directed research on papers that contribute to principles. Journals should invite papers that contribute to principles. Business school administrators should reward researchers who make important findings. Funding agencies should base decisions on researchers' prior success in making important findings, and professional organizations should maintain web sites that describe what is known about principles and what research is needed on principles.