8,092 research outputs found

    The Risk of Using the Q Heterogeneity Estimator for Software Engineering Experiments

    Full text link
    All meta-analyses should include a heterogeneity analysis. Even so, it is not easy to decide whether a set of studies is homogeneous or heterogeneous because of the low statistical power of the statistics used (usually the Q test). Objective: Determine a set of rules enabling SE researchers to find out, based on the characteristics of the experiments to be aggregated, whether or not it is feasible to accurately detect heterogeneity. Method: Evaluate the statistical power of heterogeneity detection methods using a Monte Carlo simulation process. Results: The Q test is not powerful when the meta-analysis contains up to a total of about 200 experimental subjects and the effect size difference is less than 1. Conclusions: The Q test cannot be used as a decision-making criterion for meta-analysis in small sample settings like SE. Random effects models should be used instead of fixed effects models. Caution should be exercised when applying Q test-mediated decomposition into subgroups.
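    A minimal sketch (not from the paper; the number of studies, group sizes, and effect-size gap are illustrative assumptions) of how the power of Cochran's Q test can be estimated by Monte Carlo simulation, mirroring the method the abstract describes:

    # Hypothetical sketch: Monte Carlo estimate of the power of Cochran's Q test
    # to detect heterogeneity across k small experiments.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    def cochran_q(effects, variances):
        """Cochran's Q statistic for k effect sizes with known sampling variances."""
        w = 1.0 / variances
        pooled = np.sum(w * effects) / np.sum(w)
        return np.sum(w * (effects - pooled) ** 2)

    def q_test_power(k=5, n_per_group=20, delta=0.5, reps=2000, alpha=0.05):
        """Fraction of replications in which Q rejects homogeneity when the true
        standardized effect differs by `delta` between two halves of the studies."""
        true_d = np.where(np.arange(k) < k // 2, 0.0, delta)   # heterogeneous truth
        rejections = 0
        for _ in range(reps):
            d_hat = np.empty(k)
            var_d = np.empty(k)
            for i in range(k):
                # simulate one experiment: two groups of n_per_group subjects
                ctrl = rng.normal(0.0, 1.0, n_per_group)
                trt = rng.normal(true_d[i], 1.0, n_per_group)
                sp = np.sqrt((ctrl.var(ddof=1) + trt.var(ddof=1)) / 2)
                d_hat[i] = (trt.mean() - ctrl.mean()) / sp
                # large-sample variance of Cohen's d for equal group sizes
                var_d[i] = 2 / n_per_group + d_hat[i] ** 2 / (4 * n_per_group)
            q = cochran_q(d_hat, var_d)
            if q > stats.chi2.ppf(1 - alpha, df=k - 1):
                rejections += 1
        return rejections / reps

    # about 200 subjects in total and an effect-size gap below 1:
    print(q_test_power())   # the estimated power is typically well below 0.8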

    Simulation Experiments in Practice: Statistical Design and Regression Analysis

    Get PDF
    In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. Statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic theory assumes a single simulation response that is normally and independently distributed with a constant variance; moreover, the regression (meta)model of the simulation model’s I/O behaviour is assumed to have residuals with zero means. This article addresses the following questions: (i) How realistic are these assumptions, in practice? (ii) How can these assumptions be tested? (iii) If assumptions are violated, can the simulation's I/O data be transformed such that the assumptions do hold? (iv) If not, which alternative statistical methods can then be applied? Keywords: metamodels; experimental designs; generalized least squares; multivariate analysis; normality; jackknife; bootstrap; heteroscedasticity; common random numbers; validation
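    A minimal sketch of the DOE-plus-regression approach the abstract advocates: a 2^2 factorial design with replications, a first-order regression metamodel of the simulation's I/O behaviour, and a look at the residuals. The simulation model, factor effects, and replication count are invented for illustration, not taken from the article.

    # Hypothetical sketch: factorial design + least-squares regression metamodel.
    import itertools
    import numpy as np

    rng = np.random.default_rng(1)

    def simulate(x1, x2):
        """Stand-in for an expensive simulation run (assumption: linear response
        with an interaction, and noise whose variance depends on the inputs)."""
        noise_sd = 0.5 + 0.5 * (x1 > 0)          # deliberate variance heterogeneity
        return 10 + 3 * x1 - 2 * x2 + 1.5 * x1 * x2 + rng.normal(0, noise_sd)

    # 2^2 design in coded units (-1, +1), m replications per design point
    design = np.array(list(itertools.product([-1, 1], repeat=2)), dtype=float)
    m = 5
    X_rows, y = [], []
    for x1, x2 in design:
        for _ in range(m):
            X_rows.append([1.0, x1, x2, x1 * x2])   # intercept, main effects, interaction
            y.append(simulate(x1, x2))
    X, y = np.array(X_rows), np.array(y)

    # ordinary least squares estimate of the metamodel coefficients
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    print("coefficients:", np.round(beta, 2))
    print("residual variance per design point:",
          np.round(residuals.reshape(len(design), m).var(axis=1, ddof=1), 2))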

    Simulation Experiments in Practice: Statistical Design and Regression Analysis

    Get PDF
    In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic DOE and regression analysis assume a single simulation response that is normally and independently distributed with a constant variance; moreover, the regression (meta)model of the simulation model’s I/O behaviour is assumed to have residuals with zero means. This article addresses the following practical questions: (i) How realistic are these assumptions, in practice? (ii) How can these assumptions be tested? (iii) If assumptions are violated, can the simulation's I/O data be transformed such that the assumptions do hold? (iv) If not, which alternative statistical methods can then be applied? Keywords: metamodel; experimental design; jackknife; bootstrap; common random numbers; validation
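    As a complement to questions (ii) and (iii) above, a minimal sketch of how the normality and constant-variance assumptions might be tested on replicated simulation outputs, and of one candidate transformation. The synthetic data, the number of design points, and the choice of Shapiro-Wilk and Levene tests are assumptions for illustration, not taken from the article.

    # Hypothetical sketch: testing classic assumptions on replicated outputs.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)

    # replicated outputs at three design points (columns), 10 replications each
    outputs = np.column_stack([
        rng.normal(10, 1.0, 10),      # design point 1
        rng.normal(12, 1.0, 10),      # design point 2
        rng.lognormal(2.5, 0.4, 10),  # design point 3: skewed, larger variance
    ])

    # (ii) test normality per design point (Shapiro-Wilk)
    for j in range(outputs.shape[1]):
        w, p = stats.shapiro(outputs[:, j])
        print(f"design point {j + 1}: Shapiro-Wilk p = {p:.3f}")

    # (ii) test constant variance across design points (Levene's test)
    stat, p = stats.levene(*outputs.T)
    print(f"Levene p = {p:.3f}  (a small p suggests heteroscedasticity)")

    # (iii) a variance-stabilizing transformation to try if assumptions fail
    logged = np.log(outputs)
    print("variances after log transform:", np.round(logged.var(axis=0, ddof=1), 3))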

    Kriging Metamodeling in Simulation: A Review

    Get PDF
    This article reviews Kriging (also called spatial correlation modeling). It presents the basic Kriging assumptions and formulas contrasting Kriging and classic linear regression metamodels. Furthermore, it extends Kriging to random simulation, and discusses bootstrapping to estimate the variance of the Kriging predictor. Besides classic one-shot statistical designs such as Latin Hypercube Sampling, it reviews sequentialized and customized designs. It ends with topics for future research. Keywords: Kriging; Metamodel; Response Surface; Interpolation; Design
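    A minimal numerical sketch of the basic Kriging predictor contrasted in the review: an ordinary Kriging interpolator with a Gaussian correlation function, written directly in NumPy. The test function, design points, and correlation parameter theta are illustrative assumptions, not values from the article.

    # Hypothetical sketch: ordinary Kriging interpolation of deterministic output.
    import numpy as np

    def corr(a, b, theta=10.0):
        """Gaussian (squared-exponential) correlation between two 1-D input arrays."""
        return np.exp(-theta * (a[:, None] - b[None, :]) ** 2)

    def kriging_predict(x_new, x_design, y_design, theta=10.0):
        """Ordinary Kriging predictor: BLUP with a constant trend term."""
        R = corr(x_design, x_design, theta) + 1e-10 * np.eye(len(x_design))
        r = corr(x_new, x_design, theta)
        ones = np.ones(len(x_design))
        R_inv_y = np.linalg.solve(R, y_design)
        R_inv_1 = np.linalg.solve(R, ones)
        mu = ones @ R_inv_y / (ones @ R_inv_1)   # generalized least squares mean
        return mu + r @ np.linalg.solve(R, y_design - mu * ones)

    # toy deterministic simulation: observe sin(2*pi*x) at five design points
    x_design = np.linspace(0.0, 1.0, 5)
    y_design = np.sin(2 * np.pi * x_design)
    x_new = np.linspace(0.0, 1.0, 11)
    print(np.round(kriging_predict(x_new, x_design, y_design), 3))
    # Kriging interpolates: predictions at the design points reproduce y_design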

    Challenges of Big Data Analysis

    Full text link
    Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This article gives an overview of the salient features of Big Data and how these features drive the change of paradigm in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in high-confidence sets and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and, consequently, wrong scientific conclusions.
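    A minimal, entirely synthetic sketch of the spurious-correlation phenomenon mentioned above: with a fixed, modest sample size, the maximum absolute sample correlation between a response and completely independent predictors grows with the number of predictors. The sample size of 50 and the dimensions tried are assumptions for illustration.

    # Hypothetical sketch: spurious correlation under high dimensionality.
    import numpy as np

    rng = np.random.default_rng(3)
    n = 50                                    # sample size, kept small on purpose

    for p in (10, 1_000, 100_000):
        X = rng.standard_normal((n, p))       # predictors, independent of y by construction
        y = rng.standard_normal(n)
        # sample correlation of each column of X with y
        Xc = (X - X.mean(axis=0)) / X.std(axis=0)
        yc = (y - y.mean()) / y.std()
        corr = Xc.T @ yc / n
        print(f"p = {p:>7}: max |corr| = {np.abs(corr).max():.2f}")
    # the maximum correlation becomes large even though no predictor is related to y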

    White Noise Assumptions Revisited: Regression Models and Statistical Designs for Simulation Practice

    Get PDF
    Classic linear regression models and their concomitant statistical designs assume a univariate response and white noise. By definition, white noise is normally, independently, and identically distributed with zero mean. This survey tries to answer the following questions: (i) How realistic are these classic assumptions in simulation practice? (ii) How can these assumptions be tested? (iii) If assumptions are violated, can the simulation's I/O data be transformed such that the assumptions hold? (iv) If not, which alternative statistical methods can then be applied? Keywords: metamodels; experimental designs; generalized least squares; multivariate analysis; normality; jackknife; bootstrap; heteroscedasticity; common random numbers; validation
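    A brief sketch of the jackknife, one of the alternative methods listed in the keywords, applied to a ratio statistic computed from macro-replications; the queueing-style data and the statistic itself are invented for illustration and are not from the survey.

    # Hypothetical sketch: jackknife standard error of a ratio of means.
    import numpy as np

    rng = np.random.default_rng(4)
    m = 25                                     # number of macro-replications
    waiting = rng.exponential(4.0, m)          # e.g. average waiting time per run
    served = rng.poisson(30, m).astype(float)  # e.g. customers served per run

    def ratio(w, s):
        """Statistic of interest: waiting time per served customer."""
        return w.mean() / s.mean()

    theta_hat = ratio(waiting, served)
    # leave-one-out (jackknife) replicates of the statistic
    theta_i = np.array([ratio(np.delete(waiting, i), np.delete(served, i))
                        for i in range(m)])
    pseudo = m * theta_hat - (m - 1) * theta_i            # jackknife pseudovalues
    jack_est = pseudo.mean()
    jack_se = pseudo.std(ddof=1) / np.sqrt(m)
    print(f"point estimate {theta_hat:.3f}, jackknife estimate {jack_est:.3f}, "
          f"jackknife standard error {jack_se:.3f}")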
    • …