Nonparametric Bayesian multiple testing for longitudinal performance stratification
This paper describes a framework for flexible multiple hypothesis testing of
autoregressive time series. The modeling approach is Bayesian, though a blend
of frequentist and Bayesian reasoning is used to evaluate procedures.
Nonparametric characterizations of both the null and alternative hypotheses
will be shown to be the key robustification step necessary to ensure reasonable
Type-I error performance. The methodology is applied to part of a large
database containing up to 50 years of corporate performance statistics on
24,157 publicly traded American companies, where the primary goal of the
analysis is to flag companies whose historical performance is significantly
different from that expected due to chance.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS252 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
A comparison of the Benjamini-Hochberg procedure with some Bayesian rules for multiple testing
In the spirit of modeling inference for microarrays as multiple testing for
sparse mixtures, we present a similar approach to a simplified version of
quantitative trait loci (QTL) mapping. Unlike in the case of microarrays, where the
number of tests usually reaches tens of thousands, the number of tests
performed in scans for QTL usually does not exceed several hundred. However,
in typical cases, the sparsity of significant alternatives for QTL mapping
is in the same range as for microarrays. For methodological interest, as well
as some related applications, we also consider non-sparse mixtures. Using
simulations as well as theoretical observations we study false discovery rate
(FDR), power and misclassification probability for the Benjamini-Hochberg (BH)
procedure and its modifications, as well as for various parametric and
nonparametric Bayes and Parametric Empirical Bayes procedures. Our results
confirm the observation of Genovese and Wasserman (2002) that for small p the
misclassification error of BH is close to optimal in the sense of attaining the
Bayes oracle. This property is shared by some of the considered Bayes testing
rules, which in general perform better than BH for large or moderate p's.
Comment: Published at http://dx.doi.org/10.1214/193940307000000158 in the IMS
Collections (http://www.imstat.org/publications/imscollections.htm) by the
Institute of Mathematical Statistics (http://www.imstat.org
On nonparametric estimation of a mixing density via the predictive recursion algorithm
Nonparametric estimation of a mixing density based on observations from the
corresponding mixture is a challenging statistical problem. This paper surveys
the literature on a fast, recursive estimator based on the predictive recursion
algorithm. After introducing the algorithm and giving a few examples, I
summarize the available asymptotic convergence theory, describe an important
semiparametric extension, and highlight two interesting applications. I
conclude with a discussion of several recent developments in this area and some
open problems.
Comment: 22 pages, 5 figures. Comments welcome at
https://www.researchers.one/article/2018-12-
Evolution of statistical analysis in empirical software engineering research: Current state and steps forward
Software engineering research is evolving and papers are increasingly based
on empirical data from a multitude of sources, using statistical tests to
determine if and to what degree empirical evidence supports their hypotheses.
To investigate the practices and trends of statistical analysis in empirical
software engineering (ESE), this paper presents a review of a large pool of
papers from top-ranked software engineering journals. First, we manually
reviewed 161 papers; in the second phase of our method, we conducted a more
extensive semi-automatic classification of 5,196 papers spanning the years
2001--2015. Results from both review steps were used to: i) identify and
analyze the predominant practices in ESE (e.g., using t-test or ANOVA), as well
as relevant trends in usage of specific statistical methods (e.g.,
nonparametric tests and effect size measures) and, ii) develop a conceptual
model for a statistical analysis workflow with suggestions on how to apply
different statistical methods as well as guidelines to avoid pitfalls. Lastly,
we confirm existing claims that current ESE practices lack a standard to report
practical significance of results. We illustrate how practical significance can
be discussed in terms of both the statistical analysis and in the
practitioner's context.
Comment: journal submission, 34 pages, 8 figures
Wavelet Estimators in Nonparametric Regression: A Comparative Simulation Study
Wavelet analysis has been found to be a powerful tool for the nonparametric estimation of spatially-variable objects. We discuss in detail wavelet methods in nonparametric regression, where the data are modelled as observations of a signal contaminated with additive Gaussian noise, and provide an extensive review of the vast literature of wavelet shrinkage and wavelet thresholding estimators developed to denoise such data. These estimators arise from a wide range of classical and empirical Bayes methods treating either individual or blocks of wavelet coefficients. We compare various estimators in an extensive simulation study on a variety of sample sizes, test functions, signal-to-noise ratios and wavelet filters. Because there is no single criterion that can adequately summarise the behaviour of an estimator, we use various criteria to measure performance in finite sample situations. Insight into the performance of these estimators is obtained from graphical outputs and numerical tables. In order to provide some hints of how these estimators should be used to analyse real data sets, a detailed practical step-by-step illustration of a wavelet denoising analysis on electrical consumption is provided. Matlab codes are provided so that all figures and tables in this paper can be reproduced.