A Hierarchical Spatio-Temporal Statistical Model Motivated by Glaciology
In this paper, we extend and analyze a Bayesian hierarchical spatio-temporal
model for physical systems. A novelty is to model the discrepancy between the
output of a computer simulator for a physical process and the actual process
values with a multivariate random walk. For computational efficiency, linear
algebra for bandwidth-limited (banded) matrices is utilized, and first-order emulator
inference allows for the fast emulation of a numerical partial differential
equation (PDE) solver. A test scenario from a physical system motivated by
glaciology is used to examine the speed and accuracy of the computational
methods used, in addition to the viability of modeling assumptions. We conclude
by discussing how the model and associated methodology can be applied in other
physical contexts besides glaciology.
Comment: Revision accepted for publication by the Journal of Agricultural, Biological, and Environmental Statistics.
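As a rough illustration of why banded linear algebra pays off here: the precision matrix of a Gaussian random-walk discrepancy is tridiagonal, so a posterior solve costs time linear in the number of time steps. The sketch below is a hypothetical toy (the variances, right-hand side, and dimensions are made up), not the paper's glaciology model.

```python
# A toy banded posterior solve (hypothetical values; not the paper's model).
import numpy as np
from scipy.linalg import solveh_banded

T = 200          # time steps
sigma2 = 0.5     # random-walk innovation variance (assumed)
nu2 = 0.1        # observation-noise variance (assumed)

# Prior precision of a Gaussian random walk is tridiagonal; adding the
# observation precision keeps it banded, so the solve is O(T), not O(T^3).
diag = np.full(T, 2.0 / sigma2)
diag[0] = diag[-1] = 1.0 / sigma2
diag += 1.0 / nu2
off = np.full(T - 1, -1.0 / sigma2)

ab = np.zeros((2, T))    # upper diagonal-ordered form for solveh_banded
ab[0, 1:] = off
ab[1, :] = diag

rng = np.random.default_rng(0)
rhs = rng.normal(size=T) / nu2           # stand-in for (data - simulator)/nu2
posterior_mean = solveh_banded(ab, rhs)  # banded Cholesky solve, linear in T
```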
Ratings and rankings: Voodoo or Science?
Composite indicators aggregate a set of variables using weights which are
understood to reflect the variables' importance in the index. In this paper we
propose to measure the importance of a given variable within existing composite
indicators via Karl Pearson's "correlation ratio"; we call this measure the "main
effect". Because socio-economic variables are heteroskedastic and correlated,
(relative) nominal weights are hardly ever found to match (relative) main
effects; we propose to summarize their discrepancy with a divergence measure.
We further discuss to what extent the mapping from nominal weights to main
effects can be inverted. This analysis is applied to five composite indicators,
including the Human Development Index and two popular league tables of
university performance. It is found that in many cases the declared importance
of single indicators and their main effect are very different, and that the
data correlation structure often prevents developers from obtaining the stated
importance, even when the nominal weights are varied over the set of nonnegative
numbers with unit sum.
Comment: 28 pages, 7 figures.
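For concreteness, the "main effect" is Pearson's correlation ratio eta^2 = Var(E[Y|X_i]) / Var(Y), which can be estimated by binning. A minimal sketch with illustrative names and simulated data (not the paper's five indicators):

```python
# Estimating the "main effect" (correlation ratio eta^2) by binning.
# Data and names are illustrative, not from the paper.
import numpy as np

def main_effect(x, y, n_bins=20):
    """Estimate Var(E[Y|X]) / Var(Y) with quantile bins on x."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, n_bins - 1)
    means = np.array([y[idx == b].mean() for b in range(n_bins)])
    weights = np.array([(idx == b).mean() for b in range(n_bins)])
    return weights @ (means - y.mean()) ** 2 / y.var()

rng = np.random.default_rng(0)
n = 10_000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # correlated with x1
index = 0.5 * x1 + 0.5 * x2                # equal nominal weights
print(main_effect(x1, index), main_effect(x2, index))  # both near 0.9
```

In this toy example, both variables reach a main effect near 0.9 despite equal nominal weights, which is the kind of mismatch between declared and effective importance the abstract describes.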
FastMMD: Ensemble of Circular Discrepancy for Efficient Two-Sample Test
The maximum mean discrepancy (MMD) is a recently proposed test statistic for
the two-sample test. Its quadratic time complexity, however, greatly limits its
applicability to large-scale problems. To accelerate the MMD calculation, in
this study we propose an efficient method called FastMMD. The core idea of
FastMMD is to equivalently transform the MMD with shift-invariant kernels into
the amplitude expectation of a linear combination of sinusoid components based
on Bochner's theorem and the Fourier transform (Rahimi & Recht, 2007). Taking
advantage of sampling from this Fourier representation, FastMMD decreases the time
complexity of the MMD calculation from O(N^2 d) to O(LNd), where N and d
are the size and dimension of the sample set, respectively, and L is the
number of basis functions for approximating kernels, which determines the
approximation accuracy. For kernels that are spherically invariant, the
computation can be further accelerated to O(LN log d) by using the Fastfood
technique (Le et al., 2013). We also prove the uniform convergence of our
method for both unbiased and biased estimates. We further provide a geometric
interpretation of our method, namely an ensemble of circular discrepancies,
which offers insight into MMD and may inspire further metrics for assessing the
two-sample test. Experimental results substantiate that FastMMD matches the
accuracy of exact MMD while offering faster computation and lower variance than
existing MMD approximation methods.
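A minimal sketch of the underlying Bochner/random-feature idea, assuming a Gaussian kernel; this is the generic random Fourier feature construction of Rahimi & Recht, not the authors' exact FastMMD algorithm:

```python
# Random-Fourier-feature approximation of squared MMD for a Gaussian
# kernel; generic Rahimi-Recht features, not the authors' exact code.
import numpy as np

def rff_mmd2(X, Y, gamma=1.0, L=256, seed=0):
    """Approximate MMD^2 in O(L*N*d) time."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # For k(x,y) = exp(-gamma*||x-y||^2), Bochner's theorem gives
    # Gaussian frequencies with standard deviation sqrt(2*gamma).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, L))
    b = rng.uniform(0.0, 2.0 * np.pi, size=L)
    phi = lambda Z: np.sqrt(2.0 / L) * np.cos(Z @ W + b)
    diff = phi(X).mean(axis=0) - phi(Y).mean(axis=0)
    return diff @ diff   # squared distance between feature means

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(2000, 5))
Y = rng.normal(0.5, 1.0, size=(2000, 5))   # mean-shifted sample
print(rff_mmd2(X, Y))                      # clearly above zero
```

The feature map costs O(LNd) in total, and the squared MMD estimate reduces to the squared distance between the two feature means, which is where the claimed speed-up comes from.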
A multi-level algorithm for the solution of moment problems
We study numerical methods for the solution of general linear moment
problems, where the solution belongs to a family of nested subspaces of a
Hilbert space. Multi-level algorithms, based on the conjugate gradient method
and the Landweber-Richardson method, are proposed that determine the "optimal"
reconstruction level a posteriori from quantities that arise during the
numerical calculations. As an important example we discuss the reconstruction
of band-limited signals from irregularly spaced noisy samples, when the actual
bandwidth of the signal is not available. Numerical examples show the
usefulness of the proposed algorithms.
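As a sketch of one ingredient, with the reconstruction level fixed: Landweber iteration with a discrepancy-principle stopping rule for band-limited recovery from irregular noisy samples. The operator, noise level, and constants below are illustrative, not the paper's multi-level scheme:

```python
# Landweber iteration with a discrepancy-principle stop for a fixed
# bandwidth level K; the setup and constants are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
M, K = 120, 9                          # samples, number of Fourier modes
t = np.sort(rng.uniform(0.0, 1.0, M))  # irregular sample points
freqs = np.arange(-(K // 2), K // 2 + 1)
A = np.exp(2j * np.pi * np.outer(t, freqs))   # sampling operator

c_true = rng.normal(size=K) + 1j * rng.normal(size=K)
noise = 0.05 * rng.normal(size=M)
b = A @ c_true + noise

tau = 1.0 / np.linalg.norm(A, 2) ** 2  # step size below 2/||A||^2
delta = np.linalg.norm(noise)          # noise level for the stop rule
c = np.zeros(K, dtype=complex)
for it in range(5000):
    r = b - A @ c
    if np.linalg.norm(r) <= 1.5 * delta:   # discrepancy principle
        break
    c = c + tau * A.conj().T @ r
print(it, np.linalg.norm(c - c_true))
```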
A Linear-Time Kernel Goodness-of-Fit Test
We propose a novel adaptive test of goodness-of-fit, with computational cost
linear in the number of samples. We learn the test features that best indicate
the differences between observed samples and a reference model, by minimizing
the false negative rate. These features are constructed via Stein's method,
meaning that it is not necessary to compute the normalising constant of the
model. We analyse the asymptotic Bahadur efficiency of the new test, and prove
that under a mean-shift alternative, our test always has greater relative
efficiency than a previous linear-time kernel test, regardless of the choice of
parameters for that test. In experiments, the performance of our method exceeds
that of the earlier linear-time test, and matches or exceeds the power of a
quadratic-time kernel test. In high dimensions and where model structure may be
exploited, our goodness-of-fit test performs far better than a quadratic-time
two-sample test based on the Maximum Mean Discrepancy, with samples drawn from
the model.
Comment: Accepted to NIPS 2017.
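A one-dimensional sketch of the Stein-witness idea mentioned above: the witness uses only the gradient of log p, so the model's normalising constant never appears. The kernel, test locations, and names are illustrative; this is not the paper's full adaptive test:

```python
# 1-D Stein witness for goodness-of-fit: only grad log p is needed,
# so the normalising constant never appears. Illustrative, not the
# paper's full adaptive test.
import numpy as np

def stein_witness(x, v, grad_log_p, sigma2=1.0):
    """E_x[grad_log_p(x) k(x,v) + d/dx k(x,v)], Gaussian kernel k."""
    diff = x[:, None] - v[None, :]
    k = np.exp(-diff ** 2 / (2.0 * sigma2))
    dk = -diff / sigma2 * k
    return (grad_log_p(x)[:, None] * k + dk).mean(axis=0)

grad_log_p = lambda x: -x                # model p = N(0, 1)
rng = np.random.default_rng(0)
x = rng.normal(0.3, 1.0, size=5000)      # mean-shift alternative
v = np.array([-1.0, 0.0, 1.0])           # test locations (the features)
w = stein_witness(x, v, grad_log_p)
print(w, (w ** 2).mean())                # near zero only under the model
```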
A model of the human observer and decision maker
The decision process is described in terms of classical sequential decision theory, by testing the hypothesis that an abnormal condition has occurred by means of a generalized likelihood ratio test. For this, a sufficient statistic is provided by the innovation sequence, which is the output of the perception and information-processing submodel of the human observer. On the basis of only two model parameters, the model predicts the decision speed/accuracy trade-off and various attentional characteristics. A preliminary test of the model on single-variable failure detection tasks resulted in a very good fit to the experimental data. In a formal validation program, a variety of multivariable failure detection tasks was investigated and the predictive capability of the model was demonstrated.
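A minimal sketch of a generalized likelihood ratio detector applied to a whitened innovation sequence, for a mean jump at an unknown onset time. This is an illustrative reduction; the observer model's two parameters and attention allocation are not reproduced:

```python
# GLR detector on a whitened innovation sequence: mean jump of unknown
# size at an unknown onset. Threshold and jump size are illustrative.
import numpy as np

def glr_alarm(eps, sigma=1.0, threshold=10.0):
    """First time the GLR statistic for a mean shift crosses threshold."""
    csum = np.concatenate([[0.0], np.cumsum(eps)])
    for t in range(1, len(eps) + 1):
        k = np.arange(t)                      # candidate onset times
        seg = csum[t] - csum[k]               # sum of eps[k:t]
        glr = np.max(seg ** 2 / (2.0 * sigma ** 2 * (t - k)))
        if glr > threshold:
            return t
    return None

rng = np.random.default_rng(0)
eps = rng.normal(size=300)     # innovations, nominal behaviour
eps[200:] += 0.8               # abnormal condition begins at t = 200
print(glr_alarm(eps))          # alarm shortly after the onset
```

Raising the threshold trades detection speed for fewer false alarms, which is the speed/accuracy trade-off the model is built around.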