Maximum Fidelity
The most fundamental problem in statistics is the inference of an unknown
probability distribution from a finite number of samples. For a specific
observed data set, answers to the following questions would be desirable: (1)
Estimation: Which candidate distribution provides the best fit to the observed
data?, (2) Goodness-of-fit: How concordant is this distribution with the
observed data?, and (3) Uncertainty: How concordant are other candidate
distributions with the observed data? A simple unified approach for univariate
data, called "maximum fidelity", is presented that addresses these
traditionally distinct statistical notions. Maximum fidelity is a strict frequentist
approach that is fundamentally based on model concordance with the observed
data. The fidelity statistic is a general information measure based on the
coordinate-independent cumulative distribution and critical yet previously
neglected symmetry considerations. An approximation for the null distribution
of the fidelity allows its direct conversion to absolute model concordance (p
value). Fidelity maximization allows identification of the most concordant
model distribution, generating a method for parameter estimation, with
neighboring, less concordant distributions providing the "uncertainty" in this
estimate. Maximum fidelity provides an optimal approach for parameter
estimation (superior to maximum likelihood) and a generally optimal approach
for goodness-of-fit assessment of arbitrary models applied to univariate data.
Extensions to binary data, binned data, multidimensional data, and classical
parametric and nonparametric statistical tests are described. Maximum fidelity
provides a philosophically consistent, robust, and seemingly optimal foundation
for statistical inference. All findings are presented in an elementary way to
be immediately accessible to all researchers utilizing statistical analysis.
Comment: 66 pages, 32 figures, 7 tables, submitte
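The three questions in this abstract (estimation, goodness-of-fit, uncertainty) can be illustrated with any CDF-based concordance measure. The abstract does not give the fidelity statistic itself, so the minimal sketch below swaps in the classical Kolmogorov-Smirnov distance as a stand-in. The function names, the normal model with known scale, the grid of candidate locations, and the simulated data are all assumptions made for illustration, not the paper's method.

```python
import math
import random

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_distance(sample, mu, sigma):
    """Largest gap between the empirical CDF and the candidate model CDF.
    Stand-in concordance measure (Kolmogorov-Smirnov), NOT the fidelity."""
    xs = sorted(sample)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        worst = max(worst, abs(f - i / n), abs((i + 1) / n - f))
    return worst

def fit_location(sample, sigma, grid):
    """Estimation: the candidate location most concordant with the data."""
    return min(grid, key=lambda mu: ks_distance(sample, mu, sigma))

# Simulated data (assumption): 500 draws from N(2, 1).
rng = random.Random(0)
sample = [rng.gauss(2.0, 1.0) for _ in range(500)]
grid = [i / 100.0 for i in range(0, 401)]  # candidate locations 0.00 .. 4.00

mu_hat = fit_location(sample, 1.0, grid)   # (1) estimation
d_best = ks_distance(sample, mu_hat, 1.0)  # (2) goodness-of-fit of the best candidate
d_far = ks_distance(sample, 0.0, 1.0)      # a clearly discordant candidate

# (3) Uncertainty: neighbouring candidates that are not rejected at roughly the
# 5% level, using the large-sample KS critical value 1.36 / sqrt(n).
critical = 1.36 / math.sqrt(len(sample))
plausible = [mu for mu in grid if ks_distance(sample, mu, 1.0) < critical]
```

Here `mu_hat` answers the estimation question, `d_best` measures how concordant the best candidate is, and `plausible` collects the less concordant neighbours that the data do not rule out. The abstract claims that the fidelity statistic performs better than such classical choices; this sketch only shows the shared workflow.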
Identifying Mixtures of Mixtures Using Bayesian Estimation
The use of a finite mixture of normal distributions in model-based clustering
allows non-Gaussian data clusters to be captured. However, identifying the clusters
from the normal components is challenging and in general either achieved by
imposing constraints on the model or by using post-processing procedures.
Within the Bayesian framework we propose a different approach based on sparse
finite mixtures to achieve identifiability. We specify a hierarchical prior
where the hyperparameters are carefully selected such that they are reflective
of the cluster structure aimed at. In addition, this prior allows the model to
be estimated using standard MCMC sampling methods. In combination with a
post-processing approach which resolves the label switching issue and results
in an identified model, our approach makes it possible to simultaneously (1) determine the
number of clusters, (2) flexibly approximate the cluster distributions in a
semi-parametric way using finite mixtures of normals and (3) identify
cluster-specific parameters and classify observations. The proposed approach is
illustrated in two simulation studies and on benchmark data sets.
Comment: 49 page
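The two-level structure described here, in which each cluster is itself a finite mixture of normals, can be sketched on the generative side. The example below only simulates from such a model to show how normal subcomponents combine into a non-Gaussian cluster. It does not implement the sparse hierarchical prior, the MCMC sampler, or the label-switching post-processing, and every weight, mean, and standard deviation in `CLUSTERS` is a hypothetical value chosen for illustration.

```python
import random

# Hypothetical specification: each cluster has a weight and a list of
# normal subcomponents given as (subweight, mean, standard deviation).
CLUSTERS = [
    {"weight": 0.6, "subs": [(0.7, -2.0, 0.5), (0.3, -0.5, 1.0)]},
    {"weight": 0.4, "subs": [(0.5, 4.0, 0.4), (0.5, 5.5, 0.8)]},
]

def draw(rng):
    """Draw one observation: pick a cluster, then a subcomponent, then a value."""
    k = rng.choices(range(len(CLUSTERS)),
                    weights=[c["weight"] for c in CLUSTERS])[0]
    subs = CLUSTERS[k]["subs"]
    _, mu, sd = rng.choices(subs, weights=[s[0] for s in subs])[0]
    return k, rng.gauss(mu, sd)

def skewness(xs):
    """Sample skewness; it is close to zero for a single normal component."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / n
    return (sum((x - m) ** 3 for x in xs) / n) / var ** 1.5

rng = random.Random(0)
draws = [draw(rng) for _ in range(10000)]
cluster0 = [x for k, x in draws if k == 0]
share0 = len(cluster0) / len(draws)   # empirical weight of cluster 0
skew0 = skewness(cluster0)            # clearly positive: the cluster is skewed
```

The observed values carry no subcomponent labels, which is why recovering the cluster level from the fitted normal components is the hard part that the abstract addresses.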
Global axis shape of magnetic clouds deduced from the distribution of their local axis orientation
Coronal mass ejections (CMEs) are routinely tracked with imagers in
interplanetary space, while magnetic cloud (MC) properties are measured
locally by spacecraft. However, neither imager nor in-situ data provide a
direct estimate of the global flux rope properties. The main aim of this
study is to constrain the global shape of the flux rope axis from local
measurements, and to compare the results from in-situ data with imager
observations. We perform a statistical analysis of the set of MCs observed by
the WIND spacecraft over 15 years in the vicinity of Earth. With the hypothesis of
having a sample of MCs with a uniform distribution of spacecraft crossing along
their axis, we show that a mean axis shape can be derived from the distribution
of the axis orientation. In addition, while heliospheric imagers do not
typically observe MCs but only their sheath region, we analyze one event where
the flux-rope axis can be estimated from the STEREO imagers. From the analysis
of a set of theoretical models, we show that the distribution of the local axis
orientation is strongly affected by the global axis shape. Next, we derive the
mean axis shape from the integration of the observed orientation distribution.
This shape is robust as it is mostly determined from the global shape of the
distribution. Moreover, we find no dependence on the flux-rope inclination with
respect to the ecliptic. Finally, the derived shape is fully consistent with the one
derived from heliospheric imager observations of the June 2008 event. We have
derived a mean shape of the MC axis that depends on only one free parameter, the
angular separation of the legs (as viewed from the Sun). This mean shape can be
used in various contexts such as the study of high energy particles or space
weather forecasting.
Comment: 13 pages, 12 figure
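The statement that the distribution of local axis orientations depends strongly on the global axis shape can be illustrated with a small forward model. The sketch below is not the paper's model: the shape family `rho(phi) = cos(pi*phi/(2*phi_max))**n`, the sign convention for the angle, and the simplification that crossings are uniform in the angular coordinate `phi` are all assumptions made here. The only fact relied on is the geometry of a polar curve, for which the tangent makes an angle with the azimuthal direction whose tangent equals `d(ln rho)/d(phi)`.

```python
import math
import random

def local_angle(phi, n, phi_max):
    """Angle (radians) between the curve tangent and the azimuthal direction
    for the hypothetical axis shape rho(phi) = cos(pi*phi/(2*phi_max))**n.
    The minus sign is an assumed convention (positive angle on the phi > 0 leg)."""
    k = math.pi / (2.0 * phi_max)
    dlnrho_dphi = -n * k * math.tan(k * phi)
    return math.atan(-dlnrho_dphi)

def sample_angles(n, phi_max, count, rng):
    """Local angles at crossing positions drawn uniformly in phi (assumption)."""
    return [local_angle(rng.uniform(-phi_max, phi_max), n, phi_max)
            for _ in range(count)]

def mean_abs_deg(angles):
    """Mean absolute local angle, in degrees."""
    return math.degrees(sum(abs(a) for a in angles) / len(angles))

rng = random.Random(1)
phi_max = math.radians(30.0)  # hypothetical half angular separation of the legs

blunt = sample_angles(0.5, phi_max, 20000, rng)    # flatter nose
pointed = sample_angles(2.0, phi_max, 20000, rng)  # more pointed nose

blunt_mean = mean_abs_deg(blunt)
pointed_mean = mean_abs_deg(pointed)
```

In this toy family the more pointed shape yields systematically larger local angles, so the two shapes give visibly different orientation distributions. The abstract describes the inverse step, recovering a mean shape by integrating the observed distribution, which this forward sketch does not attempt.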