    Maximum Fidelity

    The most fundamental problem in statistics is the inference of an unknown probability distribution from a finite number of samples. For a specific observed data set, answers to the following questions would be desirable: (1) Estimation: which candidate distribution provides the best fit to the observed data? (2) Goodness-of-fit: how concordant is this distribution with the observed data? (3) Uncertainty: how concordant are other candidate distributions with the observed data? A simple unified approach for univariate data that addresses these traditionally distinct statistical notions, called "maximum fidelity", is presented. Maximum fidelity is a strict frequentist approach that is fundamentally based on model concordance with the observed data. The fidelity statistic is a general information measure based on the coordinate-independent cumulative distribution and on critical yet previously neglected symmetry considerations. An approximation for the null distribution of the fidelity allows its direct conversion to absolute model concordance (p value). Fidelity maximization identifies the most concordant model distribution, which yields a method for parameter estimation, with neighboring, less concordant distributions providing the "uncertainty" in this estimate. Maximum fidelity provides an optimal approach for parameter estimation (superior to maximum likelihood) and a generally optimal approach for goodness-of-fit assessment of arbitrary models applied to univariate data. Extensions to binary data, binned data, multidimensional data, and classical parametric and nonparametric statistical tests are described. Maximum fidelity provides a philosophically consistent, robust, and seemingly optimal foundation for statistical inference. All findings are presented in an elementary way so as to be immediately accessible to all researchers who use statistical analysis. Comment: 66 pages, 32 figures, 7 tables, submitte

    Identifying Mixtures of Mixtures Using Bayesian Estimation

    The use of a finite mixture of normal distributions in model-based clustering makes it possible to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and is generally achieved either by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework, we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior whose hyperparameters are carefully selected to reflect the intended cluster structure. In addition, this prior allows the model to be estimated using standard MCMC sampling methods. In combination with a post-processing approach that resolves the label switching issue and results in an identified model, our approach makes it possible to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semi-parametric way using finite mixtures of normals, and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark data sets. Comment: 49 page

    Global axis shape of magnetic clouds deduced from the distribution of their local axis orientation

    Coronal mass ejections (CMEs) are routinely tracked with imagers in interplanetary space, while the properties of magnetic clouds (MCs) are measured locally by spacecraft. However, neither imager nor in-situ data provide a direct estimate of the global flux-rope properties. The main aim of this study is to constrain the global shape of the flux-rope axis from local measurements, and to compare the results from in-situ data with imager observations. We perform a statistical analysis of the set of MCs observed by the WIND spacecraft over 15 years in the vicinity of Earth. Under the hypothesis that the sample of MCs has a uniform distribution of spacecraft crossings along their axis, we show that a mean axis shape can be derived from the distribution of the axis orientation. In addition, while heliospheric imagers do not typically observe MCs but only their sheath region, we analyze one event where the flux-rope axis can be estimated from the STEREO imagers. From the analysis of a set of theoretical models, we show that the distribution of the local axis orientation is strongly affected by the global axis shape. Next, we derive the mean axis shape by integrating the observed orientation distribution. This shape is robust, as it is mostly determined by the global shape of the distribution. Moreover, we find no dependence on the inclination of the flux rope to the ecliptic. Finally, the derived shape is fully consistent with the one derived from heliospheric imager observations of the June 2008 event. We have derived a mean shape of the MC axis that depends on only one free parameter, the angular separation of the legs (as viewed from the Sun). This mean shape can be used in various contexts, such as the study of high-energy particles or space weather forecasting. Comment: 13 pages, 12 figure