
    Classification of local stellar populations: the improved MEMPHIS algorithm - Part II

    Discontinuities of the local velocity distribution associated with stellar populations are studied using the improved statistical method MEMPHIS (Maximum Entropy of the Mixture Probability from HIerarchical Segregation), which combines a sampling parameter, optimisation of the mixture approach, and maximum partition entropy of the populations composing the stellar sample. The sampling parameter is associated with isolating integrals of the stellar motion and is used to build a hierarchical family of subsamples. An accurate characterisation of the entropy graph is given, in which a local maximum of entropy occurs simultaneously with a local minimum of the χ² error. Working from different sampling parameters, the method is applied to samples from HIPPARCOS and the Geneva-Copenhagen survey (GCS) to obtain kinematic parameters and mixture proportions of the thin disk, thick disk, and halo. The sampling parameter P = |(U, V, W)|, the absolute heliocentric velocity, allows an optimal subsample containing thin and thick disk stars to be built while leaving aside most of the halo population. The sampling parameter P = |W|, the absolute perpendicular velocity, builds an optimal subsample containing a mixture of total disk and halo stars, although it does not allow an optimal segregation of the thin and thick disks. Other sampling parameters, such as P = |(U, W)| or P = |V|, are found to be less population-informative. Comparing both samples, HIPPARCOS provides more accurate estimates for the thick disk and halo, while GCS does for the total disk. In particular, the radial velocity dispersion of the halo fits perfectly the empirical Titius-Bode-like law σ_U = 6.6 (4/3)^(3n+2), which was previously proposed for discrete kinematic components, where the values n = 0, 1, 2, 3 stand for early-type stars, thin disk, thick disk, and halo populations. Population statistics are used to segregate the thin disk, thick disk, and halo, and to obtain a more accurate Bayesian estimation of the population fractions.
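    A minimal sketch of the core idea, assuming synthetic (U, V, W) velocities and a plain two-component Gaussian mixture; the cutoff grid, toy dispersions, and function names are illustrative assumptions, not the authors' implementation (Python):

        # Illustrative sketch: scan a hierarchical family of subsamples defined by the
        # sampling parameter P = |(U, V, W)|, fit a two-component Gaussian mixture to
        # each, and locate the subsample maximizing the partition entropy.
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def partition_entropy(weights):
            """Shannon entropy of the mixture proportions."""
            w = np.clip(np.asarray(weights), 1e-12, None)
            return -np.sum(w * np.log(w))

        def memphis_like_scan(uvw, cutoffs):
            """For each cutoff of P = |(U,V,W)|, fit a mixture and record the entropy."""
            speed = np.linalg.norm(uvw, axis=1)      # sampling parameter per star
            entropies = []
            for c in cutoffs:
                sub = uvw[speed <= c]                # hierarchical subsample
                gm = GaussianMixture(n_components=2, n_init=3).fit(sub)
                entropies.append(partition_entropy(gm.weights_))
            return np.array(entropies)

        # Toy data: a "thin disk" and a "thick disk" component in (U, V, W) [km/s].
        rng = np.random.default_rng(0)
        thin = rng.normal([0, -15, 0], [30, 20, 15], size=(4000, 3))
        thick = rng.normal([0, -50, 0], [60, 45, 40], size=(600, 3))
        uvw = np.vstack([thin, thick])

        cutoffs = np.linspace(50, 250, 21)
        ent = memphis_like_scan(uvw, cutoffs)
        print("entropy-maximizing cutoff:", cutoffs[np.argmax(ent)])

    In the paper, the optimal subsample is the one where the entropy maximum coincides with the χ² minimum; this sketch tracks only the entropy.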

    A New Robust Regression Method Based on Minimization of Geodesic Distances on a Probabilistic Manifold: Application to Power Laws

    In regression analysis for deriving scaling laws that occur in various scientific disciplines, standard regression methods are usually applied, of which ordinary least squares (OLS) is the most popular. However, in many situations the assumptions underlying OLS are not fulfilled, and although several alternative approaches have been proposed, most address only part of the shortcomings of OLS. We here discuss a new and more general regression method, which we call geodesic least squares regression (GLS). The method is based on minimization of the Rao geodesic distance on a probabilistic manifold. For the case of a power law, we demonstrate the robustness of the method on synthetic data in the presence of significant uncertainty in both the data and the regression model. We then show the good performance of the method in an application to a scaling law in magnetic confinement fusion. Comment: Published in Entropy. This is an extended version of our paper at the 34th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering (MaxEnt 2014), 21-26 September 2014, Amboise, France.
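    A minimal sketch of the GLS idea for a power law y = a·x^b, assuming Gaussian observed and modeled distributions per data point and the closed-form Rao geodesic distance between univariate normals; fitting a single model dispersion sigma_mod is a simplification of the paper's propagated model uncertainty (Python):

        # Sketch of geodesic least squares for y = a * x**b. For univariate normals,
        # the Rao geodesic distance has the closed form
        #   d = sqrt(2) * arccosh(1 + ((m1-m2)**2/2 + (s1-s2)**2) / (2*s1*s2)).
        import numpy as np
        from scipy.optimize import minimize

        def rao_distance_normal(m1, s1, m2, s2):
            arg = 1.0 + ((m1 - m2) ** 2 / 2.0 + (s1 - s2) ** 2) / (2.0 * s1 * s2)
            return np.sqrt(2.0) * np.arccosh(arg)

        def gls_objective(params, x, y, sigma_obs):
            a, b, sigma_mod = params
            mu_mod = a * x ** b
            # Compare observed N(y_i, sigma_obs) with modeled N(mu_mod_i, sigma_mod).
            return np.sum(rao_distance_normal(y, sigma_obs, mu_mod, sigma_mod) ** 2)

        rng = np.random.default_rng(1)
        x = np.linspace(1.0, 10.0, 50)
        y = 2.0 * x ** 1.5 + rng.normal(0.0, 2.0, x.size)   # noisy power-law data

        res = minimize(gls_objective, x0=[1.0, 1.0, 1.0], args=(x, y, 2.0),
                       method="L-BFGS-B",
                       bounds=[(1e-3, None), (0.1, 5.0), (1e-3, None)])
        print("estimated a, b:", res.x[0], res.x[1])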

    Quantifying dependencies for sensitivity analysis with multivariate input sample data

    We present a novel method for quantifying dependencies in multivariate datasets, based on estimating the Rényi entropy with minimum spanning trees (MSTs). The length of the MST can be used to order pairs of variables from strongly to weakly dependent, making it a useful tool for sensitivity analysis with dependent input variables. It is well suited to cases where the input distribution is unknown and only a sample of the inputs is available. We introduce an estimator that quantifies dependency via the MST length, and investigate its properties in several numerical examples. To reduce the computational cost of constructing the exact MST for large datasets, we explore methods to compute approximations to the exact MST, and find the multilevel approach introduced recently by Zhong et al. (2015) to be the most accurate. We apply the proposed method to an artificial test case based on the Ishigami function, as well as to a real-world test case involving sediment transport in the North Sea. The results are consistent with prior knowledge and heuristic understanding, as well as with a variance-based analysis using Sobol indices in the case where these indices can be computed.
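    A sketch of the dependency score, assuming standardized variable pairs and the exact O(n²) MST rather than the multilevel approximation studied in the paper; the variable names and noise levels are illustrative (Python):

        # Use the total edge length of the Euclidean MST of a standardized 2-D sample
        # as a dependency score: a shorter MST indicates stronger dependence, in line
        # with the MST-based estimator of the Renyi entropy.
        import numpy as np
        from scipy.spatial.distance import pdist, squareform
        from scipy.sparse.csgraph import minimum_spanning_tree

        def mst_length(x, y):
            """Total edge length of the Euclidean MST of the standardized pair (x, y)."""
            pts = np.column_stack([(x - x.mean()) / x.std(), (y - y.mean()) / y.std()])
            dist = squareform(pdist(pts))      # dense pairwise distance matrix, O(n^2)
            return minimum_spanning_tree(dist).sum()

        rng = np.random.default_rng(2)
        n = 500
        a = rng.normal(size=n)
        strong = a + 0.1 * rng.normal(size=n)  # strongly dependent on a
        weak = a + 2.0 * rng.normal(size=n)    # weakly dependent on a
        indep = rng.normal(size=n)             # independent of a

        # Pairs can be ranked by this score: shorter MST => stronger dependency.
        for name, v in [("strong", strong), ("weak", weak), ("indep", indep)]:
            print(name, mst_length(a, v))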

    Are Slepian-Wolf Rates Necessary for Distributed Parameter Estimation?

    We consider a distributed parameter estimation problem, in which multiple terminals send messages related to their local observations, using limited rates, to a fusion center that forms an estimate of a parameter related to the observations of all terminals. It is well known that if the transmission rates lie in the Slepian-Wolf region, the fusion center can fully recover all observations and hence construct an estimator having the same performance as in the centralized case. One natural question is whether Slepian-Wolf rates are necessary to achieve the same estimation performance as in the centralized case. In this paper, we show that the answer to this question is negative. We establish our result by explicitly constructing an asymptotically minimum variance unbiased estimator (MVUE) that has the same performance as the optimal estimator in the centralized case while requiring information rates lower than those required by the Slepian-Wolf rate region. Comment: Accepted in Allerton 2015.
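    A toy simulation of the problem setup, not the paper's estimator construction: each terminal sends a single bit about its observation, and the fusion center still recovers a consistent (though here not variance-optimal) estimate; the threshold tau and the parameter value are illustrative (Python):

        # Terminals observe X_i = theta + noise and send rate-limited messages to a
        # fusion center; compare a 1-bit scheme against the centralized sample mean,
        # which is the MVUE for the mean of a Gaussian.
        import numpy as np
        from scipy.stats import norm

        rng = np.random.default_rng(3)
        theta = 0.7                        # unknown parameter (hypothetical value)
        n = 100_000                        # number of terminals / observations
        x = theta + rng.normal(0.0, 1.0, n)

        theta_central = x.mean()           # centralized benchmark

        # Each terminal sends one bit, 1{X_i > tau}; the fusion center inverts
        # P(X > tau) = 1 - Phi(tau - theta) to estimate theta.
        tau = 0.0
        p_hat = (x > tau).mean()
        theta_1bit = tau - norm.ppf(1.0 - p_hat)

        print("centralized:", theta_central)
        print("1-bit distributed:", theta_1bit)

    The 1-bit estimator is consistent but has larger variance than the sample mean; the paper's point is that more refined schemes can match centralized performance at rates below the Slepian-Wolf region.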