Classification of local stellar populations: the improved MEMPHIS algorithm - Part II
Discontinuities of the local velocity distribution which are associated with stellar populations
are studied from the improved statistical method MEMPHIS (Maximum Entropy of
the Mixture Probability from HIerarchical Segregation), by combining a sampling parameter,
optimisation of the mixture approach, and maximum partition entropy of populations
composing the stellar sample. The sampling parameter is associated with isolating integrals
of the star motion and it is used to build a hierarchical family of subsamples. An accurate
characterisation of the entropy graph is given where a local maximum of entropy takes place
simultaneously with a local minimum of the χ² error. By working from different sampling parameters,
the method is applied to samples from HIPPARCOS and Geneva-Copenhagen survey
(GCS) to obtain kinematic parameters and mixture proportions of thin disk, thick disk and
halo. The sampling parameter P = |(U, V, W)|, the absolute heliocentric velocity, allows an optimal
subsample containing thin- and thick-disk stars to be built, leaving aside most of the halo
population. The sampling parameter P = |W|, absolute perpendicular velocity, is able to
build an optimal subsample containing a mixture of total disk and halo stars, although it
does not allow an optimal segregation of thin and thick disks. Other sampling parameters
like P = |(U,W)| or P = |V | are found to be less population informative. By comparing
both samples, HIPPARCOS provides more accurate estimates for thick disk and halo, while
GCS does for the total disk. In particular, the radial velocity dispersion of the halo fits
perfectly the empirical Titius-Bode-like law σ_U = 6.6 (4/3)^{3n+2}, which was previously proposed
for discrete kinematic components, where the values n = 0, 1, 2, 3 stand for early-type
stars, thin disk, thick disk, and halo populations. Population statistics are used to segregate
thin disk, thick disk, and halo, and to obtain a more accurate Bayesian estimation of the
population fractions.
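The entropy-maximisation step described above can be illustrated with a minimal sketch (hypothetical code, not the authors' implementation): for each subsample in a hierarchical family indexed by the sampling parameter P, a mixture fit yields population proportions, and the partition entropy H = -Σ p_k ln p_k is evaluated; the optimal subsample is the one maximising H. The proportion values below are invented for illustration only.

```python
import numpy as np

def partition_entropy(fractions):
    """Shannon partition entropy -sum(p * ln p) of mixture proportions."""
    p = np.asarray(fractions, dtype=float)
    p = p[p > 0]                      # convention: 0 * ln 0 = 0
    return float(-np.sum(p * np.log(p)))

# Hypothetical mixture proportions (thin disk, thick disk, halo) for a
# hierarchical family of subsamples indexed by a velocity cut P (km/s);
# purely illustrative numbers, not results from the paper.
family = {
    150.0: [0.97, 0.03, 0.00],
    200.0: [0.90, 0.09, 0.01],
    250.0: [0.82, 0.16, 0.02],
    300.0: [0.85, 0.13, 0.02],
}

# Select the subsample whose mixture maximises the partition entropy.
best_P = max(family, key=lambda P: partition_entropy(family[P]))
print(best_P, partition_entropy(family[best_P]))
```

In the method itself this maximum is required to coincide with a minimum of the fit error, which is what the entropy graph is used to locate.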
A New Robust Regression Method Based on Minimization of Geodesic Distances on a Probabilistic Manifold: Application to Power Laws
In regression analysis for deriving scaling laws that occur in various
scientific disciplines, usually standard regression methods have been applied,
of which ordinary least squares (OLS) is the most popular. In many situations,
the assumptions underlying OLS are not fulfilled, and several other approaches
have been proposed. However, most techniques address only part of the
shortcomings of OLS. We here discuss a new and more general regression method,
which we call geodesic least squares regression (GLS). The method is based on
minimization of the Rao geodesic distance on a probabilistic manifold. For the
case of a power law, we demonstrate the robustness of the method on synthetic
data in the presence of significant uncertainty on both the data and the
regression model. We then show good performance of the method in an application
to a scaling law in magnetic confinement fusion.

Comment: Published in Entropy. This is an extended version of our paper at the
34th International Workshop on Bayesian Inference and Maximum Entropy Methods
in Science and Engineering (MaxEnt 2014), 21-26 September 2014, Amboise,
France.
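The key ingredient of GLS, the Rao geodesic distance, has a closed form for univariate Gaussians, which makes a minimal sketch possible. The following is an illustration under simplifying assumptions (Gaussian noise with a known, fixed standard deviation on both the model and the observations, and a coarse grid search in place of a proper optimiser), not the authors' implementation.

```python
import numpy as np

def rao_distance(mu1, sigma1, mu2, sigma2):
    """Rao geodesic distance between N(mu1, sigma1^2) and N(mu2, sigma2^2),
    using the closed form for the Fisher-Rao metric on the Gaussian manifold."""
    delta = 1.0 + ((mu1 - mu2) ** 2 + 2.0 * (sigma1 - sigma2) ** 2) \
                  / (4.0 * sigma1 * sigma2)
    return np.sqrt(2.0) * np.arccosh(delta)

# Synthetic power-law data y = a * x**b with a = 2, b = 1.5 (noiseless
# here for clarity); the noise level sigma is assumed known and fixed.
x = np.linspace(1.0, 5.0, 20)
y = 2.0 * x ** 1.5
sigma = 0.1

def gls_cost(a, b):
    """Sum of Rao distances between the modelled distribution
    N(a * x**b, sigma^2) and the observed N(y, sigma^2)."""
    return sum(rao_distance(a * xi ** b, sigma, yi, sigma)
               for xi, yi in zip(x, y))

# Coarse grid search; a real implementation would use a numerical optimiser.
grid_a = np.arange(1.0, 3.01, 0.25)
grid_b = np.arange(1.0, 2.01, 0.25)
a_hat, b_hat = min(((a, b) for a in grid_a for b in grid_b),
                   key=lambda ab: gls_cost(*ab))
print(a_hat, b_hat)
```

Minimising a geodesic distance between full distributions, rather than a sum of squared residuals, is what allows the method to account for uncertainty on the regression model itself as well as on the data.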
Quantifying dependencies for sensitivity analysis with multivariate input sample data
We present a novel method for quantifying dependencies in multivariate
datasets, based on estimating the Rényi entropy by minimum spanning trees
(MSTs). The length of the MSTs can be used to order pairs of variables from
strongly to weakly dependent, making it a useful tool for sensitivity analysis
with dependent input variables. It is well-suited for cases where the input
distribution is unknown and only a sample of the inputs is available. We
introduce an estimator to quantify dependency based on the MST length, and
investigate its properties with several numerical examples. To reduce the
computational cost of constructing the exact MST for large datasets, we explore
methods to compute approximations to the exact MST, and find the multilevel
approach introduced recently by Zhong et al. (2015) to be the most accurate. We
apply our proposed method to an artificial testcase based on the Ishigami
function, as well as to a real-world testcase involving sediment transport in
the North Sea. The results are consistent with prior knowledge and heuristic
understanding, as well as with variance-based analysis using Sobol indices in
the case where these indices can be computed.
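The ordering principle behind the estimator can be seen in a small sketch (an exact O(n²) Prim construction, not the paper's estimator nor the multilevel approximation of Zhong et al.): strong dependence concentrates the bivariate sample near a curve, which shortens the Euclidean MST relative to an independent sample.

```python
import numpy as np

def mst_length(points):
    """Total edge length of the Euclidean minimum spanning tree,
    built with Prim's algorithm (O(n^2), fine for small samples)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()   # cheapest known edge from the tree to each node
    total = 0.0
    for _ in range(n - 1):
        best[in_tree] = np.inf
        j = int(np.argmin(best))      # nearest node outside the tree
        total += best[j]
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return total

rng = np.random.default_rng(0)
n = 200
u = rng.uniform(size=n)
independent = np.column_stack([u, rng.uniform(size=n)])
dependent = np.column_stack([u, u + 0.01 * rng.normal(size=n)])

# The dependent pair hugs a curve, so its MST is much shorter; ranking
# variable pairs by MST length is the basis of the sensitivity analysis.
print(mst_length(dependent) < mst_length(independent))
```

The actual estimator converts the MST length into a Rényi entropy estimate; this sketch only demonstrates the monotone relation between dependence strength and tree length that the ordering relies on.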