156 research outputs found

    Empirical geodesic graphs and CAT(k) metrics for data analysis

    A methodology is developed for data analysis based on empirically constructed geodesic metric spaces. For a probability distribution, the length along a path between two points can be defined as the amount of probability mass accumulated along the path. The geodesic, then, is the shortest such path and defines a geodesic metric. Such metrics are transformed in a number of ways to produce parametrised families of geodesic metric spaces, empirical versions of which allow computation of intrinsic means and associated measures of dispersion. These reveal properties of the data, based on geometry, such as those that are difficult to see from the raw Euclidean distances. Examples of application include clustering and classification. For certain parameter ranges, the spaces become CAT(0) spaces and the intrinsic means are unique. In one case, a minimal spanning tree of a graph based on the data becomes CAT(0). In another, a so-called "metric cone" construction allows extension to CAT(k) spaces. It is shown how to empirically tune the parameters of the metrics, making it possible to apply them to a number of real cases. Comment: Statistics and Computing, 201

    Approximation of probability density functions for PDEs with random parameters using truncated series expansions

    The probability density function (PDF) of a random variable associated with the solution of a partial differential equation (PDE) with random parameters is approximated using a truncated series expansion. The random PDE is solved using two stochastic finite element methods: Monte Carlo sampling and the stochastic Galerkin method with global polynomials. The random variable is a functional of the solution of the random PDE, such as the average over the physical domain. The truncated series are obtained by retaining a finite number of terms in the Gram-Charlier or Edgeworth series expansions. These expansions approximate the PDF of a random variable in terms of another PDF, and involve coefficients that are functions of the known cumulants of the random variable. To the best of our knowledge, their use in the framework of PDEs with random parameters has not yet been explored.
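
    As a rough illustration of the kind of expansion described above, the following sketch evaluates a Gram-Charlier type A series around a normal reference density, truncated after the fourth cumulant. The function name `gram_charlier_pdf`, the normal reference and the truncation order are assumptions made for this example; they are not taken from the paper.

```python
import math


def gram_charlier_pdf(x, k1, k2, k3, k4):
    """Gram-Charlier type A approximation, truncated after the fourth cumulant.

    k1..k4 are the first four cumulants (mean, variance, third, fourth).
    The reference density is the normal density with mean k1 and variance k2.
    """
    sd = math.sqrt(k2)
    z = (x - k1) / sd                       # standardised argument
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    he3 = z**3 - 3.0 * z                    # probabilists' Hermite polynomial He_3
    he4 = z**4 - 6.0 * z**2 + 3.0           # probabilists' Hermite polynomial He_4
    skew = k3 / k2**1.5                     # standardised third cumulant
    exkurt = k4 / k2**2                     # standardised fourth cumulant
    correction = 1.0 + skew / 6.0 * he3 + exkurt / 24.0 * he4
    return phi * correction / sd
```

    With `k3 = k4 = 0` the correction factor is 1 and the function returns the normal density itself. A truncated series of this kind is not guaranteed to be nonnegative everywhere, so the output should be checked before being used as a density.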

    "Building" exact confidence nets

    Confidence nets, that is, collections of confidence intervals that fill out the parameter space and whose exact parameter coverage can be computed, are familiar in nonparametric statistics. Here, the distributional assumptions are based on invariance under the action of a finite reflection group. Exact confidence nets are exhibited for a single parameter, based on the root system of the group. The main result is a formula for the generating function of the coverage interval probabilities. The proof makes use of the theory of "buildings" and the Chevalley factorization theorem for the length distribution on Cayley graphs of finite reflection groups. Comment: 20 pages. To appear in Bernoull

    The algebraic method in quadrature for uncertainty quantification

    A general method of quadrature for uncertainty quantification (UQ) is introduced based on the algebraic method in experimental design. This is a method based on the theory of zero-dimensional algebraic varieties. It allows quadrature of polynomials or polynomial approximands for quite general sets of quadrature points, here called “designs.” The method goes some way to explaining when quadrature weights are nonnegative and gives exact quadrature for monomials in the quotient ring defined by the algebraic method. The relationship to the classical methods based on zeros of orthogonal polynomials is discussed, and numerical comparisons are made with methods such as Gaussian quadrature and Smolyak grids. Application to UQ is examined in the context of polynomial chaos expansion and the probabilistic collocation method, where solution statistics are estimated.

    (U,V)-Ordering and a Duality Theorem for Risk Aversion and Lorenz-type Orderings

    There is a duality theory connecting certain stochastic orderings between cumulative distribution functions F_1, F_2 and stochastic orderings between their inverses F_1^(-1), F_2^(-1). This underlies some theories of utility in the case of the cdf and deprivation indices in the case of the inverse. Under certain conditions there is an equivalence between the two theories. An example is the equivalence between second order stochastic dominance and the Lorenz ordering. This duality is generalised to include the case where there is "distortion" of the cdf of the form v(F) and also of the inverse. A comprehensive duality theorem is presented in a form which includes the distortions and links the duality to the parallel theories of risk and deprivation indices. It is shown that some well-known examples are special cases of the results, including some from the Yaari social welfare theory and the theory of majorization. Comment: 23 pages, no figures, 2 Appendice

    Bregman divergences based on optimal design criteria and simplicial measures of dispersion

    In previous work the authors defined the k-th order simplicial distance between probability distributions, which arises naturally from a measure of dispersion based on the squared volume of random simplices of dimension k. This theory is embedded in the wider theory of divergences and distances between distributions, which includes the Kullback–Leibler, Jensen–Shannon and Jeffreys–Bregman divergences and the Bhattacharyya distance. A general construction is given based on defining a directional derivative of a function ϕ from one distribution to the other, whose concavity or strict concavity influences the properties of the resulting divergence. For the normal distribution these divergences can be expressed as matrix formulae for the (multivariate) means and covariances. Optimal experimental design criteria contribute a range of functionals applied to non-negative, or positive definite, information matrices. Not all can distinguish normal distributions, but sufficient conditions are given. The k-th order simplicial distance is revisited from this aspect and the results are used to test empirically the identity of means and covariances.

    Extended generalised variances, with applications

    We consider a measure ψ_k of dispersion which extends the notion of Wilks' generalised variance for a d-dimensional distribution, and is based on the mean squared volume of simplices of dimension k ≤ d formed by k+1 independent copies. We show how ψ_k can be expressed in terms of the eigenvalues of the covariance matrix of the distribution, also when an n-point sample is used for its estimation, and prove its concavity when raised to a suitable power. Some properties of dispersion-maximising distributions are derived, including a necessary and sufficient condition for optimality. Finally, we show how this measure of dispersion can be used for the design of optimal experiments, with equivalence to A- and D-optimal design for k=1 and k=d, respectively. Simple illustrative examples are presented.
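
    The definition in terms of squared simplex volumes can be sketched directly. The code below computes the squared volume of a k-simplex from the Gram determinant of its edge vectors, and averages it over all (k+1)-point subsets of a sample. The function names and the all-subsets averaging are assumptions made for this illustration; the paper's own estimator, expressed through covariance eigenvalues, is not reproduced here.

```python
import math
from itertools import combinations


def _det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            det = -det                      # a row swap flips the sign
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return det


def squared_simplex_volume(vertices):
    """Squared k-volume of the simplex spanned by k+1 points in R^d."""
    k = len(vertices) - 1
    base = vertices[0]
    edges = [[x - b for x, b in zip(v, base)] for v in vertices[1:]]
    # Gram matrix of the k edge vectors; its determinant is (k! * volume)^2.
    gram = [[sum(p * q for p, q in zip(u, w)) for w in edges] for u in edges]
    return _det(gram) / math.factorial(k) ** 2


def empirical_dispersion(sample, k):
    """Mean squared volume over all (k+1)-point subsets of the sample."""
    volumes = [squared_simplex_volume(list(s)) for s in combinations(sample, k + 1)]
    return sum(volumes) / len(volumes)
```

    For the four corners of the unit square, `empirical_dispersion(corners, 1)` averages the squared lengths of the six pairs and `empirical_dispersion(corners, 2)` averages the squared areas of the four triangles. Enumerating all subsets is only practical for small samples.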

    The algebraic method in tree percolation

    We apply the methods of algebraic reliability to the study of percolation on trees. To a complete $k$-ary tree $T_{k,n}$ of depth $n$ we assign a monomial ideal $I_{k,n}$ on $\sum_{i=1}^n k^i$ variables and $k^n$ minimal monomial generators. We give explicit recursive formulae for the Betti numbers of $I_{k,n}$ and their Hilbert series, which allow us to study explicitly percolation on $T_{k,n}$. We study bounds on this percolation and study its asymptotic behavior with the mentioned commutative algebra techniques.
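
    The two counts quoted in the abstract are easy to tabulate. The helper names below are hypothetical and serve only to make the arithmetic concrete; the counts coincide with the number of edges and the number of leaves of a complete k-ary tree of depth n.

```python
def num_variables(k, n):
    """Number of variables of the ideal: k + k^2 + ... + k^n."""
    return sum(k**i for i in range(1, n + 1))


def num_generators(k, n):
    """Number of minimal monomial generators: k^n."""
    return k**n


# Example: binary tree (k = 2) of depth 3.
binary_depth3 = (num_variables(2, 3), num_generators(2, 3))
```

    For a binary tree of depth 3 this gives 2 + 4 + 8 = 14 variables and 8 generators.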