7 research outputs found

    The gamma distribution as an alternative to the lognormal distribution in environmental applications

    In environmental applications dealing with data from contaminated sites, the positively skewed lognormal distribution has been the most commonly used model. The upper confidence limit (UCL) of the arithmetic mean of a lognormal population is computed using the H-statistic. Recent concerns have arisen about the effectiveness of the H-statistic-based UCL for the mean of the lognormal distribution when data sets are moderately to highly skewed. In this paper, the positively skewed gamma distribution is considered as an alternative to the lognormal distribution and is shown to produce more reasonable UCLs for the mean.

    Clustering High Dimensional Sparse Casino Player Tracking Datasets

    In this article, we propose an iterative procedure for clustering sparse high-dimensional transaction datasets, specifically two casino player tracking datasets. A common problem in clustering sparse datasets with very large dimensions is that classical clustering techniques fail to provide useful results, and latent variable methods used for clustering often do not reduce the data enough to yield useful and informative results either. We first propose a straightforward re-sorting of the full dataset and then define an information-based sparsity index to subset the sorted data. This new dimension-reduced dataset is less sparse and thus more likely to produce meaningful results with established clustering techniques. The technique enables the clustering of two secondary datasets from two Las Vegas repeater market casino properties, which consist of the amount of money casino patrons gambled, termed coin-in, on a variety of slot machines.

    Robust multivariate association and dimension reduction using density divergences

    In this article, we introduce two new families of multivariate association measures based on power divergence and alpha divergence that recover both linear and nonlinear dependence relationships between multiple sets of random vectors. Importantly, this novel approach not only characterizes independence, but also provides a smooth bridge between well-known distances that are inherently robust against outliers. Algorithmic approaches are developed for dimension reduction and the selection of the optimal robust association index. Extensive simulation studies are performed to assess the robustness of these association measures under different types and proportions of contamination. We illustrate the usefulness of our methods in application by analyzing two socioeconomic datasets that are known to contain outliers or extreme observations. Some theoretical properties, including the consistency of the estimated coefficient vectors, are investigated and computationally efficient algorithms for our nonparametric methods are provided. (C) 2013 Elsevier Inc. All rights reserved

    The Dual Central Subspaces in dimension reduction

    Existing dimension reduction methods in multivariate analysis have focused on reducing sets of random vectors into equivalently sized dimensions, while methods in regression settings have focused mainly on decreasing the dimension of the predictor variables. However, for problems involving a multivariate response, reducing the dimension of the response vector is also desirable and important. In this paper, we develop a new concept, termed the Dual Central Subspaces (DCS), to produce a method for simultaneously reducing the dimensions of two sets of random vectors, irrespective of the labels "predictor" and "response". Unlike previous methods based on extensions of Canonical Correlation Analysis (CCA), the recovery of this subspace provides a new research direction for multivariate sufficient dimension reduction. A particular model-free approach is detailed theoretically, and its performance is investigated through simulation and a real data analysis. (C) 2015 Elsevier Inc. All rights reserved