12,780 research outputs found

    Improving Inference of Gaussian Mixtures Using Auxiliary Variables

    Expanding a lower-dimensional problem to a higher-dimensional space and then projecting back is often beneficial. This article rigorously investigates this perspective in the context of finite mixture models, namely how to improve inference for mixture models by using auxiliary variables. Despite the large literature on mixture models and several empirical examples, no previous work gives a general theoretical justification for including auxiliary variables in mixture models, even for special cases. We provide a theoretical basis for comparing inference for multivariate mixture models with the corresponding inference for marginal univariate mixture models. Analytical results for several special cases are established. We show that the probability of correctly allocating mixture memberships and the information number for the means of the primary outcome in a bivariate model with two Gaussian mixtures are generally larger than those in each univariate model. Simulations under a range of scenarios, including misspecified models, are conducted to examine the improvement. The method is illustrated by two real applications in ecology and causal inference.
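
    To make the allocation claim concrete, here is a minimal simulation sketch (not the paper's code, with all parameters chosen for illustration): memberships are recovered from a two-component bivariate Gaussian mixture and from its marginal univariate mixture, and the two allocation accuracies are compared.

```python
# Minimal sketch: allocation accuracy of a bivariate Gaussian mixture versus
# its marginal univariate mixture. All parameters are illustrative.
import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(0)
n = 10_000
# Two equal-weight components; the auxiliary second coordinate adds separation.
mu = [np.array([0.0, 0.0]), np.array([1.0, 2.0])]
cov = np.array([[1.0, 0.3], [0.3, 1.0]])
z = rng.integers(0, 2, size=n)                        # true memberships
x = np.array([rng.multivariate_normal(mu[k], cov) for k in z])

# MAP allocation under the true bivariate mixture (equal weights cancel).
d0 = multivariate_normal(mu[0], cov).pdf(x)
d1 = multivariate_normal(mu[1], cov).pdf(x)
biv_alloc = (d1 > d0).astype(int)

# MAP allocation using only the primary outcome (first coordinate).
s = np.sqrt(cov[0, 0])
u0 = norm(mu[0][0], s).pdf(x[:, 0])
u1 = norm(mu[1][0], s).pdf(x[:, 0])
uni_alloc = (u1 > u0).astype(int)

print("bivariate accuracy :", (biv_alloc == z).mean())
print("univariate accuracy:", (uni_alloc == z).mean())
```

    With an informative auxiliary coordinate, the bivariate allocation accuracy comes out higher, which is the direction of the abstract's claim.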

    Modeling and predicting market risk with Laplace-Gaussian mixture distributions

    While much of classical statistical analysis is based on Gaussian distributional assumptions, statistical modeling with the Laplace distribution has gained importance in many applied fields. This phenomenon is rooted in the fact that, like the Gaussian, the Laplace distribution has many attractive properties. This paper investigates two methods of combining them and their use in modeling and predicting financial risk. Based on 25 daily stock return series, the empirical results indicate that the new models offer a plausible description of the data. They are also shown to be competitive with, or superior to, the hyperbolic distribution, which has gained some popularity in asset-return modeling and, in fact, also nests the Gaussian and Laplace. Classification: C16, C50. March 2005.
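
    The abstract does not spell out the two combination methods, so the sketch below assumes one plausible form: a two-component mixture of a Laplace and a Gaussian density with a shared location, fitted by maximum likelihood to simulated heavy-tailed "returns". The function names and starting values are illustrative, not the paper's.

```python
# Hypothetical Laplace-Gaussian mixture (shared location, free scales/weight),
# fitted by maximum likelihood. Not the paper's model or estimates.
import numpy as np
from scipy.stats import laplace, norm
from scipy.optimize import minimize

def lgm_logpdf(x, w, mu, b, sigma):
    """Log-density of the assumed two-component Laplace-Gaussian mixture."""
    return np.log(w * laplace.pdf(x, loc=mu, scale=b)
                  + (1.0 - w) * norm.pdf(x, loc=mu, scale=sigma))

def fit_lgm(returns):
    """Maximum-likelihood fit; returns (w, mu, b, sigma)."""
    def nll(theta):
        return -lgm_logpdf(returns, *theta).sum()
    res = minimize(nll,
                   x0=[0.5, returns.mean(), returns.std(), returns.std()],
                   bounds=[(0.01, 0.99), (None, None), (1e-6, None), (1e-6, None)])
    return res.x

rng = np.random.default_rng(1)
sample = 0.01 * rng.standard_t(df=4, size=2_000)   # heavy-tailed stand-in for daily returns
print(fit_lgm(sample))
```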

    Active Learning with Statistical Models

    For many types of machine learning algorithms, one can compute the statistically 'optimal' way to select training data. In this paper, we review how optimal data selection techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are computationally expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate. Empirically, we observe that the optimality criterion sharply decreases the number of training examples the learner needs in order to achieve good performance. Comment: See http://www.jair.org/ for any accompanying file.
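
    The paper derives a statistically optimal (variance-minimizing) selection criterion in closed form; the sketch below uses the simpler, related heuristic of querying wherever a locally weighted regression's predictive variance is largest. The kernel bandwidth, candidate pool, and data are assumptions for illustration.

```python
# Sketch of variance-driven query selection for locally weighted regression.
# A heuristic stand-in for the paper's exact criterion; parameters illustrative.
import numpy as np

def lwr_predict(x_q, X, y, tau=0.3):
    """Locally weighted linear fit at query x_q; returns (mean, variance)."""
    w = np.exp(-(X - x_q) ** 2 / (2 * tau ** 2))      # Gaussian kernel weights
    A = np.stack([np.ones_like(X), X], axis=1)        # design matrix [1, x]
    G = A.T @ (w[:, None] * A) + 1e-8 * np.eye(2)     # weighted normal equations
    beta = np.linalg.solve(G, A.T @ (w * y))
    resid = y - A @ beta
    s2 = (w * resid ** 2).sum() / w.sum()             # local noise estimate
    a_q = np.array([1.0, x_q])
    return a_q @ beta, s2 * (a_q @ np.linalg.solve(G, a_q))

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=15)                       # current training inputs
y = np.sin(X) + 0.1 * rng.standard_normal(15)
pool = np.linspace(-3, 3, 61)                         # candidate query locations
variances = [lwr_predict(xq, X, y)[1] for xq in pool]
print("next query:", pool[int(np.argmax(variances))])
```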

    Standardization of multivariate Gaussian mixture models and background adjustment of PET images in brain oncology

    In brain oncology, it is routine to evaluate the progress or remission of the disease based on the differences between a pre-treatment and a post-treatment Positron Emission Tomography (PET) scan. Background adjustment is necessary to reduce confounding by tissue-dependent changes not related to the disease. When modeling the voxel intensities for the two scans as a bivariate Gaussian mixture, background adjustment translates into standardizing the mixture at each voxel, while tumor lesions present themselves as outliers to be detected. In this paper, we address the question of how to standardize the mixture to a standard multivariate normal distribution, so that the outliers (i.e., tumor lesions) can be detected using a statistical test. We show theoretically and numerically that the tail distribution of the standardized scores is favorably close to standard normal in a wide range of scenarios while being conservative at the tails, validating voxelwise hypothesis testing based on standardized scores. To address standardization in spatially heterogeneous image data, we propose a spatial and robust multivariate expectation-maximization (EM) algorithm, where prior class membership probabilities are provided by transforming spatial probability template maps, and the estimates of the class means and covariances are robust to outliers. Simulations in both univariate and bivariate cases suggest that standardized scores with soft assignment have tail probabilities that are either very close to or more conservative than standard normal. The proposed methods are applied to a real data set from a PET phantom experiment, yet they are generic and can be used in other contexts.
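
    One plausible reading of the soft-assignment standardization (not necessarily the paper's exact algorithm) is sketched below: each bivariate intensity is whitened under every mixture component, and the whitened values are blended by the posterior class probabilities. Mixture parameters and data are illustrative.

```python
# Sketch: voxelwise standardization of a bivariate Gaussian mixture with
# soft assignment. One plausible reading of the abstract; parameters illustrative.
import numpy as np
from scipy.stats import multivariate_normal

def standardize(X, weights, means, covs):
    """X: (n, 2) intensities; returns (n, 2) approximately standard-normal scores."""
    dens = np.stack([w * multivariate_normal(m, c).pdf(X)
                     for w, m, c in zip(weights, means, covs)], axis=1)
    gamma = dens / dens.sum(axis=1, keepdims=True)     # posterior memberships
    z = np.zeros_like(X)
    for k, (m, c) in enumerate(zip(means, covs)):
        Linv = np.linalg.inv(np.linalg.cholesky(c))    # per-component whitening
        z += gamma[:, [k]] * ((X - m) @ Linv.T)
    return z

rng = np.random.default_rng(3)
c0 = np.array([[1.0, 0.5], [0.5, 1.0]])
X = np.vstack([rng.multivariate_normal([0, 0], c0, 500),
               rng.multivariate_normal([3, 3], np.eye(2), 500)])
z = standardize(X, [0.5, 0.5], [np.zeros(2), np.full(2, 3.0)], [c0, np.eye(2)])
print("tail fraction |z| > 3:", (np.abs(z) > 3).mean())  # near 0 if scores are ~N(0,1)
```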