45 research outputs found

    Symmetric measures via moments

    Algebraic tools in statistics have recently been receiving special attention, and a number of interactions between algebraic geometry and computational statistics have been developing rapidly. This paper presents another such connection, namely, one between probabilistic models invariant under a finite group of (non-singular) linear transformations and polynomials invariant under the same group. Two specific aspects of the connection are discussed: generalization of the (uniqueness part of the multivariate) problem of moments, and log-linear, or toric, modeling by expansion of invariant terms. A distribution of minuscule subimages extracted from a large database of natural images is analyzed to illustrate the above concepts. Comment: Published at http://dx.doi.org/10.3150/07-BEJ6144 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

    A generalized risk approach to path inference based on hidden Markov models

    Motivated by the unceasing interest in hidden Markov models (HMMs), this paper re-examines hidden path inference in these models, using primarily a risk-based framework. While the most common maximum a posteriori (MAP), or Viterbi, path estimator and the minimum error, or Posterior Decoder (PD), have long been around, other path estimators, or decoders, have been either only hinted at or applied more recently and in dedicated applications generally unfamiliar to the statistical learning community. Over a decade ago, however, a family of algorithmically defined decoders aiming to hybridize the two standard ones was proposed (Brushe et al., 1998). The present paper gives a careful analysis of this hybridization approach, identifies several problems and issues with it and with other previously proposed approaches, and proposes practical resolutions of those. Furthermore, simple modifications of the classical criteria for hidden path recognition are shown to lead to a new class of decoders. Dynamic programming algorithms to compute these decoders in the usual forward-backward manner are presented. A particularly interesting subclass of such estimators can also be viewed as hybrids of the MAP and PD estimators. Similar to previously proposed MAP-PD hybrids, the new class is parameterized by a small number of tunable parameters. Unlike their algorithmic predecessors, the new risk-based decoders are more clearly interpretable and, most importantly, work "out of the box" in practice, which is demonstrated on some real bioinformatics tasks and data. Some further generalizations and applications are discussed in conclusion. Comment: Section 5: corrected denominators of the scaled beta variables (pp. 27-30), => corrections in claims 1, 3, Prop. 12, bottom of Table 1. Decoder (49), Corol. 14 are generalized to handle 0 probabilities. Notation is more closely aligned with (Bishop, 2006). Details are inserted in eqn-s (43); the positivity assumption in Prop. 11 is explicit. Fixed typing errors in equation (41), Example
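
    The Posterior Decoder mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's risk-based decoder family, only the textbook pointwise rule: at each time step, pick the state with the largest posterior probability given all observations, computed by a scaled forward-backward pass. The function name `posterior_decode`, the list-of-lists parameter layout and the assumption that every scaling constant is positive are illustrative choices, not taken from the paper.

```python
def posterior_decode(obs, init, trans, emit):
    """Pointwise posterior decoding for a discrete HMM (illustrative sketch).

    obs   : list of observation indices
    init  : init[i]     = P(x_1 = i)
    trans : trans[i][j] = P(x_{t+1} = j | x_t = i)
    emit  : emit[i][y]  = P(y_t = y | x_t = i)
    Assumes the observation sequence has positive probability under the model.
    """
    n = len(init)
    T = len(obs)

    # Scaled forward pass: alpha[t][i] is proportional to P(x_t = i, y_1..y_t).
    cur = [init[i] * emit[i][obs[0]] for i in range(n)]
    total = sum(cur)
    alpha = [[c / total for c in cur]]
    for t in range(1, T):
        cur = [
            emit[j][obs[t]] * sum(alpha[t - 1][i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        total = sum(cur)
        alpha.append([c / total for c in cur])

    # Scaled backward pass: beta[t][i] is proportional to P(y_{t+1}..y_T | x_t = i).
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        cur = [
            sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
            for i in range(n)
        ]
        total = sum(cur)
        beta[t] = [c / total for c in cur]

    # alpha * beta is proportional to the posterior P(x_t = i | y_1..y_T),
    # so the argmax over i does not depend on the scaling constants.
    path = []
    for t in range(T):
        post = [alpha[t][i] * beta[t][i] for i in range(n)]
        path.append(max(range(n), key=lambda i: post[i]))
    return path
```

    Because each state is chosen separately, the decoded sequence minimizes the expected number of pointwise errors but, in general, need not be a path of positive probability under the model.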

    MAP segmentation in Bayesian hidden Markov models: a case study

    We consider the problem of estimating the maximum posterior probability (MAP) state sequence for a finite state and finite emission alphabet hidden Markov model (HMM) in the Bayesian setup, where both emission and transition matrices have Dirichlet priors. We study a training set consisting of thousands of protein alignment pairs. The training data are used to set the prior hyperparameters for Bayesian MAP segmentation. Since the Viterbi algorithm is no longer applicable, there is no simple procedure to find the MAP path, and several iterative algorithms are considered and compared. The main goal of the paper is to test the Bayesian setup against the frequentist one, where the parameters of the HMM are estimated using the training data.

    Estimation of Viterbi path in Bayesian hidden Markov models

    The article studies different methods for estimating the Viterbi path in the Bayesian framework. The Viterbi path is an estimate of the underlying state path in hidden Markov models (HMMs) that has maximum joint posterior probability. Hence it is also called the maximum a posteriori (MAP) path. For an HMM with given parameters, the Viterbi path can be easily found with the Viterbi algorithm. In the Bayesian framework the Viterbi algorithm is not applicable, and several iterative methods can be used instead. We introduce a new EM-type algorithm for finding the MAP path and compare it with various other methods for finding the MAP path, including the variational Bayes approach and MCMC methods. Examples with simulated data are used to compare the performance of the methods. The main focus is on non-stochastic iterative methods, and our results show that the best of those methods work as well as or better than the best MCMC methods. Our results demonstrate that when the primary goal is segmentation, it is more reasonable to perform segmentation directly by considering the transition and emission parameters as nuisance parameters. Peer reviewed
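
    For the known-parameter case that this abstract calls easy, a minimal sketch of the standard Viterbi algorithm is given here. It covers only an HMM with fixed, given parameters; it does not implement the Bayesian setting or the EM-type algorithm the article introduces. The function name `viterbi` and the list-of-lists parameter layout are assumptions made for illustration.

```python
import math


def viterbi(obs, init, trans, emit):
    """MAP state path of a discrete HMM with known parameters (illustrative sketch).

    obs   : list of observation indices
    init  : init[i]     = P(x_1 = i)
    trans : trans[i][j] = P(x_{t+1} = j | x_t = i)
    emit  : emit[i][y]  = P(y_t = y | x_t = i)
    """
    n = len(init)

    def log(p):
        # Work in log space to avoid underflow; a zero probability maps to -inf.
        return math.log(p) if p > 0 else float("-inf")

    # delta[i]: best log joint probability of any path ending in state i.
    delta = [log(init[i]) + log(emit[i][obs[0]]) for i in range(n)]
    back = []  # back[t][j]: best predecessor of state j at step t + 1

    for y in obs[1:]:
        pointers, new_delta = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: delta[i] + log(trans[i][j]))
            pointers.append(best)
            new_delta.append(delta[best] + log(trans[best][j]) + log(emit[j][y]))
        back.append(pointers)
        delta = new_delta

    # Trace back from the best final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

    The recursion relies on the parameters being fixed. Once they are integrated out under a prior, the joint probability of a path no longer factorizes step by step, which is why the abstract turns to iterative methods.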

    Non-Euclidean statistics for covariance matrices with applications to diffusion tensor imaging

    The statistical analysis of covariance matrix data is considered and, in particular, methodology is discussed which takes into account the non-Euclidean nature of the space of positive semi-definite symmetric matrices. The main motivation for the work is the analysis of diffusion tensors in medical image analysis. The primary focus is on estimation of a mean covariance matrix and, in particular, on the use of Procrustes size-and-shape space. Comparisons are made with other estimation techniques, including using the matrix logarithm, matrix square root and Cholesky decomposition. Applications to diffusion tensor imaging are considered and, in particular, a new measure of fractional anisotropy called Procrustes Anisotropy is discussed.
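
    One of the comparison techniques named in this abstract, averaging via the Cholesky decomposition, can be sketched as follows. The sketch assumes the estimator is formed by averaging the lower-triangular Cholesky factors and multiplying the averaged factor by its transpose; that reading, the function names and the restriction to strictly positive-definite inputs are assumptions, and the Procrustes estimator that is the paper's main focus is not implemented.

```python
import math


def cholesky(a):
    """Lower-triangular factor low with low * low^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def euclidean_mean(mats):
    """Ordinary entrywise average of the matrices."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) / len(mats) for j in range(n)] for i in range(n)]


def cholesky_mean(mats):
    """Average the Cholesky factors, then map back to a covariance matrix."""
    n = len(mats[0])
    avg = euclidean_mean([cholesky(m) for m in mats])
    # avg * avg^T is symmetric positive semi-definite by construction.
    return [[sum(avg[i][k] * avg[j][k] for k in range(n)) for j in range(n)] for i in range(n)]
```

    For the two matrices diag(4, 1) and diag(1, 4), the entrywise average is diag(2.5, 2.5), while the Cholesky-based average is diag(2.25, 2.25), so the two estimators generally disagree.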

    Procrustes analysis for diffusion tensor image processing

    There is an increasing need to develop processing tools for diffusion tensor image data that account for the non-Euclidean nature of the tensor space. In this paper Procrustes analysis, a non-Euclidean shape analysis tool under similarity transformations (rotation, scaling and translation), is proposed to redefine sample statistics of diffusion tensors. A new anisotropy measure, Procrustes Anisotropy (PA), is defined with the full ordinary Procrustes analysis. Comparisons are made with other anisotropy measures, including Fractional Anisotropy and Geodesic Anisotropy. The partial generalized Procrustes analysis is extended to a weighted generalized Procrustes framework for averaging sample tensors with different fractions of contributions to the mean tensor. Applications of Procrustes methods to diffusion tensor interpolation and smoothing are compared with Euclidean, Log-Euclidean and Riemannian methods.
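
    As a point of reference for the comparison measures named in this abstract, the standard Fractional Anisotropy (FA) of a 3x3 diffusion tensor can be sketched as below. This is the conventional FA, not the Procrustes Anisotropy defined in the paper. The sketch uses the identity FA = sqrt(3/2) * ||D - (tr D / 3) I|| / ||D|| with Frobenius norms, which avoids an explicit eigendecomposition; the function name and the convention of returning 0 for the zero tensor are assumptions.

```python
import math


def fractional_anisotropy(d):
    """Standard FA of a symmetric 3x3 diffusion tensor given as nested lists."""
    mean_diffusivity = (d[0][0] + d[1][1] + d[2][2]) / 3.0

    # Squared Frobenius norm of the deviatoric part D - mean_diffusivity * I.
    dev_sq = sum(
        (d[i][j] - (mean_diffusivity if i == j else 0.0)) ** 2
        for i in range(3)
        for j in range(3)
    )
    # Squared Frobenius norm of D itself.
    norm_sq = sum(d[i][j] ** 2 for i in range(3) for j in range(3))

    if norm_sq == 0.0:
        return 0.0  # assumed convention for the zero tensor
    return math.sqrt(1.5 * dev_sq / norm_sq)
```

    An isotropic tensor such as the identity gives FA = 0, and a tensor with a single non-zero eigenvalue, such as diag(1, 0, 0), gives FA = 1.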