Multivariate texture discrimination based on geodesics to class centroids on a generalized Gaussian Manifold
A texture discrimination scheme is proposed wherein probability distributions are deployed on a probabilistic manifold for modeling the wavelet statistics of images. We consider the Rao geodesic distance (GD) to the class centroid for texture discrimination in various classification experiments. We compare the performance of the GD to the class centroid with that of the Euclidean distance in a similar context, in terms of both accuracy and computational complexity. We also compare our proposed classification scheme with the k-nearest neighbor algorithm. Univariate and multivariate Gaussian and Laplace distributions, as well as generalized Gaussian distributions with variable shape parameter, are each evaluated as a statistical model for the wavelet coefficients. The GD to the centroid outperforms the Euclidean distance and yields superior discrimination compared to the k-nearest neighbor approach.
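As a rough illustration of classifying by geodesic distance to a class centroid, the sketch below uses the closed-form Rao distance between two univariate Gaussians, which follows from the hyperbolic geometry of the Fisher metric. It covers only the univariate Gaussian case, not the generalized Gaussian or multivariate models named in the abstract. The function names, class labels, and centroid values are hypothetical, and the centroids are assumed to be given rather than estimated.

```python
import math

def rao_distance_gaussian(mu1, sigma1, mu2, sigma2):
    """Rao geodesic distance between N(mu1, sigma1^2) and N(mu2, sigma2^2).

    Uses the closed form for the univariate Gaussian manifold under the
    Fisher information metric ds^2 = (d mu^2 + 2 d sigma^2) / sigma^2.
    """
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    num = 0.5 * (mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2
    return math.sqrt(2.0) * math.acosh(1.0 + num / (2.0 * sigma1 * sigma2))

def nearest_centroid(sample, centroids):
    """Return the label whose centroid (mu, sigma) is closest in Rao distance.

    `sample` is a (mu, sigma) pair; `centroids` maps label -> (mu, sigma).
    Both are illustrative stand-ins for fitted wavelet-subband parameters.
    """
    return min(
        centroids,
        key=lambda label: rao_distance_gaussian(*sample, *centroids[label]),
    )

if __name__ == "__main__":
    # Hypothetical class centroids for two texture classes.
    centroids = {"smooth": (0.0, 1.0), "rough": (0.0, 3.0)}
    print(nearest_centroid((0.0, 1.1), centroids))
```

For equal means the distance reduces to sqrt(2) * |ln(sigma1 / sigma2)|, which gives a quick sanity check on the formula.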
Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments discussed.
Comment: 61 pages
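A small example may help show why Euclidean methods can mislead on the circle. The arithmetic mean of the angles 350° and 10° is 180°, which points away from both observations, whereas the mean direction obtained from the averaged unit vectors is 0°. The sketch below is a minimal illustration and is not taken from the review; the function names are hypothetical.

```python
import math

def circular_mean(angles_rad):
    """Mean direction of angles (radians), returned in (-pi, pi]."""
    s = sum(math.sin(a) for a in angles_rad)
    c = sum(math.cos(a) for a in angles_rad)
    return math.atan2(s, c)

def mean_resultant_length(angles_rad):
    """Length of the averaged unit vector; values near 1 mean tight concentration."""
    s = sum(math.sin(a) for a in angles_rad)
    c = sum(math.cos(a) for a in angles_rad)
    return math.hypot(s, c) / len(angles_rad)

if __name__ == "__main__":
    angles = [math.radians(350.0), math.radians(10.0)]
    print(math.degrees(circular_mean(angles)))
    print(mean_resultant_length(angles))
```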
lp-Recovery of the Most Significant Subspace among Multiple Subspaces with Outliers
We assume data sampled from a mixture of d-dimensional linear subspaces with
spherically symmetric distributions within each subspace and an additional
outlier component with spherically symmetric distribution within the ambient
space (for simplicity we may assume that all distributions are uniform on their
corresponding unit spheres). We also assume mixture weights for the different
components. We say that one of the underlying subspaces of the model is most
significant if its mixture weight is higher than the sum of the mixture weights
of all other subspaces. We study the recovery of the most significant subspace
by minimizing the lp-averaged distances of data points from d-dimensional
subspaces, where p>0. Unlike other lp minimization problems, this minimization
is non-convex for all p>0 and thus requires different methods for its analysis.
We show that if 0<p<=1, then for any fraction of outliers the most significant
subspace can be recovered by lp minimization with overwhelming probability
(which depends on the generating distribution and its parameters). We show that
when adding small noise around the underlying subspaces the most significant
subspace can be nearly recovered by lp minimization for any 0<p<=1 with an
error proportional to the noise level. On the other hand, if p>1 and there is
more than one underlying subspace, then with overwhelming probability the most
significant subspace cannot be recovered or nearly recovered. This last result
does not require spherically symmetric outliers.
Comment: This is a revised version of the part of 1002.1994 that deals with
single subspace recovery. V3: Improved estimates (in particular for Lemma 3.1
and for estimates relying on it), asymptotic dependence of probabilities and
constants on D and d and further clarifications; for simplicity it assumes
uniform distributions on spheres. V4: minor revision for the published
version
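The objective described in the abstract can be sketched as follows: for a candidate subspace, take the distance of each data point from it, raise it to the power p, and average. The code below only evaluates this cost for given candidates; it does not perform the non-convex minimization itself. The normalization by the number of points, the function names, and the toy data are assumptions made for illustration.

```python
import math

def distance_to_subspace(x, basis):
    """Euclidean distance from point x to the span of `basis`.

    `basis` is assumed to be a list of orthonormal vectors, so the
    projection is removed one basis direction at a time.
    """
    residual = list(x)
    for b in basis:
        coeff = sum(xi * bi for xi, bi in zip(x, b))
        residual = [r - coeff * bi for r, bi in zip(residual, b)]
    return math.sqrt(sum(r * r for r in residual))

def lp_cost(points, basis, p):
    """Average of dist(x, subspace)^p over all points, for p > 0."""
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(distance_to_subspace(x, basis) ** p for x in points) / len(points)

if __name__ == "__main__":
    # Toy data in the plane: three points on the x-axis, two on the y-axis.
    points = [(1.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
    x_axis = [(1.0, 0.0)]
    y_axis = [(0.0, 1.0)]
    print(lp_cost(points, x_axis, 1.0), lp_cost(points, y_axis, 1.0))
```

In this toy data the x-axis carries more points than the y-axis, so it has the lower cost of the two candidates at p = 1.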
Doctor of Philosophy in Computing
dissertation
An important area of medical imaging research is studying anatomical diffeomorphic shape changes and detecting their relationship to disease processes. For example, neurodegenerative disorders change the shape of the brain, so identifying differences between healthy control subjects and patients affected by these diseases can help with understanding the disease processes. Previous research proposed a variety of mathematical approaches for statistical analysis of geometrical brain structure in three-dimensional (3D) medical imaging, including atlas building, brain variability quantification, regression, etc. The critical component in these statistical models is that the geometrical structure is represented by transformations rather than the actual image data. Although such statistical models effectively provide a way of analyzing shape variation, none of them has a truly probabilistic interpretation. This dissertation contributes a novel Bayesian framework of statistical shape analysis for generic manifold data and its application to shape variability and brain magnetic resonance imaging (MRI). After carefully defining the distributions on manifolds, we build Bayesian models for analyzing the intrinsic variability of manifold data, involving the mean point, principal modes, and parameter estimation. Because there is no closed-form solution for Bayesian inference of these models on manifolds, we develop a Markov Chain Monte Carlo method to sample the hidden variables from the distribution. The main advantages of these Bayesian approaches are that they provide parameter estimation and automatic dimensionality reduction for analyzing generic manifold-valued data, such as diffeomorphisms. Modeling the mean point of a group of images in a Bayesian manner allows the regularity parameter to be learned directly from data rather than set manually, which eliminates the effort of cross validation for parameter selection.
In population studies, our Bayesian model of principal modes analysis (1) automatically extracts low-dimensional, second-order statistics of manifold data variability and (2) gives a better geometric data fit than nonprobabilistic models. To make this Bayesian framework computationally more efficient for high-dimensional diffeomorphisms, this dissertation presents an algorithm, FLASH (finite-dimensional Lie algebras for shooting), that greatly speeds up diffeomorphic image registration. Instead of formulating diffeomorphisms in a continuous variational problem, FLASH defines a completely new discrete reparameterization of diffeomorphisms in a low-dimensional bandlimited velocity space, which makes Bayesian inference via sampling on the space of diffeomorphisms more feasible in time. Our entire Bayesian framework in this dissertation is used for statistical analysis of shape data and brain MRIs. It has the potential to improve hypothesis testing, classification, and mixture models.