Partial membership latent Dirichlet allocation
Dissertation supervisor: Dr. Alina Zare. Includes vita. For many years, topic models (e.g., pLSA, LDA, SLDA) have been widely used for simultaneously segmenting and recognizing objects in imagery. However, these models are confined to the analysis of categorical data, forcing a visual word to belong to one and only one topic. In many images, some regions cannot be assigned a crisp categorical label (e.g., transition regions between a foggy sky and the ground, or between sand and water at a beach). In these cases, a visual word is best represented with partial memberships across multiple topics. To address this, a partial membership latent Dirichlet allocation (PM-LDA) model and an associated parameter estimation algorithm are presented. PM-LDA defines a novel partial membership model for word and document generation. Unlike the standard LDA model, which assumes that each word belongs to one and only one topic, the PM-LDA model allows words to have partial membership in multiple topics. This model can be useful for image/video documents where a visual word (an image patch) may be a mixture of multiple topics. For example, in SONAR imagery where gradually vanishing sand ripples blur the boundary between the sand ripple region and the flat sand region, it is impossible to tell where the sand ripple ends and the flat sand starts. In the proposed PM-LDA model, the visual words are represented with partial memberships in both the "sand ripple" and "flat sand" topics, which is more reasonable than assigning them to one and only one topic as in the standard LDA model. A Gibbs sampling algorithm is employed for parameter estimation. Experimental results on simulated data, a SONAR image dataset, and natural image datasets show that PM-LDA can produce both crisp and soft semantic image segmentations, a capability existing methods do not have. Includes bibliographical references (pages 147-157).
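The contrast between crisp and partial membership described in the abstract above can be illustrated with a minimal Python sketch. It is not the dissertation's generative model or its Gibbs sampler. It assumes one-dimensional Gaussian topics with a shared standard deviation, and the names `crisp_word`, `partial_membership_word`, `topic_weights`, and `scale` are hypothetical, introduced only for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw a membership vector from Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def crisp_word(topic_means, topic_weights, sigma, rng):
    """LDA-style draw: the word belongs to exactly one topic."""
    k = rng.choices(range(len(topic_means)), weights=topic_weights)[0]
    membership = [1.0 if j == k else 0.0 for j in range(len(topic_means))]
    return membership, rng.gauss(topic_means[k], sigma)


def partial_membership_word(topic_means, topic_weights, scale, sigma, rng):
    """Partial-membership draw (illustrative assumption): the word's mean is a
    membership-weighted blend of the topic means."""
    # `scale` is a hypothetical concentration factor on the Dirichlet parameters.
    membership = sample_dirichlet([scale * w for w in topic_weights], rng)
    blended_mean = sum(z * m for z, m in zip(membership, topic_means))
    return membership, rng.gauss(blended_mean, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    means = [0.0, 10.0]  # stand-ins for "sand ripple" and "flat sand" features
    print(crisp_word(means, [0.5, 0.5], 1.0, rng))
    print(partial_membership_word(means, [0.5, 0.5], 2.0, 1.0, rng))
```

In this sketch, a small `scale` concentrates the Dirichlet draws near the corners of the simplex, so memberships are close to crisp, while a large `scale` yields more evenly mixed memberships.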
Hyperspectral Unmixing with Endmember Variability using Partial Membership Latent Dirichlet Allocation
The application of Partial Membership Latent Dirichlet Allocation (PM-LDA) for
hyperspectral endmember estimation and spectral unmixing is presented. PM-LDA
provides a model for hyperspectral image analysis that accounts for spectral
variability and incorporates spatial information through the use of
superpixel-based 'documents.' In our application of PM-LDA, we employ the
Normal Compositional Model in which endmembers are represented as Normal
distributions to account for spectral variability and proportion vectors are
modeled as random variables governed by a Dirichlet distribution. The use of
the Dirichlet distribution enforces positivity and sum-to-one constraints on
the proportion values. Algorithm results on real hyperspectral data indicate
that PM-LDA produces endmember distributions that represent the ground truth
classes and their associated variability.
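The Normal Compositional Model described in the abstract above can be simulated with a short Python sketch. It makes simplifying assumptions that are not stated in the abstract: each endmember has an independent per-band Normal distribution (diagonal covariance), and the function names and toy values (`sample_ncm_pixel`, `means`, `stds`, `alpha`) are hypothetical. It shows only forward simulation of one pixel, not the PM-LDA estimation procedure.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw a proportion vector from Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_ncm_pixel(means, stds, alpha, rng):
    """Simulate one pixel spectrum under a simplified Normal Compositional Model.

    means[k][b] and stds[k][b] are the assumed per-band mean and standard
    deviation of endmember k; alpha holds the Dirichlet parameters.
    """
    n_endmembers = len(means)
    n_bands = len(means[0])
    # Proportions are positive and sum to one by construction.
    proportions = sample_dirichlet(alpha, rng)
    # Each endmember is a random draw, which models spectral variability.
    endmembers = [
        [rng.gauss(means[k][b], stds[k][b]) for b in range(n_bands)]
        for k in range(n_endmembers)
    ]
    # The pixel is the proportion-weighted sum of the endmember draws.
    pixel = [
        sum(p * e[b] for p, e in zip(proportions, endmembers))
        for b in range(n_bands)
    ]
    return proportions, pixel


if __name__ == "__main__":
    rng = random.Random(0)
    means = [[0.2, 0.4, 0.6], [0.8, 0.5, 0.1]]  # two toy endmembers, three bands
    stds = [[0.01, 0.01, 0.01], [0.02, 0.02, 0.02]]
    proportions, pixel = sample_ncm_pixel(means, stds, [1.0, 1.0], rng)
    print(proportions, pixel)
```

Because the proportions are obtained by normalizing positive Gamma draws, the positivity and sum-to-one constraints mentioned in the abstract hold without any explicit projection step.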
Map-guided hyperspectral image superpixel segmentation using semi-supervised partial membership latent Dirichlet allocation
Many superpixel segmentation algorithms suitable for regular color images, such as three-channel red, green, and blue (RGB) images, have been developed in the literature. However, because of the high dimensionality of hyperspectral imagery, these regular superpixel segmentation algorithms often do not perform well on it. Although some authors have modified regular superpixel segmentation algorithms to fit hyperspectral images, many still underperform on complex data. In this thesis, to address this problem, we introduce a hyperspectral-unmixing-based superpixel segmentation approach that leverages map information. We call this approach map-guided semi-supervised PM-LDA superpixel segmentation. The approach uses auxiliary map information to guide segmentation. It also leverages spectral unmixing results to provide improved results compared with segmentation based on raw data. We test our proposed method on two real hyperspectral data sets, University of Pavia and MUUFL Gulfport Hyperspectral Data. In these experiments, our proposed method achieves better results than other state-of-the-art algorithms. We also develop new cluster validity metrics to evaluate the results.
The latent process decomposition of cDNA microarray data sets
We present a new computational technique (a software implementation, data sets, and supplementary information are available at http://www.enm.bris.ac.uk/lpd/) which enables the probabilistic analysis of cDNA microarray data and we demonstrate its effectiveness in identifying features of biomedical importance. A hierarchical Bayesian model, called latent process decomposition (LPD), is introduced in which each sample in the data set is represented as a combinatorial mixture over a finite set of latent processes, which are expected to correspond to biological processes. Parameters in the model are estimated using efficient variational methods. This type of probabilistic model is most appropriate for the interpretation of measurement data generated by cDNA microarray technology. For determining informative substructure in such data sets, the proposed model has several important advantages over the standard use of dendrograms. First, the ability to objectively assess the optimal number of sample clusters. Second, the ability to represent samples and gene expression levels using a common set of latent variables (dendrograms cluster samples and gene expression values separately which amounts to two distinct reduced space representations). Third, in contrast to standard cluster models, observations are not assigned to a single cluster and, thus, for example, gene expression levels are modeled via combinations of the latent processes identified by the algorithm. We show this new method compares favorably with alternative cluster analysis methods. To illustrate its potential, we apply the proposed technique to several microarray data sets for cancer. For these data sets it successfully decomposes the data into known subtypes and indicates possible further taxonomic subdivision in addition to highlighting, in a wholly unsupervised manner, the importance of certain genes which are known to be medically significant. 
To illustrate its wider applicability, we also demonstrate its performance on a microarray data set for yeast.
- …