527 research outputs found

    Semantic spaces revisited: investigating the performance of auto-annotation and semantic retrieval using semantic spaces

    No full text
    Semantic spaces encode similarity relationships between objects as a function of position in a mathematical space. This paper discusses three different formulations for building semantic spaces which allow the automatic-annotation and semantic retrieval of images. The models discussed in this paper require that the image content be described in the form of a series of visual-terms, rather than as a continuous feature-vector. The paper also discusses how these term-based models compare to the latest state-of-the-art continuous feature models for auto-annotation and retrieval

    Integrating Document Clustering and Topic Modeling

    Full text link
    Document clustering and topic modeling are two closely related tasks which can mutually benefit each other. Topic modeling can project documents into a topic space which facilitates effective document clustering. Cluster labels discovered by document clustering can be incorporated into topic models to extract local topics specific to each cluster and global topics shared by all clusters. In this paper, we propose a multi-grain clustering topic model (MGCTM) which integrates document clustering and topic modeling into a unified framework and jointly performs the two tasks to achieve the overall best performance. Our model tightly couples two components: a mixture component used for discovering latent groups in document collection and a topic model component used for mining multi-grain topics including local topics specific to each cluster and global topics shared across clusters.We employ variational inference to approximate the posterior of hidden variables and learn model parameters. Experiments on two datasets demonstrate the effectiveness of our model.Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013

    Measuring concept similarities in multimedia ontologies: analysis and evaluations

    Get PDF
    The recent development of large-scale multimedia concept ontologies has provided a new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been extensively studied, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving the visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing

    Multiple Statistical Analysis Techniques Corroborate Intratumor Heterogeneity in Imaging Mass Spectrometry Datasets of Myxofibrosarcoma

    Get PDF
    MALDI mass spectrometry can generate profiles that contain hundreds of biomolecular ions directly from tissue. Spatially-correlated analysis, MALDI imaging MS, can simultaneously reveal how each of these biomolecular ions varies in clinical tissue samples. The use of statistical data analysis tools to identify regions containing correlated mass spectrometry profiles is referred to as imaging MS-based molecular histology because of its ability to annotate tissues solely on the basis of the imaging MS data. Several reports have indicated that imaging MS-based molecular histology may be able to complement established histological and histochemical techniques by distinguishing between pathologies with overlapping/identical morphologies and revealing biomolecular intratumor heterogeneity. A data analysis pipeline that identifies regions of imaging MS datasets with correlated mass spectrometry profiles could lead to the development of novel methods for improved diagnosis (differentiating subgroups within distinct histological groups) and annotating the spatio-chemical makeup of tumors. Here it is demonstrated that highlighting the regions within imaging MS datasets whose mass spectrometry profiles were found to be correlated by five independent multivariate methods provides a consistently accurate summary of the spatio-chemical heterogeneity. The corroboration provided by using multiple multivariate methods, efficiently applied in an automated routine, provides assurance that the identified regions are indeed characterized by distinct mass spectrometry profiles, a crucial requirement for its development as a complementary histological tool. When simultaneously applied to imaging MS datasets from multiple patient samples of intermediate-grade myxofibrosarcoma, a heterogeneous soft tissue sarcoma, nodules with mass spectrometry profiles found to be distinct by five different multivariate methods were detected within morphologically identical regions of all patient tissue samples. To aid the further development of imaging MS based molecular histology as a complementary histological tool the Matlab code of the agreement analysis, instructions and a reduced dataset are included as supporting information

    Transforming Graph Representations for Statistical Relational Learning

    Full text link
    Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed
    corecore