Probabilistic latent semantic analysis as a potential method for integrating spatial data concepts
In this paper we explore the use of Probabilistic Latent Semantic Analysis (PLSA) as a method for quantifying semantic differences between land cover classes. The results are promising, revealing ‘hidden’ or not easily discernible data concepts. PLSA provides a ‘bottom up’ approach to interoperability problems for users, in contrast to the ‘top down’ solutions provided by formal ontologies. We note a potential meta-problem of how to interpret the concepts, and the need for further research to reconcile the top-down and bottom-up approaches.
Multi-modal joint embedding for fashion product retrieval
Finding a product in the fashion world can be a daunting task. Every day, e-commerce sites are updated with thousands of images and their associated metadata (textual information), deepening the problem, akin to finding a needle in a haystack. In this paper, we leverage both the images and textual metadata and propose a joint multi-modal embedding that maps both the text and images into a common latent space. Distances in the latent space correspond to similarity between products, allowing us to perform retrieval in this latent space both efficiently and accurately. We train this embedding using large-scale real-world e-commerce data by both minimizing the similarity between related products and using auxiliary classification networks that encourage the embedding to have semantic meaning. We compare against existing approaches and show significant improvements in retrieval tasks on a large-scale e-commerce dataset. We also provide an analysis of the different metadata.
Peer reviewed. Postprint (author's final draft)
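As a rough illustration of the retrieval step described in this abstract, the sketch below ranks catalog items by Euclidean distance to a query vector in a shared latent space. It is not the authors' implementation: the three-dimensional vectors, the product ids, and the `retrieve` helper are all hypothetical, and the embedding networks that would produce such vectors from text or images are assumed to exist elsewhere.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_vec, catalog, k=2):
    """Return the ids of the k catalog items closest to query_vec."""
    ranked = sorted(catalog.items(), key=lambda item: euclidean(query_vec, item[1]))
    return [product_id for product_id, _ in ranked[:k]]


# Hypothetical hand-made embeddings; a trained model would produce these
# from product images or from their textual metadata.
catalog = {
    "red_dress": [0.9, 0.1, 0.0],
    "blue_jeans": [0.0, 0.8, 0.2],
    "scarlet_gown": [0.8, 0.2, 0.1],
}

# Hypothetical embedding of a text query, mapped into the same space.
text_query = [1.0, 0.0, 0.0]

print(retrieve(text_query, catalog))
```

Because text and images share one space in this setup, the same `retrieve` call would serve an image query as well; at catalog scale an approximate nearest-neighbour index would likely replace the full sort.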
MetaLDA: a Topic Model that Efficiently Incorporates Meta information
Besides the text content, documents and their associated words usually come
with rich sets of meta information, such as categories of documents and
semantic/syntactic features of words, like those encoded in word embeddings.
Incorporating such meta information directly into the generative process of
topic models can improve modelling accuracy and topic quality, especially in
the case where the word-occurrence information in the training data is
insufficient. In this paper, we present a topic model, called MetaLDA, which is
able to leverage either document or word meta information, or both of them
jointly. With two data augmentation techniques, we can derive an efficient
Gibbs sampling algorithm, which benefits from the fully local conjugacy of the
model. Moreover, the algorithm is favoured by the sparsity of the meta
information. Extensive experiments on several real world datasets demonstrate
that our model achieves comparable or improved performance in terms of both
perplexity and topic quality, particularly in handling sparse texts. In
addition, compared with other models using meta information, our model runs
significantly faster.
Comment: To appear in ICDM 201