24,580 research outputs found
The application of user log for online business environment using content-based Image retrieval system
Over the past few years, inter-query learning has gained much attention in the research and development of content-based image retrieval (CBIR) systems. This is largely due to the capability of inter-query approach to enable learning from the retrieval patterns of previous query sessions. However, much of the research works in this field have been focusing on analyzing image retrieval patterns stored in the database. This is not suitable for a dynamic environment such as the World Wide Web (WWW) where images are constantly added or removed. A better alternative is to use an image's visual features to capture the knowledge gained from the previous query sessions. Based on the previous work (Chung et al., 2006), the aim of this paper is to propose a framework of inter-query learning for the WWW-CBIR systems. Such framework can be extremely useful for those online companies whose core business involves providing multimedia content-based services and products to their customers
Neural Vector Spaces for Unsupervised Information Retrieval
We propose the Neural Vector Space Model (NVSM), a method that learns
representations of documents in an unsupervised manner for news article
retrieval. In the NVSM paradigm, we learn low-dimensional representations of
words and documents from scratch using gradient descent and rank documents
according to their similarity with query representations that are composed from
word representations. We show that NVSM performs better at document ranking
than existing latent semantic vector space methods. The addition of NVSM to a
mixture of lexical language models and a state-of-the-art baseline vector space
model yields a statistically significant increase in retrieval effectiveness.
Consequently, NVSM adds a complementary relevance signal. Next to semantic
matching, we find that NVSM performs well in cases where lexical matching is
needed.
NVSM learns a notion of term specificity directly from the document
collection without feature engineering. We also show that NVSM learns
regularities related to Luhn significance. Finally, we give advice on how to
deploy NVSM in situations where model selection (e.g., cross-validation) is
infeasible. We find that an unsupervised ensemble of multiple models trained
with different hyperparameter values performs better than a single
cross-validated model. Therefore, NVSM can safely be used for ranking documents
without supervised relevance judgments.Comment: TOIS 201
- …