66,424 research outputs found
Adversarial Sampling and Training for Semi-Supervised Information Retrieval
Ad-hoc retrieval models with implicit feedback often have problems, e.g., the
imbalanced classes in the data set. Too few clicked documents may hurt
generalization ability of the models, whereas too many non-clicked documents
may harm effectiveness of the models and efficiency of training. In addition,
recent neural network-based models are vulnerable to adversarial examples due
to their linear nature. To address these problems simultaneously, we
propose an adversarial sampling and training framework to learn ad-hoc
retrieval models with implicit feedback. Our key idea is (i) to augment clicked
examples by adversarial training for better generalization and (ii) to obtain
highly informative non-clicked examples by adversarial sampling and training.
Experiments are performed on benchmark data sets for common ad-hoc retrieval
tasks such as Web search, item recommendation, and question answering.
Experimental results indicate that the proposed approaches significantly
outperform strong baselines especially for high-ranked documents, and they
outperform IRGAN in NDCG@5 using only 5% of labeled data for the Web search
task.
Comment: Published in WWW 201
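For readers unfamiliar with the NDCG@5 metric cited in the abstract above, the following is a minimal sketch of NDCG@k. The gain `2^rel - 1` and the `log2` position discount are one common convention and are an assumption here, since the abstract does not say which variant the authors used; the function names are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k results.

    `relevances` lists graded relevance labels in ranked order.
    Assumed convention: gain 2^rel - 1, discount log2(position + 1).
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG@k normalized by the DCG@k of the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that places the only clicked document first scores 1.0; moving that document lower reduces the score, which is why the metric is sensitive to high-ranked documents.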
Exploration vs. Exploitation in the Information Filtering Problem
We consider information filtering, in which we face a stream of items too
voluminous to process by hand (e.g., scientific articles, blog posts, emails),
and must rely on a computer system to automatically filter out irrelevant
items. Such systems face the exploration vs. exploitation tradeoff, in which it
may be beneficial to present an item despite a low probability of relevance,
just to learn about future items with similar content. We present a Bayesian
sequential decision-making model of this problem, show how it may be solved to
optimality using a decomposition to a collection of two-armed bandit problems,
and show structural results for the optimal policy. We show that the resulting
method is especially useful when facing the cold start problem, i.e., when
filtering items for new users without a long history of past interactions. We
then present an application of this information filtering method to a
historical dataset from the arXiv.org repository of scientific articles.
Comment: 36 pages, 5 figures
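The exploration vs. exploitation tradeoff described above can be illustrated with a small sketch. It is not the optimal policy from the paper: it keeps a Beta-Bernoulli posterior over the relevance probability of one item category and forwards an item when the posterior mean plus an uncertainty bonus exceeds a cost threshold. The class name, the `cost` and `bonus_weight` parameters, and the bonus heuristic itself are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class CategoryBelief:
    """Beta(alpha, beta) posterior over P(item in this category is relevant)."""

    alpha: float = 1.0  # prior pseudo-count of relevant items (assumed uniform prior)
    beta: float = 1.0   # prior pseudo-count of irrelevant items

    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    def std(self):
        n = self.alpha + self.beta
        return math.sqrt(self.alpha * self.beta / (n * n * (n + 1)))

    def update(self, relevant):
        """Conjugate update after observing user feedback on a forwarded item."""
        if relevant:
            self.alpha += 1
        else:
            self.beta += 1


def should_forward(belief, cost, bonus_weight=1.0):
    """Heuristic rule: forward if optimistic relevance estimate exceeds cost.

    With bonus_weight=0 this is pure exploitation; a positive weight
    forwards uncertain items so the system can learn about the category.
    """
    return belief.mean() + bonus_weight * belief.std() > cost
```

For a new user with only the prior, the uncertainty bonus is large, so items are forwarded even when the expected relevance is below the cost; as feedback accumulates, the bonus shrinks and the rule approaches pure exploitation.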
Intent-Aware Contextual Recommendation System
Recommender systems take inputs from user history, use an internal ranking
algorithm to generate results and possibly optimize this ranking based on
feedback. However, often the recommender system is unaware of the actual intent
of the user and simply provides recommendations dynamically without properly
understanding the thought process of the user. An intelligent recommender
system is not only useful for the user but also for businesses which want to
learn the tendencies of their users. Finding out tendencies or intents of a
user is a difficult problem to solve.
Keeping this in mind, we set out to create an intelligent system which
will keep track of the user's activity on a web-application as well as
determine the intent of the user in each session. We devised a way to encode
the user's activity through the sessions. We then represent the
information seen by the user in a high dimensional format which is reduced to
lower dimensions using tensor factorization techniques. The aspect of intent
awareness (or scoring) is dealt with at this stage. Finally, combining the user
activity data with the contextual information gives the recommendation score.
The final recommendations are then ranked using filtering and collaborative
recommendation techniques to show the top-k recommendations to the user. A
provision for feedback is also envisioned in the current system which informs
the model to update the various weights in the recommender system. Our overall
model aims to combine both frequency-based and context-based recommendation
systems and quantify the intent of a user to provide better recommendations.
We ran experiments on real-world timestamped user activity data, in the
setting of recommending reports to the users of a business analytics tool and
the results are better than the baselines. We also tuned certain aspects of our
model to arrive at optimized results.
Comment: Presented at the 5th International Workshop on Data Science and Big
Data Analytics (DSBDA), 17th IEEE International Conference on Data Mining
(ICDM) 2017; 8 pages; 4 figures; Due to the limitation "The abstract field
cannot be longer than 1,920 characters," the abstract appearing here is
slightly shorter than the one in the PDF file
Neural Collaborative Ranking
Recommender systems are aimed at generating a personalized ranked list of
items that an end user might be interested in. With the unprecedented success
of deep learning in computer vision and speech recognition, bridging the gap
between recommender systems and deep neural networks has recently become a hot
topic, and deep learning methods have been shown to achieve state-of-the-art
results on many recommendation tasks. For example, a recent model, NeuMF, first
projects users and items into some shared low-dimensional latent feature space,
and then employs neural nets to model the interaction between the user and item
latent features to obtain state-of-the-art performance on the recommendation
tasks. NeuMF assumes that the non-interacted items are inherently negative and
uses negative sampling to relax this assumption. In this paper, we examine an
alternative approach which does not assume that the non-interacted items are
necessarily negative, just that they are less preferred than interacted items.
Specifically, we develop a new classification strategy based on the widely used
pairwise ranking assumption. We combine our classification strategy with the
recently proposed neural collaborative filtering framework, and propose a
general collaborative ranking framework called Neural Network based
Collaborative Ranking (NCR). We resort to a neural network architecture to
model a user's pairwise preference between items, with the belief that the neural
network will effectively capture the latent structure of latent factors. The
experimental results on two real-world datasets show the superior performance
of our models in comparison with several state-of-the-art approaches.
Comment: Proceedings of the 2018 ACM on Conference on Information and
Knowledge Management
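The pairwise ranking assumption in the abstract above (non-interacted items are less preferred than interacted ones, not necessarily negative) can be sketched as follows. This is a minimal illustration, not the NCR architecture: it builds (user, preferred item, less-preferred item) triples and scores a pair with a logistic loss on the score difference. The helper names, the sampling scheme, and the specific loss are assumptions, since the abstract does not give the exact formulation.

```python
import math
import random


def build_pairs(interactions, all_items, negatives_per_positive=1, seed=0):
    """Build (user, interacted_item, non_interacted_item) training triples.

    `interactions` maps each user to the set of items they interacted with.
    The sampled item is treated only as less preferred, not as a true negative.
    """
    rng = random.Random(seed)
    triples = []
    for user, positives in interactions.items():
        candidates = sorted(set(all_items) - set(positives))
        if not candidates:
            continue  # user interacted with everything; no pair can be formed
        for pos in sorted(positives):
            for _ in range(negatives_per_positive):
                triples.append((user, pos, rng.choice(candidates)))
    return triples


def pairwise_logistic_loss(score_pos, score_neg):
    """-log(sigmoid(score_pos - score_neg)), computed in a numerically stable way."""
    diff = score_pos - score_neg
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
```

In a full model the two scores would come from a neural network applied to user and item latent features; here they are plain numbers so that the loss can be inspected in isolation.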
On the Impact of Entity Linking in Microblog Real-Time Filtering
Microblogging is a model of content sharing in which the temporal locality of
posts with respect to important events, either of foreseeable or unforeseeable
nature, makes applications of real-time filtering of great practical
interest. We propose the use of Entity Linking (EL) in order to improve the
retrieval effectiveness, by enriching the representation of microblog posts and
filtering queries. EL is the process of recognizing in an unstructured text the
mention of relevant entities described in a knowledge base. EL of short pieces
of text is a difficult task, but it is also a scenario in which the information
EL adds to the text can have a substantial impact on the retrieval process. We
implement a state-of-the-art filtering method, based on the best systems from
the TREC Microblog track realtime adhoc retrieval and filtering tasks, and
extend it with a Wikipedia-based EL method. Results show that the use of EL
significantly improves over non-EL based versions of the filtering methods.
Comment: 6 pages, 1 figure, 1 table. SAC 2015, Salamanca, Spain - April 13 -
17, 2015
Recommended from our members
Beyond TREC's filtering track
Following the withdrawal of the filtering track from the latest TREC conferences, there is a niche for new evaluation standards. Towards this end, we suggest, based on variations of TREC's routing subtask, two new evaluation methodologies. The first can be used for evaluating single, multi-topic profiles and the second for testing the ability of a multi-topic profile to adapt to both modest variations and radical drifts in user interests