19,478 research outputs found
Adversarial Sampling and Training for Semi-Supervised Information Retrieval
Ad-hoc retrieval models with implicit feedback often have problems, e.g., the
imbalanced classes in the data set. Too few clicked documents may hurt
generalization ability of the models, whereas too many non-clicked documents
may harm effectiveness of the models and efficiency of training. In addition,
recent neural network-based models are vulnerable to adversarial examples due
to the linear nature in them. To solve the problems at the same time, we
propose an adversarial sampling and training framework to learn ad-hoc
retrieval models with implicit feedback. Our key idea is (i) to augment clicked
examples by adversarial training for better generalization and (ii) to obtain
very informational non-clicked examples by adversarial sampling and training.
Experiments are performed on benchmark data sets for common ad-hoc retrieval
tasks such as Web search, item recommendation, and question answering.
Experimental results indicate that the proposed approaches significantly
outperform strong baselines especially for high-ranked documents, and they
outperform IRGAN in NDCG@5 using only 5% of labeled data for the Web search
task.Comment: Published in WWW 201
End-to-end Learning for Short Text Expansion
Effectively making sense of short texts is a critical task for many real
world applications such as search engines, social media services, and
recommender systems. The task is particularly challenging as a short text
contains very sparse information, often too sparse for a machine learning
algorithm to pick up useful signals. A common practice for analyzing short text
is to first expand it with external information, which is usually harvested
from a large collection of longer texts. In literature, short text expansion
has been done with all kinds of heuristics. We propose an end-to-end solution
that automatically learns how to expand short text to optimize a given learning
task. A novel deep memory network is proposed to automatically find relevant
information from a collection of longer documents and reformulate the short
text through a gating mechanism. Using short text classification as a
demonstrating task, we show that the deep memory network significantly
outperforms classical text expansion methods with comprehensive experiments on
real world data sets.Comment: KDD'201
Towards Question-based Recommender Systems
Conversational and question-based recommender systems have gained increasing
attention in recent years, with users enabled to converse with the system and
better control recommendations. Nevertheless, research in the field is still
limited, compared to traditional recommender systems. In this work, we propose
a novel Question-based recommendation method, Qrec, to assist users to find
items interactively, by answering automatically constructed and algorithmically
chosen questions. Previous conversational recommender systems ask users to
express their preferences over items or item facets. Our model, instead, asks
users to express their preferences over descriptive item features. The model is
first trained offline by a novel matrix factorization algorithm, and then
iteratively updates the user and item latent factors online by a closed-form
solution based on the user answers. Meanwhile, our model infers the underlying
user belief and preferences over items to learn an optimal question-asking
strategy by using Generalized Binary Search, so as to ask a sequence of
questions to the user. Our experimental results demonstrate that our proposed
matrix factorization model outperforms the traditional Probabilistic Matrix
Factorization model. Further, our proposed Qrec model can greatly improve the
performance of state-of-the-art baselines, and it is also effective in the case
of cold-start user and item recommendations.Comment: accepted by SIGIR 202
User Intent Prediction in Information-seeking Conversations
Conversational assistants are being progressively adopted by the general
population. However, they are not capable of handling complicated
information-seeking tasks that involve multiple turns of information exchange.
Due to the limited communication bandwidth in conversational search, it is
important for conversational assistants to accurately detect and predict user
intent in information-seeking conversations. In this paper, we investigate two
aspects of user intent prediction in an information-seeking setting. First, we
extract features based on the content, structural, and sentiment
characteristics of a given utterance, and use classic machine learning methods
to perform user intent prediction. We then conduct an in-depth feature
importance analysis to identify key features in this prediction task. We find
that structural features contribute most to the prediction performance. Given
this finding, we construct neural classifiers to incorporate context
information and achieve better performance without feature engineering. Our
findings can provide insights into the important factors and effective methods
of user intent prediction in information-seeking conversations.Comment: Accepted to CHIIR 201
Identifying Unclear Questions in Community Question Answering Websites
Thousands of complex natural language questions are submitted to community
question answering websites on a daily basis, rendering them as one of the most
important information sources these days. However, oftentimes submitted
questions are unclear and cannot be answered without further clarification
questions by expert community members. This study is the first to investigate
the complex task of classifying a question as clear or unclear, i.e., if it
requires further clarification. We construct a novel dataset and propose a
classification approach that is based on the notion of similar questions. This
approach is compared to state-of-the-art text classification baselines. Our
main finding is that the similar questions approach is a viable alternative
that can be used as a stepping stone towards the development of supportive user
interfaces for question formulation.Comment: Proceedings of the 41th European Conference on Information Retrieval
(ECIR '19), 201
- …