2,397 research outputs found
Hashing based Answer Selection
Answer selection is an important subtask of question answering (QA), where
deep models usually achieve better performance. Most deep models adopt
question-answer interaction mechanisms, such as attention, to get vector
representations for answers. When these interaction based deep models are
deployed for online prediction, the representations of all answers need to be
recalculated for each question. This procedure is time-consuming for deep
models with complex encoders like BERT which usually have better accuracy than
simple encoders. One possible solution is to store the matrix representation
(encoder output) of each answer in memory to avoid recalculation. But this will
bring large memory cost. In this paper, we propose a novel method, called
hashing based answer selection (HAS), to tackle this problem. HAS adopts a
hashing strategy to learn a binary matrix representation for each answer, which
can dramatically reduce the memory cost for storing the matrix representations
of answers. Hence, HAS can adopt complex encoders like BERT in the model, but
the online prediction of HAS is still fast with a low memory cost. Experimental
results on three popular answer selection datasets show that HAS can outperform
existing models to achieve state-of-the-art performance
Content Based Document Recommender using Deep Learning
With the recent advancements in information technology there has been a huge
surge in amount of data available. But information retrieval technology has not
been able to keep up with this pace of information generation resulting in over
spending of time for retrieving relevant information. Even though systems exist
for assisting users to search a database along with filtering and recommending
relevant information, but recommendation system which uses content of documents
for recommendation still have a long way to mature. Here we present a Deep
Learning based supervised approach to recommend similar documents based on the
similarity of content. We combine the C-DSSM model with Word2Vec distributed
representations of words to create a novel model to classify a document pair as
relevant/irrelavant by assigning a score to it. Using our model retrieval of
documents can be done in O(1) time and the memory complexity is O(n), where n
is number of documents.Comment: Accepted in ICICI 2017, Coimbatore, Indi
Adversarial Sampling and Training for Semi-Supervised Information Retrieval
Ad-hoc retrieval models with implicit feedback often have problems, e.g., the
imbalanced classes in the data set. Too few clicked documents may hurt
generalization ability of the models, whereas too many non-clicked documents
may harm effectiveness of the models and efficiency of training. In addition,
recent neural network-based models are vulnerable to adversarial examples due
to the linear nature in them. To solve the problems at the same time, we
propose an adversarial sampling and training framework to learn ad-hoc
retrieval models with implicit feedback. Our key idea is (i) to augment clicked
examples by adversarial training for better generalization and (ii) to obtain
very informational non-clicked examples by adversarial sampling and training.
Experiments are performed on benchmark data sets for common ad-hoc retrieval
tasks such as Web search, item recommendation, and question answering.
Experimental results indicate that the proposed approaches significantly
outperform strong baselines especially for high-ranked documents, and they
outperform IRGAN in NDCG@5 using only 5% of labeled data for the Web search
task.Comment: Published in WWW 201
- …