Deep Character-Level Click-Through Rate Prediction for Sponsored Search
Predicting the click-through rate of an advertisement is a critical component
of online advertising platforms. In sponsored search, the click-through rate
estimates the probability that a displayed advertisement is clicked by a user
after she submits a query to the search engine. Commercial search engines
typically rely on machine learning models trained with a large number of
features to make such predictions. This inevitably requires a lot of
engineering efforts to define, compute, and select the appropriate features. In
this paper, we propose two novel approaches (one working at character level and
the other working at word level) that use deep convolutional neural networks to
predict the click-through rate of a query-advertisement pair. Specifically, the
proposed architectures only consider the textual content appearing in a
query-advertisement pair as input, and produce as output a click-through rate
prediction. By comparing the character-level model with the word-level model,
we show that language representation can be learnt from scratch at character
level when trained on enough data. Through extensive experiments using billions
of query-advertisement pairs of a popular commercial search engine, we
demonstrate that both approaches significantly outperform a baseline model
built on well-selected text features and a state-of-the-art word2vec-based
approach. Finally, by combining the predictions of the deep models introduced
in this study with the prediction of the model in production of the same
commercial search engine, we significantly improve the accuracy and the
calibration of the click-through rate prediction of the production system.
Comment: SIGIR2017, 10 pages
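To make the idea of character-level input concrete, the following is a minimal sketch that one-hot encodes the text of a query-advertisement pair over a fixed alphabet. The alphabet, the maximum length of 16, and the function names `one_hot_encode` and `encode_pair` are assumptions chosen for illustration; the abstract does not state the encoding the authors actually use.

```python
# Hypothetical alphabet and length; the paper's actual choices are not given here.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_encode(text, max_len=16):
    """Return a max_len x len(ALPHABET) matrix of 0/1 values for `text`.

    Text is lowercased and truncated to max_len characters. Characters
    outside the alphabet and padding positions stay as all-zero rows.
    """
    matrix = [[0] * len(ALPHABET) for _ in range(max_len)]
    for pos, ch in enumerate(text.lower()[:max_len]):
        idx = CHAR_INDEX.get(ch)
        if idx is not None:
            matrix[pos][idx] = 1
    return matrix


def encode_pair(query, ad_text, max_len=16):
    """Encode a query-advertisement pair as two character-level matrices."""
    return one_hot_encode(query, max_len), one_hot_encode(ad_text, max_len)


query_matrix, ad_matrix = encode_pair("cheap flights", "Book Flights Today!")
```

Matrices of this shape are the kind of input a convolutional network could consume directly, which is what removes the need for hand-engineered text features.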
How much should you ask? On the question structure in QA systems
Datasets that boosted state-of-the-art solutions for Question Answering (QA)
systems prove that it is possible to ask questions in a natural language manner.
However, users are still used to query-like systems where they type in keywords
to search for an answer. In this study we examine which parts of questions are
essential for obtaining a valid answer. To that end, we take
advantage of LIME, a framework that explains predictions by local
approximation. We find that grammar and natural language are disregarded by QA
models. A state-of-the-art model can answer properly even if 'asked' only with a
few words with high coefficients calculated with LIME. To our knowledge,
this is the first time that a QA model has been explained with LIME.
Comment: Accepted to Analyzing and interpreting neural networks for NLP
workshop at EMNLP 201
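The perturbation idea behind this kind of explanation can be sketched as follows. This is a simplified stand-in, not LIME itself: instead of fitting a weighted local linear model, it compares the mean model score when a word is kept against the mean score when it is dropped. The black-box function `toy_qa_confidence` and its two keywords are hypothetical and only serve to make the example runnable.

```python
import random


def toy_qa_confidence(words):
    """Hypothetical black-box QA model: confidence depends on two keywords."""
    keywords = {"capital", "france"}
    return len(keywords & set(words)) / len(keywords)


def word_importance(question, model, num_samples=500, seed=0):
    """Estimate per-word importance from random word-dropping perturbations.

    For each word, compare the mean model score over samples where the word
    is kept against the mean score over samples where it is dropped.
    """
    rng = random.Random(seed)
    words = question.split()
    kept_scores = [[] for _ in words]
    dropped_scores = [[] for _ in words]
    for _ in range(num_samples):
        # Keep each word independently with probability 0.5.
        mask = [rng.random() < 0.5 for _ in words]
        score = model([w for w, keep in zip(words, mask) if keep])
        for i, keep in enumerate(mask):
            (kept_scores[i] if keep else dropped_scores[i]).append(score)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    # Keys are (position, word) so repeated words stay distinct.
    return {
        (i, w): mean(kept_scores[i]) - mean(dropped_scores[i])
        for i, w in enumerate(words)
    }


importance = word_importance("what is the capital city of france", toy_qa_confidence)
ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
top_words = [key[1] for key, _ in ranked[:2]]
```

With this toy model, the two highest-scoring words are the keywords the model relies on, mirroring the finding that a few high-coefficient words can be enough to obtain an answer.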
Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search
Cross Language Information Retrieval (CLIR) systems are a valuable tool to
enable speakers of one language to search for content of interest expressed in
a different language. A group for whom this is of particular interest is
bilingual Arabic speakers who wish to search for English language content using
information needs expressed in Arabic queries. A key challenge in CLIR is
crossing the language barrier between the query and the documents. The most
common approach to bridging this gap is automated query translation, which can
be unreliable for vague or short queries. In this work, we examine the
potential for improving CLIR effectiveness by predicting the translation
effectiveness using Query Performance Prediction (QPP) techniques. We propose a
novel QPP method to estimate the quality of translation for an Arabic-English
Cross-lingual User-generated Speech Search (CLUGS) task. We present an
empirical evaluation that demonstrates the quality of our method on alternative
translation outputs extracted from an Arabic-to-English Machine Translation
system developed for this task. Finally, we show how this framework can be
integrated in CLUGS to find relevant translations for improved retrieval
performance.
Semantic Structure based Query Graph Prediction for Question Answering over Knowledge Graph
Building query graphs from questions is an important step in complex question answering over knowledge graphs (Complex KGQA). In general, a question can be correctly answered if its query graph is built correctly and the right answer is then retrieved by issuing the query graph against the KG. Therefore, this paper focuses on query graph generation from natural language questions. Existing approaches for query graph generation ignore the semantic structure of a question, resulting in a large number of noisy query graph candidates that undermine prediction accuracy. In this paper, we define six semantic structures from common questions in KGQA and develop a novel Structure-BERT to predict the semantic structure of a question. Candidates that do not match the predicted structure are filtered out, and the remaining candidates are then ranked with a BERT-based ranking model. Extensive experiments on two popular benchmarks, MetaQA and WebQuestionsSP, demonstrate the effectiveness of our method compared with state-of-the-art methods.
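The two-stage idea of narrowing candidates by predicted structure and then ranking what remains can be sketched as below. The filtering step is inferred from the abstract's reference to "remaining candidates". The `Candidate` fields, the structure labels, and the scores are hypothetical placeholders for the outputs of a structure predictor and a ranking model; no actual BERT model is involved.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    query_graph: str  # textual stand-in for a query graph
    structure: str    # hypothetical semantic structure label
    score: float      # stand-in for a ranking model's score


def filter_then_rank(candidates, predicted_structure):
    """Keep candidates matching the predicted structure, best score first."""
    remaining = [c for c in candidates if c.structure == predicted_structure]
    return sorted(remaining, key=lambda c: c.score, reverse=True)


candidates = [
    Candidate("film -directed_by-> ?x", "one_hop", 0.40),
    Candidate("film -starred-> ?y -born_in-> ?x", "two_hop", 0.90),
    Candidate("film -written_by-> ?x", "one_hop", 0.75),
]
ranked = filter_then_rank(candidates, predicted_structure="one_hop")
```

Filtering first means the ranker only has to separate candidates that already share the right shape, which is the motivation the abstract gives for predicting semantic structure.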
Learning Interpretable Queries for Explainable Image Classification with Information Pursuit
Information Pursuit (IP) is an explainable prediction algorithm that greedily
selects a sequence of interpretable queries about the data in order of
information gain, updating its posterior at each step based on observed
query-answer pairs. The standard paradigm uses hand-crafted dictionaries of
potential data queries curated by a domain expert or a large language model
after a human prompt. However, in practice, hand-crafted dictionaries are
limited by the expertise of the curator and the heuristics of prompt
engineering. This paper introduces a novel approach: learning a dictionary of
interpretable queries directly from the dataset. Our query dictionary learning
problem is formulated as an optimization problem by augmenting IP's variational
formulation with learnable dictionary parameters. To formulate learnable and
interpretable queries, we leverage the latent space of large vision and
language models like CLIP. To solve the optimization problem, we propose a new
query dictionary learning algorithm inspired by classical sparse dictionary
learning. Our experiments demonstrate that learned dictionaries significantly
outperform hand-crafted dictionaries generated with large language models.
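The greedy loop that Information Pursuit performs (pick the most informative query, observe its answer, update the posterior) can be sketched on a toy problem. This is a minimal illustration under strong assumptions: the classes, the yes/no queries, and the likelihood table are hypothetical and hand-specified, and answers are assumed conditionally independent given the class. It does not implement the variational formulation or the learned dictionary described in the abstract.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a {class: probability} distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def update(posterior, p_yes, answer):
    """Bayes update of the class posterior after observing a yes/no answer."""
    unnorm = {
        c: posterior[c] * (p_yes[c] if answer else 1 - p_yes[c])
        for c in posterior
    }
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}


def information_gain(posterior, p_yes):
    """Expected reduction in class entropy from asking one query."""
    prob_yes = sum(posterior[c] * p_yes[c] for c in posterior)
    expected = 0.0
    for answer, prob in ((True, prob_yes), (False, 1 - prob_yes)):
        if prob > 0:
            expected += prob * entropy(update(posterior, p_yes, answer))
    return entropy(posterior) - expected


def information_pursuit(prior, queries, answer_fn, max_queries=3):
    """Greedily ask the most informative remaining query and update."""
    posterior, asked = dict(prior), []
    remaining = dict(queries)
    while remaining and len(asked) < max_queries:
        best = max(remaining, key=lambda q: information_gain(posterior, remaining[q]))
        answer = answer_fn(best)
        posterior = update(posterior, remaining.pop(best), answer)
        asked.append((best, answer))
    return posterior, asked


# Hypothetical query dictionary: P(answer is yes | class) for each query.
queries = {
    "has_fur": {"cat": 0.95, "dog": 0.95, "bird": 0.05},
    "barks": {"cat": 0.05, "dog": 0.95, "bird": 0.05},
    "flies": {"cat": 0.05, "dog": 0.05, "bird": 0.90},
}
prior = {"cat": 1 / 3, "dog": 1 / 3, "bird": 1 / 3}
observed = {"has_fur": True, "barks": True, "flies": False}  # answers for a dog
posterior, asked = information_pursuit(prior, queries, observed.get)
```

The sequence in `asked` is the explanation: each entry is a human-readable query and its answer, in the order the algorithm found them most informative.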