Auditing Search Engines for Differential Satisfaction Across Demographics
Many online services, such as search engines, social media platforms, and
digital marketplaces, are advertised as being available to any user, regardless
of their age, gender, or other demographic factors. However, there are growing
concerns that these services may systematically underserve some groups of
users. In this paper, we present a framework for internally auditing such
services for differences in user satisfaction across demographic groups, using
search engines as a case study. We first explain the pitfalls of naively
comparing the behavioral metrics that are commonly used to evaluate search
engines. We then propose three methods for measuring latent differences in user
satisfaction from observed differences in evaluation metrics. To develop these
methods, we drew on ideas from the causal inference literature and the
multilevel modeling literature. Our framework is broadly applicable to other
online services, and provides general insight into interpreting their
evaluation metrics.
Comment: 8 pages. Accepted at WWW 201
An interactive two-dimensional approach to query aspects rewriting in systematic reviews. IMS unipd at CLEF eHealth task 2
Sequence to Sequence Learning for Query Expansion
Using sequence-to-sequence algorithms for query expansion has not yet been
explored in the Information Retrieval literature, nor in the Question Answering
literature. We tried to fill this gap with a custom Query Expansion engine
trained and tested on open datasets. Starting from open datasets, we
built a Query Expansion training set using sentence-embedding-based Keyword
Extraction. We then assessed the ability of sequence-to-sequence
neural networks to capture expansion relations in the word embedding space.
Comment: 8 pages, 2 figures, AAAI-19 Student Abstract and Poster Progra
Utilizing sub-topical structure of documents for information retrieval.
Text segmentation in natural language processing typically refers to the process of decomposing a document into constituent subtopics. Our work centers on the application of text segmentation techniques within information retrieval (IR) tasks. Examples include scoring a document by combining the retrieval scores of its constituent segments, exploiting the proximity of query terms in documents for ad hoc search, and question answering (QA), where retrieved passages from multiple documents are aggregated and presented as a single document to a searcher. Feedback in ad hoc IR tasks is shown to benefit from the use of extracted sentences, instead of terms, from the pseudo-relevant documents for query expansion. Retrieval effectiveness for the patent prior art search task is enhanced by applying text segmentation to the patent queries. Another aspect of our work involves augmenting text segmentation techniques to produce segments that are more readable, with fewer unresolved anaphora. This is particularly useful for QA and snippet generation tasks, where the objective is to aggregate relevant and novel information from multiple documents to satisfy the user's information need on the one hand, and to ensure that the automatically generated content presented to the user is easily readable without reference to the original source document on the other.
Learning to Attend, Copy, and Generate for Session-Based Query Suggestion
Users try to articulate their complex information needs during search
sessions by reformulating their queries. To make this process more effective,
search engines provide related queries to help users in specifying the
information need in their search process. In this paper, we propose a
customized sequence-to-sequence model for session-based query suggestion. In
our model, we employ a query-aware attention mechanism to capture the structure
of the session context. This enables us to control the scope of the session from
which we infer the suggested next query, which helps not only handle the noisy
data but also automatically detect session boundaries. Furthermore, we observe
that, based on the user query reformulation behavior, within a single session a
large portion of query terms is retained from the previously submitted queries
and consists of mostly infrequent or unseen terms that are usually not included
in the vocabulary. We therefore empower the decoder of our model to access the
source words from the session context during decoding by incorporating a copy
mechanism. Moreover, we propose evaluation metrics to assess the quality of the
generative models for query suggestion. We conduct an extensive set of
experiments and analysis. The results suggest that our model outperforms the
baselines both in generating queries and in scoring candidate queries
for the task of query suggestion.
Comment: Accepted to be published at The 26th ACM International Conference on
Information and Knowledge Management (CIKM2017)
A Hierarchical Recurrent Encoder-Decoder For Generative Context-Aware Query Suggestion
Users may strive to formulate an adequate textual query for their information
need. Search engines assist the users by presenting query suggestions. To
preserve the original search intent, suggestions should be context-aware and
account for the previous queries issued by the user. Achieving context
awareness is challenging due to data sparsity. We present a probabilistic
suggestion model that is able to account for sequences of previous queries of
arbitrary lengths. Our novel hierarchical recurrent encoder-decoder
architecture allows the model to be sensitive to the order of queries in the
context while avoiding data sparsity. Additionally, our model can produce suggestions for
rare, or long-tail, queries. The produced suggestions are synthetic and are
sampled one word at a time, using computationally cheap decoding techniques.
This is in contrast to current synthetic suggestion models relying upon machine
learning pipelines and hand-engineered feature sets. Results show that our model
outperforms existing context-aware approaches in a next query prediction
setting. In addition to query suggestion, our model is general enough to be
used in a variety of other applications.
Comment: To appear in Conference of Information Knowledge and Management
(CIKM) 201