84 research outputs found
A Vertical PRF Architecture for Microblog Search
In microblog retrieval, query expansion can be essential to obtain good
search results due to the short size of queries and posts. Since information in
microblogs is highly dynamic, an up-to-date index coupled with pseudo-relevance
feedback (PRF) with an external corpus has a higher chance of retrieving more
relevant documents and improving ranking. In this paper, we focus on the
research question: how can we reduce the computational cost of query expansion
while maintaining the same retrieval precision as standard PRF? Therefore, we
propose to accelerate the query expansion step of pseudo-relevance feedback.
The hypothesis is that expanding the query with an expansion corpus organized
into verticals will lead to a more efficient query expansion process and
improved retrieval effectiveness. Thus, the proposed query expansion method
uses a distributed search architecture and resource selection algorithms to
provide an efficient query expansion process. Experiments on the TREC Microblog
datasets show that the proposed approach can match or outperform standard PRF
in MAP and NDCG@30, with a computational cost that is three orders of magnitude
lower.
Comment: To appear in ICTIR 201
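As a rough illustration of the idea in this abstract, the sketch below selects the most promising vertical for a query and then runs pseudo-relevance feedback only inside it. Everything here is an assumption made for illustration: the function names, the term-count scoring, and the toy resource selection rule are hypothetical and are not the authors' architecture or algorithms.

```python
from collections import Counter


def select_verticals(query_terms, verticals, n_select=1):
    """Toy resource selection (assumed): rank verticals by query-term occurrences."""
    def score(docs):
        return sum(doc.count(t) for doc in docs for t in query_terms)
    ranked = sorted(verticals, key=lambda name: (-score(verticals[name]), name))
    return ranked[:n_select]


def rank_docs(query_terms, docs):
    """Rank tokenized documents by how often they contain the query terms."""
    return sorted(docs, key=lambda d: -sum(d.count(t) for t in query_terms))


def expand_query(query_terms, ranked_docs, k_docs=2, n_terms=2):
    """Append the most frequent new terms found in the top-k feedback documents."""
    counts = Counter()
    for doc in ranked_docs[:k_docs]:
        counts.update(t for t in doc if t not in query_terms)
    # Sort by descending count, then alphabetically so ties are deterministic.
    best = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n_terms]
    return list(query_terms) + [t for t, _ in best]


def vertical_prf(query_terms, verticals, n_select=1, k_docs=2, n_terms=2):
    """Run feedback only over the selected verticals instead of the whole corpus."""
    chosen = select_verticals(query_terms, verticals, n_select)
    pool = [doc for name in chosen for doc in verticals[name]]
    return expand_query(query_terms, rank_docs(query_terms, pool), k_docs, n_terms)
```

Restricting feedback to a few selected verticals means the expansion step scores only a fraction of the expansion corpus, which is the kind of cost saving the abstract targets.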
Modeling Temporal Evidence from External Collections
Newsworthy events are broadcast through multiple mediums and prompt the
crowds to produce comments on social media. In this paper, we propose to
leverage these behavioral dynamics to estimate the most relevant time periods
for an event (i.e., query). Recent advances have shown how to improve the
estimation of the temporal relevance of such topics. In this approach, we build
on two major novelties. First, we mine temporal evidence from hundreds of
external sources into topic-based external collections to improve the
robustness of the detection of relevant time periods. Second, we propose a
formal retrieval model that generalizes the use of the temporal dimension
across different aspects of the retrieval process. In particular, we show that
temporal evidence of external collections can be used to (i) infer a topic's
temporal relevance, (ii) select the query expansion terms, and (iii) re-rank
the final results for improved precision. Experiments with TREC Microblog
collections show that the proposed time-aware retrieval model makes an
effective and extensive use of the temporal dimension to improve search results
over the most recent temporal models. Interestingly, we observe a strong
correlation between precision and the temporal distribution of retrieved and
relevant documents.
Comment: To appear in WSDM 201
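One way to picture use (iii) above, re-ranking with temporal evidence, is the toy sketch below. It assumes a smoothed histogram of external-document timestamps as the topic's temporal profile and a linear mix of text score and temporal prior; both choices, and all names, are hypothetical and do not reproduce the paper's formal retrieval model.

```python
from collections import Counter


def temporal_profile(external_days, all_days, smoothing=1.0):
    """Smoothed distribution over days, estimated from external-document timestamps."""
    counts = Counter(external_days)
    total = len(external_days) + smoothing * len(all_days)
    return {day: (counts[day] + smoothing) / total for day in all_days}


def rerank(results, profile, weight=0.5):
    """Re-rank (doc_id, text_score, day) tuples by mixing text score and temporal prior."""
    def combined(result):
        _, text_score, day = result
        return (1 - weight) * text_score + weight * profile.get(day, 0.0)
    return [r[0] for r in sorted(results, key=combined, reverse=True)]
```

With `weight=0`, the ranking falls back to the text scores alone, so the temporal signal can be tuned or switched off.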
Characterizing and Predicting Email Deferral Behavior
Email triage involves going through unhandled emails and deciding what to do
with them. This familiar process can become increasingly challenging as the
number of unhandled emails grows. During a triage session, users commonly defer
emails that they cannot deal with immediately. These deferred
emails are often related to tasks that are postponed until the user has more
time or the right information to deal with them. In this paper, through
qualitative interviews and a large-scale log analysis, we study when and what
enterprise email users tend to defer. We found that users are more likely to
defer emails when handling them involves replying, reading carefully, or
clicking on links and attachments. We also learned that the decision to defer
emails depends on many factors, such as the user's workload and the importance of
the sender. Our qualitative results suggest that deferring is very common,
and our quantitative log analysis confirms that 12% of triage sessions and 16%
of daily active users had at least one deferred email on weekdays. We also
discuss several deferral strategies such as marking emails as unread and
flagging that are reported by our interviewees, and illustrate how such
patterns can also be observed in user logs. Inspired by the characteristics of
deferred emails and contextual factors involved in deciding if an email should
be deferred, we train a classifier for predicting whether a recently triaged
email is actually deferred. Our experimental results suggest that deferral can
be classified with modest effectiveness. Overall, our work provides novel
insights about how users handle their emails and how deferral can be modeled.
"One-size-fits-all"? Observations and Expectations of NLG Systems Across Identity-Related Language Features
Fairness-related assumptions about what constitutes appropriate NLG system
behaviors range from invariance, where systems are expected to respond
identically to social groups, to adaptation, where responses should instead
vary across them. We design and conduct five case studies, in which we perturb
different types of identity-related language features (names, roles, locations,
dialect, and style) in NLG system inputs to illuminate tensions around
invariance and adaptation. We outline people's expectations of system
behaviors, and surface potential caveats of these two contrasting yet
commonly-held assumptions. We find that motivations for adaptation include
social norms, cultural differences, feature-specific information, and
accommodation; motivations for invariance include perspectives that favor
prescriptivism, view adaptation as unnecessary or too difficult for NLG systems
to do appropriately, and are wary of false assumptions. Our findings highlight
open challenges around defining what constitutes fair NLG system behavior.
Comment: 36 pages, 24 figure
PREME: Preference-based Meeting Exploration through an Interactive Questionnaire
The recent increase in the volume of online meetings necessitates automated
tools for managing and organizing the material, especially when an attendee has
missed the discussion and needs assistance in quickly exploring it. In this
work, we propose a novel end-to-end framework for generating interactive
questionnaires for preference-based meeting exploration. As a result, users are
supplied with a list of suggested questions reflecting their preferences. Since
the task is new, we introduce an automatic evaluation strategy. Namely, it
measures how answerable the generated questions are, to ensure factual
correctness, and how well they cover the source meeting, to gauge the depth of
possible exploration.