Digital Rule of Thumb: A Natural Experiment on Autocomplete in Search Engines
Search engines are an essential part of our lives. However, we do not fully understand what affects users' search inputs. One of the most notable features affecting search inputs is autocomplete, an intelligent agent that suggests queries while the user types. Understanding the impact of autocomplete helps eCommerce companies retain customers; however, examining its impact is difficult since all search engines have adopted it, and experiments are risky for firms. We overcome these challenges by leveraging a novel natural experiment at an eCommerce company. Our preliminary results suggest that the deactivation of autocomplete for the incorrect keyword led to a substantial drop in website visits in the PC channel compared to the mobile channel. In addition, website visits substantially shifted from the incorrect keyword to the correct keyword in the mobile channel but not in the PC channel. This short paper is expected to shed new light on our understanding of autocomplete's impact.
Efficient Neural Query Auto Completion
Query Auto Completion (QAC), as the starting point of information retrieval
tasks, is critical to user experience. Generally it has two steps: generating
completed query candidates according to query prefixes, and ranking them based
on extracted features. Three major challenges are observed for a query auto
completion system: (1) QAC has a strict online latency requirement. For each
keystroke, results must be returned within tens of milliseconds, which poses a
significant challenge in designing sophisticated language models for it. (2)
For unseen queries, generated candidates are of poor quality as contextual
information is not fully utilized. (3) Traditional QAC systems heavily rely on
handcrafted features such as the query candidate frequency in search logs,
lacking sufficient semantic understanding of the candidate.
In this paper, we propose an efficient neural QAC system with effective
context modeling to overcome these challenges. On the candidate generation
side, this system uses as much information as possible in unseen prefixes to
generate relevant candidates, increasing the recall by a large margin. On the
candidate ranking side, an unnormalized language model is proposed, which
effectively captures deep semantics of queries. This approach achieves better
ranking performance than state-of-the-art neural ranking methods and reduces
latency by 95% compared to neural language modeling methods. The empirical
results on public datasets show that our model achieves a good balance between
accuracy and efficiency. This system is served in LinkedIn job search with
significant product impact observed.
Comment: Accepted at CIKM 202
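The two-step structure described in this abstract (candidate generation from a prefix, then ranking on features) can be illustrated with a minimal sketch. This is not the paper's neural system: it uses the traditional frequency-in-search-logs feature that the abstract mentions as a baseline, and the function names and the toy `query_log` are hypothetical.

```python
from collections import Counter


def generate_candidates(prefix, query_log):
    """Step 1: collect logged queries that start with the typed prefix."""
    return [q for q in query_log if q.startswith(prefix)]


def rank_candidates(candidates, top_k=3):
    """Step 2: rank candidates by how often they occur in the log.

    Frequency is a handcrafted feature; ties are broken alphabetically.
    """
    counts = Counter(candidates)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [query for query, _ in ordered[:top_k]]


def autocomplete(prefix, query_log, top_k=3):
    """Run both steps for a single keystroke."""
    return rank_candidates(generate_candidates(prefix, query_log), top_k)


# Hypothetical search log, for illustration only.
query_log = [
    "software engineer",
    "software engineer",
    "software developer",
    "sales manager",
    "software engineer intern",
]

print(autocomplete("soft", query_log))
```

A prefix that never appears in the log yields no candidates at all in this sketch, which corresponds to the unseen-query problem the abstract lists as its second challenge.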
An Eye-tracking Study of User Interactions with Query Auto Completion
Query Auto Completion (QAC) suggests possible queries to web search users from the moment they start entering a query. This popular feature of web search engines is thought to reduce physical and cognitive effort when formulating a query. Perhaps surprisingly, despite QAC being widely used, users' interactions with it are poorly understood. This paper begins to address this gap. We present the results of an in-depth user study of user interactions with QAC in web search. While study participants completed web search tasks, we recorded their interactions using eye-tracking and client-side logging. This allows us to provide a first look at how users interact with QAC. We specifically focus on the effects of QAC ranking, by controlling the quality of the ranking in a within-subject design. We identify a strong position bias that is consistent across ranking conditions. Due to this strong position bias, ranking quality affects QAC usage. We also find an effect on task completion, in particular on the number of result pages visited. We show how these effects can be explained by a combination of searchers' behavior patterns, namely monitoring or ignoring QAC, and searching for spelling support or complete queries to express a search intent. We conclude the paper with a discussion of the important implications of our findings for QAC evaluation.
Neural Methods for Effective, Efficient, and Exposure-Aware Information Retrieval
Neural networks with deep architectures have demonstrated significant
performance improvements in computer vision, speech recognition, and natural
language processing. The challenges in information retrieval (IR), however, are
different from these other application areas. A common form of IR involves
ranking of documents--or short passages--in response to keyword-based queries.
Effective IR systems must deal with the query-document vocabulary mismatch problem,
by modeling relationships between different query and document terms and how
they indicate relevance. Models should also consider lexical matches when the
query contains rare terms--such as a person's name or a product model
number--not seen during training, and should avoid retrieving semantically related
but irrelevant results. In many real-life IR tasks, the retrieval involves
extremely large collections--such as the document index of a commercial Web
search engine--containing billions of documents. Efficient IR methods should
take advantage of specialized IR data structures, such as the inverted index, to
efficiently retrieve from large collections. Given an information need, the IR
system also mediates how much exposure an information artifact receives by
deciding whether it should be displayed, and where it should be positioned,
among other results. Exposure-aware IR systems may optimize for additional
objectives, besides relevance, such as parity of exposure for retrieved items
and content publishers. In this thesis, we present novel neural architectures
and methods motivated by the specific needs and challenges of IR tasks.
Comment: PhD thesis, Univ College London (2020)
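For readers unfamiliar with the inverted index mentioned in this abstract, the following is a minimal sketch of the data structure, assuming whitespace tokenization and exact term matching. It does not reflect the thesis's neural methods; the function names and the toy `docs` collection are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def retrieve(index, query):
    """Return the ids of documents that contain every query term."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    if not postings:
        return set()
    return set.intersection(*postings)


# Hypothetical three-document collection, for illustration only.
docs = {
    1: "neural ranking models",
    2: "inverted index for ranking",
    3: "neural index compression",
}
index = build_inverted_index(docs)

print(retrieve(index, "neural ranking"))
```

Because lookup touches only the posting sets of the query terms, its cost depends on how many documents contain those terms rather than on the size of the whole collection, which is the property that makes the structure attractive for very large indexes.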
EYE-AS-AN-INPUT FOR IMPROVING INTERACTIVE INFORMATION RETRIEVAL
In this work, Publication Access Through Tiered Interaction and Exploration (PATTIE) is presented with the eye as an additional input modality. PATTIE is built upon the scatter/gather information retrieval paradigm, where users can iteratively explore a visual and interactive table-of-contents metaphor for large-scale document collections. Additionally, the prototype has been integrated with eye-tracking through the web camera, and experimental findings are provided to demonstrate a proof-of-concept for interest modeling at the term level and implicit relevance feedback on the gold standard inaugural 2019 Text REtrieval Conference Precision Medicine dataset (TREC PM). Low error rates for gaze tracking and acceptable performance on binary classification of interest are reported, as well as statistically significant increases in precision and recall performance for relevant information on a TREC PM task when PATTIE is used with eye-as-an-input versus a baseline PATTIE system.
Doctor of Philosophy
Users, Queries, and Bad Abandonment in Web Search
After a user submits a query and receives a list of search results, the user may abandon their query without clicking on any of the search results. A bad query abandonment is when a searcher abandons the SERP because they were dissatisfied with the quality of the search results, often making the user reformulate their query in the hope of receiving better search results. As we move closer to understanding when and what causes a user to abandon their query under different qualities of search results, we move forward in an overall understanding of user behavior with search engines. In this thesis, we describe three user studies to investigate bad query abandonment.
First, we report on a study to investigate the rate and time at which users abandon their queries at different levels of search quality. We had users search for answers to questions, but showed users manipulated SERPs that contain one relevant document placed at different ranks. We show that as the quality of search results decreases, the probability of abandonment increases, and that users quickly decide to abandon their queries. Users make their decisions fast, but not all users are the same. We show that there appear to be two types of users that behave differently: one group is more likely to abandon their query and is quicker in finding answers than the group less likely to abandon their query.
Second, we describe an eye-tracking experiment that focuses on understanding possible causes of users' willingness to examine SERPs and what motivates users to continue or discontinue their examination. Using eye-tracking data, we found that a user deciding to abandon a query is best understood by the user's examination pattern not including a relevant search result. If a user sees a relevant result, they are very likely to click it. However, users' examination of results differs and may be influenced by other factors. The key factors we found are the rank of search results, the user type, and the query quality. For example, we show that regardless of where the relevant document is placed in the SERP, the type of query submitted affects examination, and if a user enters an ambiguous query, they are likely to examine fewer results.
Third, we show how the nature of non-relevant material affects users' willingness to further explore a ranked list of search results. We constructed and showed participants manipulated SERPs with different types of non-relevant documents. We found that user examination of search results and time to query abandonment are influenced by the coherence and type of non-relevant documents included in the SERP. For SERPs coherent on off-topic results, users spend the least amount of time before abandoning and are less likely to request to view more results. The time they spend increases as the SERP quality improves, and users are more likely to request to view more results when the SERP contains diversified non-relevant results on multiple subtopics.