Multiple Models for Recommending Temporal Aspects of Entities
Entity aspect recommendation is an emerging task in semantic search that
helps users discover serendipitous and prominent information with respect to an
entity, of which salience (e.g., popularity) is the most important factor in
previous work. However, entity aspects are temporally dynamic and often driven
by events happening over time. For such cases, aspect suggestion based solely
on salience features can give unsatisfactory results, for two reasons. First,
salience is often accumulated over a long time period and does not account for
recency. Second, many aspects related to an event entity are strongly
time-dependent. In this paper, we study the task of temporal aspect
recommendation for a given entity, which aims at recommending the most relevant
aspects and takes into account time in order to improve search experience. We
propose a novel event-centric ensemble ranking method that learns from multiple
time and type-dependent models and dynamically trades off salience and recency
characteristics. Through extensive experiments on real-world query logs, we
demonstrate that our method is robust and achieves better effectiveness than
competitive baselines. Comment: In proceedings of the 15th Extended Semantic Web Conference (ESWC
2018)
Efficient Neural Query Auto Completion
Query Auto Completion (QAC), as the starting point of information retrieval
tasks, is critical to user experience. Generally it has two steps: generating
completed query candidates according to query prefixes, and ranking them based
on extracted features. Three major challenges are observed for a query auto
completion system: (1) QAC has a strict online latency requirement. For each
keystroke, results must be returned within tens of milliseconds, which poses a
significant challenge in designing sophisticated language models for it. (2)
For unseen queries, generated candidates are of poor quality as contextual
information is not fully utilized. (3) Traditional QAC systems heavily rely on
handcrafted features such as the query candidate frequency in search logs,
lacking sufficient semantic understanding of the candidate.
In this paper, we propose an efficient neural QAC system with effective
context modeling to overcome these challenges. On the candidate generation
side, this system uses as much information as possible in unseen prefixes to
generate relevant candidates, increasing the recall by a large margin. On the
candidate ranking side, an unnormalized language model is proposed, which
effectively captures deep semantics of queries. This approach delivers better
ranking performance than state-of-the-art neural ranking methods and reduces
latency by 95% compared to neural language modeling methods. The empirical
results on public datasets show that our model achieves a good balance between
accuracy and efficiency. This system is served in LinkedIn job search with
significant product impact observed. Comment: Accepted at CIKM 202
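The two-step structure described in this abstract (candidate generation from a prefix, then ranking on extracted features) can be illustrated with a minimal sketch. This is not the paper's neural system: it is a toy baseline of the traditional kind the abstract contrasts itself with, using log frequency as the only ranking feature. The function names and the sample query log are hypothetical.

```python
from collections import Counter


def generate_candidates(prefix, query_log):
    """Step 1: collect logged queries that start with the typed prefix."""
    return [query for query in query_log if query.startswith(prefix)]


def rank_candidates(candidates, frequencies, top_k=3):
    """Step 2: rank unique candidates by a handcrafted feature (log frequency).

    Ties are broken alphabetically so the output is deterministic.
    """
    unique = set(candidates)
    return sorted(unique, key=lambda query: (-frequencies[query], query))[:top_k]


def autocomplete(prefix, query_log, top_k=3):
    """Run both steps for a single keystroke."""
    frequencies = Counter(query_log)
    candidates = generate_candidates(prefix, query_log)
    return rank_candidates(candidates, frequencies, top_k)


# Hypothetical search log, invented for illustration only.
query_log = [
    "software engineer",
    "software engineer",
    "software developer",
    "sales manager",
    "software engineer intern",
]

suggestions = autocomplete("soft", query_log)
```

A prefix that never appears in the log returns no candidates at all, which is the unseen-query weakness listed as challenge (2) in the abstract.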
Exploratory Browsing
In recent years, digital media have influenced many areas of our lives. The transition from analogue to digital has substantially changed our ways of dealing with media collections. Today's interfaces for managing digital media mainly offer fixed linear models corresponding to the underlying technical concepts (folders, events, albums, etc.), or metaphors borrowed from their analogue counterparts (e.g., stacks, film rolls). However, people's mental interpretations of their media collections often go beyond the scope of a linear scan. Beyond explicit search with specific goals, current interfaces cannot sufficiently support explorative and often non-linear behavior. This dissertation presents an exploration of interface design to enhance the browsing experience with media collections. The main outcome of this thesis is a new model of Exploratory Browsing to guide the design of interfaces that support the full range of browsing activities, especially Exploratory Browsing.
We define Exploratory Browsing as the behavior when the user is uncertain about her or his targets and needs to discover areas of interest (exploratory), in which she or he can explore in detail and possibly find some acceptable items (browsing). According to the browsing objectives, we group browsing activities into three categories: Search Browsing, General Purpose Browsing and Serendipitous Browsing. In the context of this thesis, Exploratory Browsing refers to the latter two browsing activities, which go beyond explicit search with specific objectives.
We systematically explore the design space of interfaces to support the Exploratory Browsing experience. Applying the methodology of User-Centered Design, we develop eight prototypes, covering two main usage contexts of browsing with personal collections and in online communities.
The main studied media types are photographs and music.
The main contribution of this thesis lies in deepening the understanding of how people's exploratory behavior has an impact on the interface design. This thesis contributes to the field of interface design for media collections in several aspects. With the goal of informing interface design that supports the Exploratory Browsing experience with media collections, we present a model of Exploratory Browsing, covering the full range of exploratory activities around media collections. We investigate this model in different usage contexts and develop eight prototypes. The substantial implications gathered during the development and evaluation of these prototypes inform the further refinement of our model: We uncover the underlying transitional relations between browsing activities and discover several stimulators to encourage a fluid and effective activity transition. Based on this model, we propose a catalogue of general interface characteristics, and employ this catalogue as criteria to analyze the effectiveness of our prototypes. We also present several general suggestions for designing interfaces for media collections.
Neural Methods for Effective, Efficient, and Exposure-Aware Information Retrieval
Neural networks with deep architectures have demonstrated significant
performance improvements in computer vision, speech recognition, and natural
language processing. The challenges in information retrieval (IR), however, are
different from these other application areas. A common form of IR involves
ranking of documents--or short passages--in response to keyword-based queries.
Effective IR systems must deal with the query-document vocabulary mismatch problem,
by modeling relationships between different query and document terms and how
they indicate relevance. Models should also consider lexical matches when the
query contains rare terms--such as a person's name or a product model
number--not seen during training, and should avoid retrieving semantically related
but irrelevant results. In many real-life IR tasks, the retrieval involves
extremely large collections--such as the document index of a commercial Web
search engine--containing billions of documents. Efficient IR methods should
take advantage of specialized IR data structures, such as the inverted index, to
efficiently retrieve from large collections. Given an information need, the IR
system also mediates how much exposure an information artifact receives by
deciding whether it should be displayed, and where it should be positioned,
among other results. Exposure-aware IR systems may optimize for additional
objectives, besides relevance, such as parity of exposure for retrieved items
and content publishers. In this thesis, we present novel neural architectures
and methods motivated by the specific needs and challenges of IR tasks. Comment: PhD thesis, Univ College London (2020)
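The inverted index mentioned in this abstract can be illustrated with a minimal sketch. It is a toy, assuming whitespace tokenization and conjunctive (all terms must match) retrieval, and it is not one of the thesis's methods. The function names and the three sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def retrieve(index, query):
    """Return ids of documents that contain every query term."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    if not postings:
        return set()
    return set.intersection(*postings)


# Hypothetical collection, invented for illustration only.
documents = {
    1: "neural ranking models",
    2: "inverted index for ranking",
    3: "neural query completion",
}

index = build_inverted_index(documents)
matches = retrieve(index, "neural ranking")
```

Only the posting sets of the query terms are touched at query time, so lookup cost depends on those sets rather than on the size of the whole collection. The exact lexical matching shown here is also what lets such a structure handle rare terms like names or product model numbers.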
Entity-Oriented Search
This open access book covers all facets of entity-oriented search—where “search” can be interpreted in the broadest sense of information access—from a unified point of view, and provides a coherent and comprehensive overview of the state of the art. It represents the first synthesis of research in this broad and rapidly developing area. Selected topics are discussed in-depth, the goal being to establish fundamental techniques and methods as a basis for future research and development. Additional topics are treated at a survey level only, containing numerous pointers to the relevant literature. A roadmap for future research, based on open issues and challenges identified along the way, rounds out the book. The book is divided into three main parts, sandwiched between introductory and concluding chapters. The first two chapters introduce readers to the basic concepts, provide an overview of entity-oriented search tasks, and present the various types and sources of data that will be used throughout the book. Part I deals with the core task of entity ranking: given a textual query, possibly enriched with additional elements or structural hints, return a ranked list of entities. This core task is examined in a number of different variants, using both structured and unstructured data collections, and numerous query formulations. In turn, Part II is devoted to the role of entities in bridging unstructured and structured data. Part III explores how entities can enable search engines to understand the concepts, meaning, and intent behind the query that the user enters into the search box, and how they can provide rich and focused responses (as opposed to merely a list of documents)—a process known as semantic search. The final chapter concludes the book by discussing the limitations of current approaches, and suggesting directions for future research. Researchers and graduate students are the primary target audience of this book. 
A general background in information retrieval is sufficient to follow the material, including an understanding of basic probability and statistics concepts as well as a basic knowledge of machine learning concepts and supervised learning algorithms.