216 research outputs found

Fielded sequential dependence model for ad-hoc entity retrieval in the web of data

Author: Kotov A.
Nikolaev F.
Zhiltsov N.
Publication date: 01/01/2015

Previously proposed approaches to ad-hoc entity retrieval in the Web of Data (ERWD) used multi-fielded representations of entities and relied on standard unigram bag-of-words retrieval models. Although retrieval models incorporating term dependencies have been shown to be significantly more effective than the unigram bag-of-words ones for ad-hoc document retrieval, it is not known whether accounting for term dependencies can improve retrieval from the Web of Data. In this work, we propose a novel retrieval model that incorporates term dependencies into structured document retrieval and apply it to the task of ERWD. In the proposed model, the document field weights and the relative importance of unigrams and bigrams are optimized with respect to the target retrieval metric using a learning-to-rank method. Experiments on a publicly available benchmark indicate that the proposed model significantly improves the accuracy of retrieval results over state-of-the-art retrieval models for ERWD.
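The abstract above describes scoring an entity with per-field weights over both unigrams and bigrams. The following is a minimal sketch of that general idea, not the authors' exact model: it assumes Dirichlet-style smoothing with a constant background probability, omits any unordered-window component, and uses hand-set weights where the paper learns them with a learning-to-rank method. All function names, parameters (`mu`, `background`, `lambda_uni`, `lambda_bi`), and the sample data are hypothetical.

```python
import math

def ngrams(tokens, n):
    """Return the ordered n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def mixture_log_prob(gram, fields, weights, mu=10.0, background=1e-4):
    """Log of a field-weighted mixture of smoothed n-gram probabilities.

    Assumption: a constant background probability stands in for real
    collection statistics, and at least one field has a positive weight.
    """
    n = len(gram)
    prob = 0.0
    for name, tokens in fields.items():
        grams = ngrams(tokens, n)
        smoothed = (grams.count(gram) + mu * background) / (len(grams) + mu)
        prob += weights.get(name, 0.0) * smoothed
    return math.log(prob)

def score_entity(query, fields, uni_weights, bi_weights,
                 lambda_uni=0.8, lambda_bi=0.2):
    """Combine unigram and bigram evidence with fixed importance weights."""
    terms = query.lower().split()
    uni = sum(mixture_log_prob(g, fields, uni_weights) for g in ngrams(terms, 1))
    bi = sum(mixture_log_prob(g, fields, bi_weights) for g in ngrams(terms, 2))
    return lambda_uni * uni + lambda_bi * bi

# Illustrative entities with two fields each (made-up data).
entity_a = {"names": ["barack", "obama"],
            "attributes": ["president", "of", "the", "united", "states"]}
entity_b = {"names": ["michelle", "obama"],
            "attributes": ["lawyer", "and", "author"]}
weights = {"names": 0.6, "attributes": 0.4}

score_a = score_entity("barack obama", entity_a, weights, weights)
score_b = score_entity("barack obama", entity_b, weights, weights)
```

In this toy setup, `entity_a` matches both query terms and the query bigram in its `names` field, so it receives the higher (less negative) score.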

Kazan Federal University Digital Repository

Graph-Embedding Empowered Entity Retrieval

Author: D Metzler
DL Davies
K Balog
L McInnes
N Jardine
N Noy
PJ Rousseeuw
S Robertson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 06/05/2020

In this research, we improve upon the current state of the art in entity retrieval by re-ranking the result list using graph embeddings. The paper shows that graph embeddings are useful for entity-oriented search tasks. We demonstrate empirically that encoding information from the knowledge graph into (graph) embeddings yields a larger gain in the effectiveness of entity retrieval results than using plain word embeddings. We analyze the impact of the accuracy of the entity linker on the overall retrieval effectiveness. Our analysis further deploys the cluster hypothesis to explain the observed advantages of graph embeddings over the more widely used word embeddings for user tasks involving ranking entities.
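The re-ranking step described in this abstract could be sketched roughly as follows. This is an illustration under stated assumptions rather than the paper's method: it assumes the query has already been linked to one or more entities, that each entity has an embedding vector, and that the final score is a linear interpolation of the original retrieval score and the mean cosine similarity to the linked entities. The names `rerank`, `lam`, and the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rerank(results, linked_entities, embeddings, lam=0.5):
    """Re-rank (entity_id, score) pairs using embedding similarity.

    Assumption: new score = (1 - lam) * original score
                          + lam * mean cosine to the query's linked entities.
    Entities without an embedding get a similarity of 0.0.
    """
    query_vecs = [embeddings[e] for e in linked_entities if e in embeddings]
    reranked = []
    for entity, score in results:
        vec = embeddings.get(entity)
        if vec is None or not query_vecs:
            sim = 0.0
        else:
            sim = sum(cosine(vec, q) for q in query_vecs) / len(query_vecs)
        reranked.append((entity, (1 - lam) * score + lam * sim))
    return sorted(reranked, key=lambda pair: pair[1], reverse=True)

# Toy two-dimensional embeddings (made-up values).
embeddings = {"query_entity": [1.0, 0.0], "a": [1.0, 0.0], "b": [0.0, 1.0]}
initial = [("b", 0.6), ("a", 0.5)]
reranked = rerank(initial, ["query_entity"], embeddings, lam=0.5)
```

Here candidate `a` lies close to the linked query entity in embedding space, so it overtakes `b` after re-ranking. Setting `lam=0` leaves the original order unchanged, which also shows how a wrong or missing entity link would remove any benefit from the embeddings.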

arXiv.org e-Print Archive

Crossref

Why Does This Entity Matter? Finding Support Passages for Entities in Search

Author: Chatterjee Shubham
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 31/12/2020

In this work, we propose a method to retrieve a human-readable explanation of how a retrieved entity is connected to the information need, analogous to search snippets for document retrieval. Such an explanation is called a support passage. Our approach is based on the idea that a good support passage contains many entities relevantly related to the target entity (the entity for which a support passage is needed). We define a relevantly related entity as one which (1) occurs frequently in the vicinity of the target entity, and (2) is relevant to the query. We use the relevance of a passage (induced by the relevantly related entities) to find a good support passage for the target entity. Moreover, we want the target entity to be central to the discussion in the support passage. Hence, we explore the utility of entity salience for support passage retrieval and study the conditions under which it can help. We show that our proposed method can improve performance compared to the current state of the art for support passage retrieval on two datasets from TREC Complex Answer Retrieval.
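The two-part definition of a relevantly related entity in this abstract can be illustrated with a small sketch. It is a simplification under explicit assumptions, not the author's method: "vicinity" is approximated as occurring in the same passage as the target, the weight of a related entity is its co-occurrence count multiplied by a given query-relevance score, and entity salience is not modeled. The function names and the sample data are hypothetical.

```python
from collections import Counter

def related_entity_weights(target, passages, entity_relevance):
    """Weight each entity by (co-occurrence count with target) x (query relevance).

    Assumption: co-occurrence means appearing in the same passage as the target.
    """
    cooccurrence = Counter()
    for entities in passages.values():
        if target in entities:
            cooccurrence.update(e for e in set(entities) if e != target)
    return {e: count * entity_relevance.get(e, 0.0)
            for e, count in cooccurrence.items()}

def rank_support_passages(target, passages, entity_relevance):
    """Rank passages mentioning the target by the summed weight of their entities."""
    weights = related_entity_weights(target, passages, entity_relevance)
    scores = {
        pid: sum(weights.get(e, 0.0) for e in set(entities) if e != target)
        for pid, entities in passages.items()
        if target in entities
    }
    return sorted(scores, key=scores.get, reverse=True)

# Passages represented as lists of linked entity identifiers (made-up data).
passages = {
    "p1": ["T", "A", "B"],
    "p2": ["T", "C"],
    "p3": ["T", "A"],
    "p4": ["D"],
}
entity_relevance = {"A": 1.0, "B": 0.5, "C": 0.1}
ranking = rank_support_passages("T", passages, entity_relevance)
```

In this example, entity `A` co-occurs with the target twice and is highly relevant to the query, so passages containing it (`p1`, `p3`) rank above `p2`. Passage `p4` never mentions the target and is excluded.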

UNH Scholars' Repository

Why Does This Entity Matter? Finding Support Passages for Entities in Search

Author: Chatterjee Shubham
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 01/12/2020

UNH Scholars' Repository

Target Type Identification for Entity-Bearing Queries

Author: Balog Krisztian
Croft W. Bruce
Mikolov Tomas
Sawant Uma
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/07/2017

Identifying the target types of entity-bearing queries can help improve retrieval performance as well as the overall search experience. In this work, we address the problem of automatically detecting the target types of a query with respect to a type taxonomy. We propose a supervised learning approach with a rich variety of features. Using a purpose-built test collection, we show that our approach outperforms existing methods by a remarkable margin. This is an extended version of the article published with the same title in the Proceedings of SIGIR'17. Comment: Extended version of SIGIR'17 short paper, 5 pages.

arXiv.org e-Print Archive

Crossref

Attentive neural architecture for ad-hoc structured document retrieval

Author: Balaneshinkordan S.
Kotov A.
Nikolaev F.
Publication date: 01/01/2018

© 2018 Copyright held by the owner/author(s). Publication rights licensed to ACM. The problem of ad-hoc structured document retrieval arises in many information access scenarios, from Web to product search. Yet neither deep neural networks, which have been successfully applied to ad-hoc information retrieval and Web search, nor the attention mechanism, which has been shown to significantly improve the performance of deep neural networks on natural language processing tasks, has been explored in the context of this problem. In this paper, we propose a deep neural architecture for ad-hoc structured document retrieval, which uses an attention mechanism to determine important phrases in keyword queries as well as the relative importance of matching those phrases in different fields of structured documents. Experimental evaluation on publicly available collections for Web document, product, and entity retrieval from knowledge graphs indicates superior retrieval accuracy of the proposed neural architecture relative to both state-of-the-art neural architectures for ad-hoc document retrieval and probabilistic models for ad-hoc structured document retrieval.

Kazan Federal University Digital Repository

Entity-Oriented Search

Author: Balog Krisztian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/02/2021

This open access book covers all facets of entity-oriented search—where “search” can be interpreted in the broadest sense of information access—from a unified point of view, and provides a coherent and comprehensive overview of the state of the art. It represents the first synthesis of research in this broad and rapidly developing area. Selected topics are discussed in-depth, the goal being to establish fundamental techniques and methods as a basis for future research and development. Additional topics are treated at a survey level only, containing numerous pointers to the relevant literature. A roadmap for future research, based on open issues and challenges identified along the way, rounds out the book. The book is divided into three main parts, sandwiched between introductory and concluding chapters. The first two chapters introduce readers to the basic concepts, provide an overview of entity-oriented search tasks, and present the various types and sources of data that will be used throughout the book. Part I deals with the core task of entity ranking: given a textual query, possibly enriched with additional elements or structural hints, return a ranked list of entities. This core task is examined in a number of different variants, using both structured and unstructured data collections, and numerous query formulations. In turn, Part II is devoted to the role of entities in bridging unstructured and structured data. Part III explores how entities can enable search engines to understand the concepts, meaning, and intent behind the query that the user enters into the search box, and how they can provide rich and focused responses (as opposed to merely a list of documents)—a process known as semantic search. The final chapter concludes the book by discussing the limitations of current approaches, and suggesting directions for future research. Researchers and graduate students are the primary target audience of this book. 
A general background in information retrieval is sufficient to follow the material, including an understanding of basic probability and statistics concepts as well as a basic knowledge of machine learning concepts and supervised learning algorithms.

Directory of Open Access Books (DOAB)