Search CORE

1,051 research outputs found

The State-of-the-arts in Focused Search

Author: Li Rongmei
Publication venue: University of Twente, Centre for Telematics and Information Technology
Publication date: 01/01/2009
Field of study

The continuous influx of various text data on the Web requires search engines to improve their retrieval abilities for more specific information. The need for relevant results to a user’s topic of interest has gone beyond search for domain or type specific documents to more focused result (e.g. document fragments or answers to a query). The introduction of XML provides a format standard for data representation, storage, and exchange. It helps focused search to be carried out at different granularities of a structured document with XML markups. This report aims at reviewing the state-of-the-arts in focused search, particularly techniques for topic-specific document retrieval, passage retrieval, XML retrieval, and entity ranking. It is concluded with highlight of open problems

University of Twente Research Information

PFTijah: text search in an XML database system

Author: Flokstra J.
Hiemstra D.
Os R. van
Rode H.
Publication venue: Ecole Nationale Supérieure des Mines de Saint-Etienne
Publication date: 01/01/2006
Field of study

This paper introduces the PFTijah system, a text search system that is integrated with an XML/XQuery database management system. We present examples of its use, we explain some of the system internals, and discuss plans for future work. PFTijah is part of the open source release of MonetDB/XQuery

CiteSeerX

Radboud Repository

University of Twente Research Information

Seven years of INEX interactive retrieval experiments – lessons and challenges

Author: Nordlie Ragnar
Pharo Nils
Publication venue: Springer Verlag
Publication date: 01/01/2012
Field of study

This paper summarizes a major effort in interactive search investigation, the INEX i-track, a collective effort run over a seven-year period. We present the experimental conditions, report some of the findings of the participating groups, and examine the challenges posed by this kind of collective experimental effort

NORA - Norwegian Open Research Archives

Open Digital Archive at Oslo and Akershus University College

Sound ranking algorithms for XML search

Author: Apers P.M.G.
Flokstra J.
Hiemstra D.
Klinger S.
Rode H.
Publication venue: University of Otago
Publication date: 01/01/2008
Field of study

Ranking algorithms for XML should reflect the actual combined content and structure constraints of queries, while at the same time producing equal rankings for queries that are semantically equal. Ranking algorithms that produce different rankings for queries that are semantically equal are easily detected by tests on large databases: We call such algorithms not sound. We report the behavior of different approaches to ranking content-and-structure queries on pairs of queries for which we expect equal ranking results from the query semantics. We show that most of these approaches are not sound. Of the remaining approaches, only 3 adhere to the W3C XQuery Full-Text standard

KOPS - The Institutional Repository of the University of Konstanz

CiteSeerX

University of Twente Research Information

Entity Ranking in Wikipedia

Author: Pehcevski Jovan
Thom James A.
Vercoustre Anne-Marie
Publication venue
Publication date: 01/01/2007
Field of study

The traditional entity extraction problem lies in the ability of extracting named entities from plain text using natural language processing techniques and intensive training from large document collections. Examples of named entities include organisations, people, locations, or dates. There are many research activities involving named entities; we are interested in entity ranking in the field of information retrieval. In this paper, we describe our approach to identifying and ranking entities from the INEX Wikipedia document collection. Wikipedia offers a number of interesting features for entity identification and ranking that we first introduce. We then describe the principles and the architecture of our entity ranking system, and introduce our methodology for evaluation. Our preliminary results show that the use of categories and the link structure of Wikipedia, together with entity examples, can significantly improve retrieval effectiveness.Comment: to appea

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

RMIT Research Repository

A survey on tree matching and XML retrieval

Author: Aho
Al-Khalifa
Alilaouar
Amer-Yahia
Aouicha
Ayala
Bille
Bille
Botev
Bruno
Buneman
Burghardt
Cai
Campi
Ceri
Chamberlin
Chase
Chen
Chen
Chen
Chen
Chen
Chen
Cheng
Cole
Cole
Cyril Laitang
Dalamagas
Dalamagas
Damiani
Damiani
Dao
de Vries
Demaine
Denoyer
Dubiner
Dulucq
Dürr
Hamamache Kheddouci
Haw
Haw
Hoffmann
Hubert
Hummel
Izadi
Jansson
Jiang
Jiang
Jiang
Kamps
Karen Pinel-Sauvagnat
Kazai
Kazai
Kilpelainen
Klein
Knuth
Kosaraju
Kuboyama
Laitang
Lalmas
Lalmas
Le
Lei Ning
Levenshtein
Levy
Li
Li
Li
Lu
Lu
Mass
Mihajlovic
Mohammed Amin Tahraoui
Mohand Boughanem
Ogilvie
Pehcevski
Pehcevski
Pinel-Sauvagnat
Piwowarski
Popovici
Qin
Rao
Richter
Robie
Runapongsa
Schenkel
Schenkel
Schlieder
Shasha
Stahl
Tai
Tekli
Theobald
Trotman
Trotman
Trotman
Trotman
Trotman
van Zwol
Wagner
Wang
Wang
Wang
Wang
Wu
Yang
Yao
Zezula
Zezula
Zhang
Zhang
Zhou
Publication venue: 'Elsevier BV'
Publication date: 01/05/2013
Field of study

International audienceWith the increasing number of available XML documents, numerous approaches for retrieval have been proposed in the literature. They usually use the tree representation of documents and queries to process them, whether in an implicit or explicit way. Although retrieving XML documents can be considered as a tree matching problem between the query tree and the document trees, only a few approaches take advantage of the algorithms and methods proposed by the graph theory. In this paper, we aim at studying the theoretical approaches proposed in the literature for tree matching and at seeing how these approaches have been adapted to XML querying and retrieval, from both an exact and an approximate matching perspective. This study will allow us to highlight theoretical aspects of graph theory that have not been yet explored in XML retrieval

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Hal - Université Grenoble Alpes

Open Archive Toulouse Archive Ouverte

Hal-Diderot

Vague element selection and query rewriting for XML retrieval

Author: Blok H.E.
Hiemstra Djoerd
Mihajlovic V.
Publication venue: Neslia Paniculata
Publication date: 01/01/2006
Field of study

In this paper we present the extension of our prototype three-level database system (TIJAH) developed for struc-tured information retrieval. The extension is aimed at mod-eling vague search on XML elements. All three levels (con-ceptual, logical, and physical) of the TIJAH system are enhanced to support vague search concepts. The vague search is implemented as vague selection of XML elements using XML element name expansion lists and rewriting tech-niques. We test the performance of retrieval models us-ing automatically generated expansion lists and compared them with models that use manual ones. The goal is to find the best approach for structured information retrieval with vague structural constraints on element names expressed in the query. 1

CiteSeerX

Radboud Repository

University of Twente Research Information

Investigating the document structure as a source of evidence for multimedia fragment retrieval

Author: Boughanem Mohand
Pinel-Sauvagnat Karen
Torjmen-Khemakhem Mouna
Publication venue: 'Elsevier BV'
Publication date: 01/11/2013
Field of study

International audienceMultimedia objects can be retrieved using their context that can be for instance the text surrounding them in documents. This text may be either near or far from the searched objects. Our goal in this paper is to study the impact, in term of effectiveness, of text position relatively to searched objects. The multimedia objects we consider are described in structured documents such as XML ones. The document structure is therefore exploited to provide this text position in documents. Although structural information has been shown to be an effective source of evidence in textual information retrieval, only a few works investigated its interest in multimedia retrieval. More precisely, the task we are interested in this paper is to retrieve multimedia fragments (i.e. XML elements having at least one multimedia object). Our general approach is built on two steps: we first retrieve XML elements containing multimedia objects, and we then explore the surrounding information to retrieve relevant multimedia fragments. In both cases, we study the impact of the surrounding information using the documents structure.Our work is carried out on images, but it can be extended to any other media, since the physical content of multimedia objects is not used. We conducted several experiments in the context of the Multimedia track of the INEX evaluation campaign. Results showed that structural evidences are of high interest to tune the importance of textual context for multimedia retrieval. Moreover, the proposed approach outperforms state of the art approaches

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL Descartes

The State-of-the-arts in Focused Search

Author
Publication venue: Centre for Telematics and Information Technology (CTIT)
Publication date: 08/07/2009
Field of study

University of Twente Research Information

Neural Architecture for Question Answering Using a Knowledge Graph and Web Corpus

Author: Sawant Uma
Garg Saurabh
Chakrabarti Soumen
Ramakrishnan Ganesh
Publication venue
Publication date: 06/12/2018
Field of study

In Web search, entity-seeking queries often trigger a special Question Answering (QA) system. It may use a parser to interpret the question to a structured query, execute that on a knowledge graph (KG), and return direct entity responses. QA systems based on precise parsing tend to be brittle: minor syntax variations may dramatically change the response. Moreover, KG coverage is patchy. At the other extreme, a large corpus may provide broader coverage, but in an unstructured, unreliable form. We present AQQUCN, a QA system that gracefully combines KG and corpus evidence. AQQUCN accepts a broad spectrum of query syntax, between well-formed questions to short `telegraphic' keyword sequences. In the face of inherent query ambiguities, AQQUCN aggregates signals from KGs and large corpora to directly rank KG entities, rather than commit to one semantic interpretation of the query. AQQUCN models the ideal interpretation as an unobservable or latent variable. Interpretations and candidate entity responses are scored as pairs, by combining signals from multiple convolutional networks that operate collectively on the query, KG and corpus. On four public query workloads, amounting to over 8,000 queries with diverse query syntax, we see 5--16% absolute improvement in mean average precision (MAP), compared to the entity ranking performance of recent systems. Our system is also competitive at entity set retrieval, almost doubling F1 scores for challenging short queries.Comment: Accepted to Information Retrieval Journa

arXiv.org e-Print Archive

Biblioteca Digital de la Comunidad de Madrid