Search CORE

120 research outputs found

The State-of-the-arts in Focused Search

Author: Li Rongmei
Publication venue: University of Twente, Centre for Telematics and Information Technology
Publication date: 01/01/2009

The continuous influx of text data on the Web requires search engines to improve their ability to retrieve more specific information. The need for results relevant to a user’s topic of interest has moved beyond search for domain-specific or type-specific documents to more focused results (e.g., document fragments or answers to a query). The introduction of XML provides a standard format for data representation, storage, and exchange. It allows focused search to be carried out at different granularities of a structured document with XML markup. This report reviews the state of the art in focused search, particularly techniques for topic-specific document retrieval, passage retrieval, XML retrieval, and entity ranking. It concludes by highlighting open problems.

University of Twente Research Information

Contextualization using hyperlinks and internal hierarchical structure of Wikipedia documents

Author: Arvola P.
Norozi M.A. (Muhammad)
Vries A.P. (Arjen) de
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 01/01/2012

CWI's Institutional Repository

The State-of-the-arts in Focused Search

Publication venue: Centre for Telematics and Information Technology (CTIT)
Publication date: 08/07/2009

University of Twente Research Information

Processing content-and-structure queries for XML retrieval

Author: de Rijke M.
Kamps J.
Sigurbjörnsson B.
Publication date: 01/01/2004

International Migration, Integration and Social Cohesion online publications

Contextualization using hyperlinks and internal hierarchical structure of Wikipedia documents

Author: Arvola P.
Norozi Muhammad
Vries Arjen
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012

Crossref

CWI's Institutional Repository

Structural Features in XML Retrieval

Author: Ramirez Camps G. (Georgina)
Publication date: 02/11/2007

CWI's Institutional Repository

Theoretical evaluation of XML retrieval

Author: Blanke Tobias
Publication date: 01/01/2011

This thesis develops a theoretical framework to evaluate XML retrieval. XML retrieval deals with retrieving those document parts that specifically answer a query. It is concerned with using the document structure to improve the retrieval of information from documents by delivering only those parts of a document that an information need is about. We define a theoretical evaluation methodology based on the idea of 'aboutness' and apply it to XML retrieval models. Situation Theory is used to express the aboutness properties of XML retrieval models. We develop a dedicated methodology for the evaluation of XML retrieval and apply this methodology to five XML retrieval models and other XML retrieval topics such as evaluation methodologies, filters and experimental results.

Glasgow Theses Service

Crossref

King's Research Portal

OpenGrey Repository

A survey on tree matching and XML retrieval

Author: Cyril Laitang
Hamamache Kheddouci
Karen Pinel-Sauvagnat
Lei Ning
Mohammed Amin Tahraoui
Mohand Boughanem
Publication venue: 'Elsevier BV'
Publication date: 01/05/2013

With the increasing number of available XML documents, numerous approaches for retrieval have been proposed in the literature. They usually use the tree representation of documents and queries to process them, whether implicitly or explicitly. Although retrieving XML documents can be considered a tree matching problem between the query tree and the document trees, only a few approaches take advantage of the algorithms and methods proposed in graph theory. In this paper, we study the theoretical approaches proposed in the literature for tree matching and examine how these approaches have been adapted to XML querying and retrieval, from both an exact and an approximate matching perspective. This study allows us to highlight theoretical aspects of graph theory that have not yet been explored in XML retrieval.

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Hal - Université Grenoble Alpes

Open Archive Toulouse Archive Ouverte

HAL

Hal-Diderot

The Role of Context in Matching and Evaluation of XML Information Retrieval

Author: Arvola Paavo
Publication venue: Tampere University Press
Publication date: 01/01/1993

With the growth of electronic collections, the everyday use of search, and the spread of mobile devices, one goal in developing information retrieval methods is to achieve ever more precise search results; even in long documents, the relevant content should be pointed out to the searcher precisely. The aim is thus to free the searcher from needless browsing of documents. On the Internet and in other electronic publishing, parts of documents are often marked up with XML for automatic processing. XML markup makes it possible to exploit the internal structure of documents. In other words, this markup can be used when developing precision-oriented (focused) information retrieval systems and methods. The dissertation addresses precision-oriented information retrieval in which explicit XML markup can be exploited. The dissertation has two main themes, the first of which concerns the development, implementation, and evaluation of the XML retrieval system TRIX (Tampere Retrieval and Indexing for XML). The second theme concerns empirical evaluation methods for focused retrieval systems. The most significant contribution of the first theme is contextualization, in which the scarcity of textual evidence typical of XML retrieval is compensated for in matching by exploiting the content of the upper or sibling parts of the XML hierarchy (i.e., the context). The effectiveness of the method is demonstrated empirically. As a result of the research, the term contextualization has become established in the general international vocabulary of the field. The second theme finds that the methods used to measure the effectiveness of focused retrieval are deficient in many ways. To remedy these deficiencies, the dissertation develops more realistic evaluation methods that take into account the context of the returned retrieval units, the reading order, and the effort that browsing costs the user. The metric developed in the study (T2I(300)) has been selected as the official metric in the international INEX (Initiative for the Evaluation of XML Retrieval) initiative, a research forum for XML retrieval founded in 2002.
This dissertation addresses focused retrieval, especially its sub-concept XML (eXtensible Mark-up Language) information retrieval (XML IR). In XML IR, the retrievable units are either individual elements, or sets of elements grouped together typically by a document. These units are ranked according to their estimated relevance by an XML IR system. In traditional information retrieval, the retrievable unit is an atomic document. Due to this atomicity, many core characteristics of such a document retrieval paradigm are not appropriate for XML IR. Of these characteristics, this dissertation explores element indexing, scoring and evaluation methods, which form two main themes: (1) element indexing, scoring, and contextualization, and (2) focused retrieval evaluation. To investigate the first theme, an XML IR system based on structural indices is constructed. The structural indices offer analyzing power for studying element hierarchies. The main finding in the system development is the utilization of surrounding elements as supplementary evidence in element scoring. This method is called contextualization, for which we distinguish three models: vertical, horizontal and ad hoc contextualization. The models are tested with the tools provided by (or derived from) the Initiative for the Evaluation of XML Retrieval (INEX). The results indicate that the evidence from element surroundings improves the scoring effectiveness of XML retrieval. The second theme entails a task where the retrievable elements are grouped by a document. The aim of this theme is to create methods measuring XML IR effectiveness in a credible fashion in a laboratory environment. The credibility is pursued by assuming the chronological reading order of a user together with a point where the user becomes frustrated after reading a certain amount of non-relevant material. Novel metrics are created based on these assumptions. The relative rankings of systems measured with the metrics differ from those delivered by contemporary metrics. In addition, the focused retrieval strategies benefit from the novel metrics over traditional full document retrieval.

bepress Legal Repository

Trepo - Institutional Repository of Tampere University

Touro College: Digital Commons @ Touro Law Center

The First Twente Data Management Workshop (TDM'04) on XML Databases and Information Retrieval. Proceedings

Publication venue: Centre for Telematics and Information Technology (CTIT)
Publication date: 01/06/2004

University of Twente Research Information