This paper describes a first prototype system for content-based retrieval from XML data. The system's design supports both XPath queries and complex information retrieval queries based on a language modelling approach to information retrieval. Evaluation using the INEX benchmark shows that it is beneficial if the system is biased to retrieve large XML fragments over small fragments.

Hiemstra, D.

English

Contains fulltext: 228362.pdf (preprint version) (Open Access)

Radboud Repository

A Database Approach to Content-based XML Retrieval

Hiemstra, Djoerd

University of Twente Research Information

CiteSeerX

NARCIS 

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.8.6815
