2 research outputs found

Okapi-based XML indexing

Authors: Lu W., MacFarlane A., Venuti F.
Publication venue: Emerald
Publication date: 18/09/2009

Purpose – As an important standard for data exchange and information storage, XML has generated a great deal of interest, and particular attention has been paid to XML indexing. Clear use cases for structured search in XML have been established. However, most research in the area is based either on relational database systems or on specialized semi-structured data management systems. This paper aims to propose a method for XML indexing based on the information retrieval (IR) system Okapi.
Design/methodology/approach – First, the paper reviews the structure of inverted files and explains why this indexing mechanism cannot properly support XML retrieval, using the underlying data structures of Okapi as an example. Then the paper explores a revised method implemented on Okapi using path indexing structures. The paper evaluates these index structures through the metrics of indexing run time, path search run time and space costs using the INEX and Reuters RCV1 collections.
Findings – Initial results on the INEX collections show that there is a substantial overhead in space costs for the method, but this increase does not affect run time adversely. Indexing results on Reuters RCV1 sub-collections of differing sizes show that space costs increase significantly as the collection grows, but the increase in run time is linear. Path search results show sub-millisecond run times, demonstrating minimal overhead for XML search.
Practical implications – Overall, the results show that the method implemented to support XML search in a traditional IR system such as Okapi is viable.
Originality/value – The paper provides useful information on a method for XML indexing based on the IR system Okapi.

City Research Online

What XML-IR Users May Want

Authors: Edwards Sylvia, Geva Shlomo, Woodley Alan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

It is assumed that, by focusing on retrieval at a granularity finer than whole documents, XML-IR systems will better satisfy users’ information needs than traditional IR systems. Participants in INEX’s Ad-hoc track develop XML-IR systems based upon this assumption, using an evaluation methodology in the tradition of Cranfield. However, since the inception of INEX, debate has raged on how applicable some of the Ad-hoc tasks are to real user tasks. The purpose of the User-Case Studies track is to explore the application of XML-IR systems from the users’ perspective. This paper outlines QUT’s involvement in this task. For our involvement we conducted a user experiment using an XML-IR system (GPX) and three interfaces: a standard keyword interface, a natural language interface (NLPX) and a query-by-template interface (Bricks). Following the experiment we interviewed the users about their experience and asked them, in comparison with a traditional XML-IR system, what types of tasks they would use an XML-IR system for, what extra information they would need to interact with an XML-IR system, and how they would want to see XML-IR results presented. It is hoped that the outcomes of this study will bring us closer to understanding what users want from XML-IR systems.

CiteSeerX

Queensland University of Technology ePrints Archive

University of Queensland eSpace