Efficient Algorithms for Context Query Evaluation over a Tagged Corpus

By Jérémy Barbay and Alejandro López-Ortiz

Abstract

We present an optimal adaptive algorithm for context queries in tagged content. A query locates instances of a tag within a context that the query specifies using patterns with preorder, ancestor-descendant, and proximity operators in the document tree implied by the tagged content. The time taken to resolve a query Q on a document tree T is logarithmic in the size of T, and proportional to the size of Q and to the difficulty of the combination of Q with T, as measured by the minimal size of a certificate of the answer. The algorithm performs no worse than the classical worst-case optimal algorithms, and provably better on simpler queries and corpora. More formally, the algorithm runs in time O(δk lg(n/δk)) in the standard RAM model and in time O(δk lg lg min(n, σ)) in the Θ(lg(n))-word RAM model, where k is the number of edges in the query, δ is the minimum number of operations required to certify the answer to the query, n is the number of nodes in the tree, and σ is the number of labels indexed.
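As a rough illustration of the adaptive flavor described above, the following Python sketch answers one ancestor-descendant edge of a query over sorted preorder positions. It is not the authors' algorithm and does not claim the stated bounds. It assumes each node is identified by its preorder number, each candidate ancestor is given as an interval (pre, last) covering its subtree, and each tag has a sorted list of preorder positions. The names gallop and tagged_descendants are hypothetical. Doubling (galloping) search is used so that each search costs time logarithmic in the distance skipped rather than in the list size.

```python
from bisect import bisect_left


def gallop(sorted_list, start, target):
    """Return the smallest index i >= start with sorted_list[i] >= target.

    Returns len(sorted_list) if no such index exists. The search doubles
    its step, so its cost grows with the log of the distance skipped.
    """
    n = len(sorted_list)
    if start >= n:
        return n
    if sorted_list[start] >= target:
        return start
    lo = start  # invariant: sorted_list[lo] < target
    step = 1
    while lo + step < n and sorted_list[lo + step] < target:
        lo += step
        step *= 2
    hi = min(lo + step, n)
    # The answer lies in (lo, hi]; finish with a binary search.
    return bisect_left(sorted_list, target, lo + 1, hi)


def tagged_descendants(ancestors, positions):
    """Return positions that fall strictly inside some ancestor interval.

    ancestors: (pre, last) preorder intervals, sorted by pre (assumed input).
    positions: sorted preorder numbers of nodes carrying the sought tag.
    """
    out = []
    cursor = 0
    for pre, last in ancestors:
        # Skip every position at or before the ancestor itself.
        cursor = gallop(positions, cursor, pre + 1)
        while cursor < len(positions) and positions[cursor] <= last:
            out.append(positions[cursor])
            cursor += 1
    return out


# Hypothetical document: two "section" subtrees and four "para" nodes.
sections = [(1, 3), (5, 6)]
paras = [2, 3, 6, 7]
print(tagged_descendants(sections, paras))
```

On inputs where the intervals and the tag positions interleave rarely, the cursor jumps over long runs in a few steps, which is the kind of instance-dependent saving the certificate-size measure δ is meant to capture.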

Year: 2013
OAI identifier: oai:CiteSeerX.psu:10.1.1.352.3693
Provided by: CiteSeerX
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://users.dcc.uchile.cl/~jb... (external link)