2 research outputs found

    Processing and evaluating partial tree pattern queries on XML data

    No full text
    XML query languages typically allow the specification of structural patterns using XPath. Usually, these structural patterns are in the form of trees (Tree-Pattern Queries—TPQs). Finding the occurrences of such patterns in an XML tree is a key operation in XML query evaluation. The multiple previous algorithms presented for this operation focus mainly on the evaluation of tree-pattern queries. Recently, requirements for flexible querying of XML data have motivated the consideration of query classes that are more expressive and flexible than TPQs for which efficient nonmain-memory evaluation algorithms are not known. In this paper, we consider a class of queries, called Partial Tree-Pattern Queries (PTPQs), which generalize and strictly contain TPQs. PTPQs represent a broad fragment of XPath which is very useful in practice. In order to process PTPQs, we introduce a set of sound and complete inference rules to characterize structural relationship derivation. We provide necessary and sufficient conditions for detecting query unsatisfiability and node redundancy. We also show that PTPQs can be represented as directed acyclic graphs augmented with the “same-path” constraints. In order to leverage existing efficient evaluation algorithms for less expressive classes of queries, we design two approaches that evaluate a PTPQ by decomposing it into a set of simpler queries: algorithm IndexTPQGen, exploits a structural summary of the XML data and evaluates a PTPQ by generating an equivalent set of TPQs and unioning their answers. Algorithm PartialPathJoin decomposes the PTPQ into partial-path queries, and merge-joins their solutions. We also develop PartialTreeStack, an original polynomial time holistic algorithm for PTPQs. To the best of our knowledge, this is the first algorithm to support the evaluation of such a broad structural fragment of XPath in the inverted lists evaluation model. We provide a theoretical analysis of our algorithm and identify cases where it is asymptotically optimal. An extensive experimental evaluation shows that it is more efficient, robust, and stable than the other two and it outperforms a state-of-the art XQuery engine on PTPQs

    An experimental study and evaluation of a new architecture for clinical decision support - integrating the openEHR specifications for the Electronic Health Record with Bayesian Networks

    Get PDF
    Healthcare informatics still lacks wide-scale adoption of intelligent decision support methods, despite continuous increases in computing power and methodological advances in scalable computation and machine learning, over recent decades. The potential has long been recognised, as evidenced in the literature of the domain, which is extensively reviewed. The thesis identifies and explores key barriers to adoption of clinical decision support, through computational experiments encompassing a number of technical platforms. Building on previous research, it implements and tests a novel platform architecture capable of processing and reasoning with clinical data. The key components of this platform are the now widely implemented openEHR electronic health record specifications and Bayesian Belief Networks. Substantial software implementations are used to explore the integration of these components, guided and supplemented by input from clinician experts and using clinical data models derived in hospital settings at Moorfields Eye Hospital. Data quality and quantity issues are highlighted. Insights thus gained are used to design and build a novel graph-based representation and processing model for the clinical data, based on the openEHR specifications. The approach can be implemented using diverse modern database and platform technologies. Computational experiments with the platform, using data from two clinical domains – a preliminary study with published thyroid metabolism data and a substantial study of cataract surgery – explore fundamental barriers that must be overcome in intelligent healthcare systems developments for clinical settings. These have often been neglected, or misunderstood as implementation procedures of secondary importance. The results confirm that the methods developed have the potential to overcome a number of these barriers. The findings lead to proposals for improvements to the openEHR specifications, in the context of machine learning applications, and in particular for integrating them with Bayesian Networks. The thesis concludes with a roadmap for future research, building on progress and findings to date
    corecore