Development of Use Cases, Part I
For determining requirements and constructs appropriate for a Web query language, or in fact any language, use cases are essential. The W3C has published two sets of use cases for XML and RDF query languages. In this article, solutions for these use cases are presented using Xcerpt, a novel Web and Semantic Web query language that combines access to standard Web data such as XML documents and to Semantic Web metadata such as RDF resource descriptions with reasoning abilities and rules familiar from logic programming. To the best knowledge of the authors, this is the first in-depth study of how to solve use cases for accessing XML and RDF in a single language. Integrated access to data and metadata has been recognized by industry and academia as one of the key challenges in data processing for the next decade. This article is a contribution towards addressing this challenge by demonstrating, along practical and recognized use cases, the usefulness of reasoning abilities, rules, and semistructured query languages for accessing both data (XML) and metadata (RDF).
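The combination described above, a single rule ranging over both XML data and RDF metadata, can be sketched in plain Python (this is not Xcerpt syntax; the document, the triples, and the names are invented for illustration):

```python
import xml.etree.ElementTree as ET

# Toy XML document: stands in for "standard Web data".
xml_data = """
<books>
  <book id="b1"><title>Semistructured Data</title></book>
  <book id="b2"><title>Web Queries</title></book>
</books>
"""

# Toy RDF-style metadata as subject/predicate/object triples:
# stands in for "Semantic Web metadata" such as resource descriptions.
triples = [
    ("b1", "dc:creator", "Abiteboul"),
    ("b2", "dc:creator", "Bry"),
]

def books_by(author):
    """Rule-style join: a book qualifies if the XML document contains it
    AND the metadata asserts the given creator, i.e. the kind of combined
    data/metadata condition a language like Xcerpt expresses in one rule."""
    ids = {s for (s, p, o) in triples if p == "dc:creator" and o == author}
    root = ET.fromstring(xml_data)
    return [b.findtext("title") for b in root.iter("book") if b.get("id") in ids]

print(books_by("Bry"))  # -> ['Web Queries']
```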
A Query Integrator and Manager for the Query Web
We introduce two concepts: the Query Web as a layer of interconnected queries over the document web and the semantic web, and a Query Web Integrator and Manager (QI) that enables the Query Web to evolve. QI permits users to write, save and reuse queries over any web-accessible source, including other queries saved in other installations of QI. The saved queries may be in any language (e.g. SPARQL, XQuery); the only condition for interconnection is that the queries return their results in some form of XML. This condition allows queries to chain off each other, and to be written in whatever language is appropriate for the task. We illustrate the potential use of QI for several biomedical use cases, including ontology view generation using a combination of graph-based and logical approaches, value set generation for clinical data management, image annotation using terminology obtained from an ontology web service, ontology-driven brain imaging data integration, small-scale clinical data integration, and wider-scale clinical data integration. Such use cases illustrate the current range of applications of QI and lead us to speculate about the potential evolution from smaller groups of interconnected queries into a larger query network that layers over the document and semantic web. The resulting Query Web could greatly aid researchers and others who now have to manually navigate through multiple information sources in order to answer specific questions.
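QI's interconnection condition, that every saved query return its results as XML, is what lets queries chain off each other. A minimal sketch in Python, with both "saved queries" mocked as local functions (the real ones would run SPARQL or XQuery against remote sources; all names and data are invented):

```python
import xml.etree.ElementTree as ET

def query_genes():
    # Stand-in for a saved SPARQL query against an ontology service.
    return "<genes><gene>BRCA1</gene><gene>TP53</gene></genes>"

def query_annotations(genes_xml):
    # Stand-in for a saved XQuery over a clinical source, chained off
    # the first query's XML output.
    genes = [g.text for g in ET.fromstring(genes_xml).iter("gene")]
    items = "".join(f"<item gene='{g}'/>" for g in genes)
    return f"<annotations>{items}</annotations>"

# Chaining works because the only contract between queries is "returns XML".
result = query_annotations(query_genes())
print(result)
```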
Answering Queries using Views over Probabilistic XML: Complexity and Tractability
We study the complexity of query answering using views in a probabilistic XML setting, identifying large classes of XPath queries -- with child and descendant navigation and predicates -- for which there are efficient (PTime) algorithms. We consider this problem under the two possible semantics for XML query results: with persistent node identifiers and in their absence. Accordingly, we consider rewritings that can exploit a single view, by means of compensation, and rewritings that can use multiple views, by means of intersection. Since in a probabilistic setting queries return answers with probabilities, the problem of rewriting goes beyond the classic one of retrieving XML answers from views. For both semantics of XML queries, we show that, even when XML answers can be retrieved from views, their probabilities may not be computable. For rewritings that use only compensation, we describe a PTime decision procedure, based on easily verifiable criteria that distinguish between the feasible cases -- when probabilistic XML results are computable -- and the infeasible ones. For rewritings that can use multiple views, with compensation and intersection, we identify the most permissive conditions that make probabilistic rewriting feasible, and we describe an algorithm that is sound in general and becomes complete under fairly permissive restrictions, running in PTime modulo worst-case exponential-time equivalence tests. This is the best we can hope for, since intersection makes query equivalence intractable already over deterministic data. Our algorithm runs in PTime whenever deterministic rewritings can be found in PTime.
Comment: VLDB201
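For intuition on why query answers carry probabilities in this setting, consider the simplest probabilistic XML model: a p-document whose edges have independent existence probabilities, where the probability that a node occurs in a random document instance is the product of the probabilities along its root path. A short Python sketch (the tree and its probabilities are invented, and this covers only the independent-edges fragment, not mux nodes):

```python
# Each entry maps a node to the probability of the edge from its parent
# and to its parent node (None for the root).
tree = {
    "root": {"prob": 1.0, "parent": None},
    "a":    {"prob": 0.8, "parent": "root"},
    "b":    {"prob": 0.5, "parent": "a"},
}

def existence_probability(node):
    """Probability that `node` exists: product of edge probabilities
    on the path from the root, since edges are independent."""
    p = 1.0
    while node is not None:
        p *= tree[node]["prob"]
        node = tree[node]["parent"]
    return p

print(existence_probability("b"))  # 0.8 * 0.5 = 0.4
```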
Extraction of Web Information Using W4F Wrapper Factory and XML-QL Query Language
In many ways, the Web has become the largest knowledge base known to us. The problem facing users now is not that the information they seek is unavailable, but that it is not easy to extract exactly what they need from what is available. It is also becoming clear that a top-down approach of gathering all the information and structuring it will not work, except in some special cases. Indeed, most of the information is present in HTML documents structured only for visual content. Instead, new tools are being developed that attack this problem from a different angle.
XML is a language that allows the publisher of the data to structure it using markup tags. These markup tags clarify not only the visual structure of the document, but also the semantic structure. Additionally, one can make use of a query language, XML-QL, to query XML pages for information, and to merge information from disparate XML sources. However, most of the content of the Web is published in HTML. The W4F system allows us to construct wrappers that retrieve web pages, extract desired information using the HTML structure and regular-expression search, and map it automatically to XML with its XML-Gateway feature.
In this thesis, we investigate the W4F/XML-QL paradigm to query the Web. Two examples are presented. The first is the Internet Movie Database, and we query it with the idea of understanding the power of these systems. The second is the NCBI BLAST server, which is a suite of programs for biomolecular sequence analysis. We demonstrate that there are real-life instances where this paradigm promises to be extremely useful.
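The W4F extraction step the thesis relies on can be approximated in a few lines of Python: regular expressions pull fields out of presentation-oriented HTML, and the result is republished as XML that a query language such as XML-QL could then process. The HTML snippet and element names below are invented for illustration:

```python
import re
import xml.etree.ElementTree as ET

# Presentation-oriented HTML, as a movie page might render it.
html = """
<html><body>
  <h1>The Matrix (1999)</h1>
  <p>Rating: 8.7</p>
</body></html>
"""

# Wrapper rules: regular expressions keyed to the page's visual structure.
m = re.search(r"<h1>(.+?) \((\d{4})\)</h1>", html)
r = re.search(r"Rating: ([\d.]+)", html)

# Map the extracted fields to XML (the role of W4F's XML-Gateway).
movie = ET.Element("movie")
ET.SubElement(movie, "title").text = m.group(1)
ET.SubElement(movie, "year").text = m.group(2)
ET.SubElement(movie, "rating").text = r.group(1)

print(ET.tostring(movie, encoding="unicode"))
# -> <movie><title>The Matrix</title><year>1999</year><rating>8.7</rating></movie>
```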
Computing Health Quality Measures Using Informatics for Integrating Biology and the Bedside
Background: The Health Quality Measures Format (HQMF) is a Health Level 7 (HL7) standard for expressing computable Clinical Quality Measures (CQMs). Creating tools to process HQMF queries in clinical databases will become increasingly important as the United States moves forward with its Health Information Technology Strategic Plan to Stages 2 and 3 of the Meaningful Use incentive program (MU2 and MU3). Informatics for Integrating Biology and the Bedside (i2b2) is one of the analytical databases used as part of the Office of the National Coordinator (ONC)’s Query Health platform to move toward this goal. Objective: Our goal is to integrate i2b2 with the Query Health HQMF architecture, to prepare for other HQMF use cases (such as MU2 and MU3), and to articulate the functional overlap between i2b2 and HQMF. Therefore, we analyze the structure of HQMF, and then we apply this understanding to HQMF computation on the i2b2 clinical analytical database platform. Specifically, we develop a translator between two query languages, HQMF and i2b2, so that the i2b2 platform can compute HQMF queries. Methods: We use the HQMF structure of queries for aggregate reporting, which define clinical data elements and the temporal and logical relationships between them. We use the i2b2 XML format, which allows flexible querying of a complex clinical data repository in an easy-to-understand domain-specific language. Results: The translator can represent nearly any i2b2-XML query as HQMF and execute in i2b2 nearly any HQMF query expressible in i2b2-XML. This translator is part of the freely available reference implementation of the Query Health initiative. We analyze limitations of the conversion and find it covers many, but not all, of the complex temporal and logical operators required by quality measures.
Conclusions: HQMF is an expressive language for defining quality measures, and it will be important to understand and implement for CQM computation, in both meaningful use and population health. However, its current form might allow complexity that is intractable for current database systems (both in terms of implementation and computation). Our translator, which supports the subset of HQMF currently expressible in i2b2-XML, may represent the beginnings of a practical compromise. It is being pilot-tested in two Query Health demonstration projects, and it can be further expanded to balance computational tractability with the advanced features needed by measure developers.
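The shape of such a translator can be illustrated with a toy example. The element names below are invented and do not follow the real i2b2-XML or HQMF schemas; the point is only the structural rewrite from one query dialect's panels of items to another dialect's groups of criteria:

```python
import xml.etree.ElementTree as ET

# A made-up source query: one panel OR-ing two coded items.
source_query = """
<query>
  <panel invert="false">
    <item>ICD9:250.00</item>
    <item>ICD9:401.9</item>
  </panel>
</query>
"""

def translate(src_xml):
    """Rewrite each <panel> of <item>s into a <criteriaGroup> of
    <criterion>s. A real HQMF/i2b2 translator must additionally handle
    temporal constraints, negation, and nesting."""
    src = ET.fromstring(src_xml)
    target = ET.Element("dataCriteriaSection")
    for panel in src.iter("panel"):
        group = ET.SubElement(target, "criteriaGroup", conjunction="OR")
        for item in panel.iter("item"):
            ET.SubElement(group, "criterion", code=item.text)
    return target

out = translate(source_query)
print(ET.tostring(out, encoding="unicode"))
```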
Sound ranking algorithms for XML search
Ranking algorithms for XML should reflect the actual combined content and structure constraints of queries, while at the same time producing equal rankings for queries that are semantically equal. Ranking algorithms that produce different rankings for queries that are semantically equal are easily detected by tests on large databases: we call such algorithms not sound. We report the behavior of different approaches to ranking content-and-structure queries on pairs of queries for which we expect equal ranking results from the query semantics. We show that most of these approaches are not sound. Of the remaining approaches, only three adhere to the W3C XQuery Full-Text standard.
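The soundness test described above can be sketched as a small harness: two scoring functions standing in for semantically equal queries must produce identical rankings. The scoring functions and documents here are placeholders, not real XML rankers:

```python
def rank(score_fn, docs):
    """Rank documents by descending score (ties broken by ID)."""
    return [d for d, _ in sorted(((d, score_fn(d)) for d in docs),
                                 key=lambda x: (-x[1], x[0]))]

def is_sound_on(q1_score, q2_score, docs):
    """True iff the two (semantically equal) query variants rank
    the documents identically."""
    return rank(q1_score, docs) == rank(q2_score, docs)

docs = ["d1", "d2", "d3"]
scores = {"d1": 0.2, "d2": 0.9, "d3": 0.5}
q1 = lambda d: scores[d]
q2 = lambda d: 2 * scores[d]   # monotone rescaling: same ranking, sound pair
q3 = lambda d: -scores[d]      # reversed ranking: detected as not sound

print(is_sound_on(q1, q2, docs))  # True
print(is_sound_on(q1, q3, docs))  # False
```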
Survey over Existing Query and Transformation Languages
A widely acknowledged obstacle for realizing the vision of the Semantic Web is the inability of many current Semantic Web approaches to cope with data available in such diverging representation formalisms as XML, RDF, or Topic Maps. A common query language is the first step to allow transparent access to data in any of these formats. To further the understanding of the requirements and approaches proposed for query languages in the conventional as well as the Semantic Web, this report surveys a large number of query languages for accessing XML, RDF, or Topic Maps. This is the first systematic survey to consider query languages from all these areas. From the detailed survey of these query languages, a common classification scheme is derived that is useful for understanding and differentiating languages within and among all three areas.
Identification of Design Principles
This report identifies those design principles for a (possibly new) query and transformation language for the Web supporting inference that are considered essential. Based upon these design principles, an initial strawman is selected. Scenarios for querying the Semantic Web illustrate the design principles and their reflection in the initial strawman, i.e., a first draft of the query language to be designed and implemented by the REWERSE working group I4.
New model for datasets citation and extraction reproducibility in VAMDC
In this paper we present a new paradigm for the identification of datasets extracted from the Virtual Atomic and Molecular Data Centre (VAMDC) e-science infrastructure. Such identification includes information on the origin and version of the datasets, references associated to individual data in the datasets, as well as timestamps linked to the extraction procedure. This paradigm is described through the modifications of the language used to exchange data within the VAMDC and through the services that will implement those modifications. This new paradigm should enforce traceability of datasets, favour reproducibility of dataset extraction, and facilitate the systematic citation of the authors having originally measured and/or calculated the extracted atomic and molecular data.
Comment: 48 page
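The kind of extraction record the paper argues for can be sketched as follows. The field names, the node URL, and the sample data are illustrative, not the actual VAMDC schema:

```python
import hashlib
import json
from datetime import datetime, timezone

def make_citation_record(node_url, version, data, references):
    """Stamp an extracted dataset with its origin, data-model version,
    per-datum references, and an extraction timestamp, so the extraction
    is traceable and citable."""
    payload = json.dumps(data, sort_keys=True).encode()
    return {
        "origin": node_url,
        "schema_version": version,
        "extracted_at": datetime.now(timezone.utc).isoformat(),
        "references": references,
        # Content hash lets a later re-extraction be checked for
        # bit-level reproducibility against the cited one.
        "content_hash": hashlib.sha256(payload).hexdigest(),
    }

rec = make_citation_record(
    "http://vamdc.example.org/node",       # hypothetical data node
    "12.07",                               # hypothetical schema version
    data=[{"species": "CO", "wavenumber": 2143.27}],
    references=["bibcode-of-original-measurement"],
)
print(sorted(rec))
```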