13 research outputs found

    Approximation and relaxation of semantic web path queries

    Get PDF
    Given the heterogeneity of complex graph data on the web, such as RDF linked data, it is likely that a user wishing to query such data will lack full knowledge of the structure of the data and of its irregularities. Hence, providing flexible querying capabilities that assist users in formulating their information seeking requirements is highly desirable. In this paper we undertake a detailed theoretical investigation of query approximation, query relaxation, and their combination, for this purpose. The query language we adopt comprises conjunctions of regular path queries, thus encompassing recent extensions to SPARQL to allow for querying paths in graphs using regular expressions (SPARQL 1.1). To this language we add standard notions of query approximation based on edit distance, as well as query relaxation based on RDFS inference rules. We show how both of these notions can be integrated into a single theoretical framework and we provide incremental evaluation algorithms that run in polynomial time in the size of the query and the data, returning answers in ranked order of their `distance' from the original query. We also combine for the first time these two disparate notions into a single `flex' operation that simultaneously applies both approximation and relaxation to a query conjunct, providing even greater flexibility for users, but still retaining polynomial time evaluation complexity and the ability to return query answers in ranked order

    Distributed pattern mining and data publication in life sciences using big data technologies

    Get PDF

    On the formulation of performant sparql queries

    Get PDF
    AbstractThe combination of the flexibility of RDF and the expressiveness of SPARQL provides a powerful mechanism to model, integrate and query data. However, these properties also mean that it is nontrivial to write performant SPARQL queries. Indeed, it is quite easy to create queries that tax even the most optimised triple stores. Currently, application developers have little concrete guidance on how to write “good” queries. The goal of this paper is to begin to bridge this gap. It describes 5 heuristics that can be applied to create optimised queries. The heuristics are informed by formal results in the literature on the semantics and complexity of evaluating SPARQL queries, which ensures that queries following these rules can be optimised effectively by an underlying RDF store. Moreover, we empirically verify the efficacy of the heuristics using a set of openly available datasets and corresponding SPARQL queries developed by a large pharmacology data integration project. The experimental results show improvements in performance across six state-of-the-art RDF stores

    Data

    No full text
    <p>The datasets used in the evaluation of the 5 heuristics proposed in "On the formulation of performant SPARQL queries" by A. Loizou, R. Angles, and P. Groth.</p

    Results

    No full text
    <p>The results obtained from the evaluation of the 5 heuristics proposed in "On the formulation of performant SPARQL queries" by A. Loizou,  R. Angles, and P. Groth </p
    corecore