3 research outputs found

    Intuitionistic fuzzy XML query matching and rewriting

    With the emergence of XML as a standard for data representation, particularly on the web, the need for intelligent query languages that can operate on structurally heterogeneous XML documents has grown considerably. Traditional Information Retrieval and Database approaches have limitations when dealing with such scenarios, so fuzzy (flexible) approaches have become predominant. In this thesis, we propose a new approach for approximate XML query matching and rewriting, which aims at soft matching of XML queries against XML data sources that follow different schemas. Unlike traditional querying approaches, which require exact matching, the proposed approach uses Intuitionistic Fuzzy Trees to achieve approximate (soft) query matching. With this approach, not only the exact answer to a query but also approximate answers are retrieved. Furthermore, partial results can be obtained from multiple data sources and merged to produce a single answer to a query. The proposed approach introduces a new tree similarity measure that considers the minimum and maximum degrees of similarity/inclusion of trees, based on arc matching. New techniques for soft node and arc matching are presented for matching queries against data sources with highly varied structures. A prototype was developed to test the proposed ideas; it demonstrated the ability to achieve approximate matching of pattern queries against a number of XML schemas and to rewrite the original query so that it obtains results from the underlying data sources. This was achieved through several novel algorithms, which were tested and shown to be efficient, with low CPU and memory cost even for a large number of data sources.
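
    As a rough illustration of a similarity measure with minimum and maximum degrees based on arc matching, the sketch below scores how far a query tree is included in a data tree. It is not the thesis's measure: the scoring rule (direct parent-child matches count towards both bounds, ancestor-descendant matches only towards the upper bound), the function names, and the assumption that element labels are unique within a tree are all illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple

Arc = Tuple[str, str]  # (parent label, child label)


def descendants(children: Dict[str, List[str]], node: str) -> Set[str]:
    """Collect every label reachable below `node` in the data tree."""
    seen: Set[str] = set()
    stack = list(children.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, []))
    return seen


def inclusion_interval(
    query_arcs: List[Arc], data_children: Dict[str, List[str]]
) -> Tuple[float, float]:
    """Return (min_degree, max_degree) of inclusion of the query in the data.

    Assumed scoring rule (hypothetical, for illustration only):
    - an arc found as a direct parent-child edge is a certain match;
    - an arc found only as ancestor-descendant is a possible match;
    - any other arc is a non-match.
    """
    if not query_arcs:
        return (1.0, 1.0)
    direct = 0
    indirect = 0
    for parent, child in query_arcs:
        if child in data_children.get(parent, []):
            direct += 1
        elif child in descendants(data_children, parent):
            indirect += 1
    total = len(query_arcs)
    return (direct / total, (direct + indirect) / total)


# Query: book with title, author and price as direct children.
query = [("book", "title"), ("book", "author"), ("book", "price")]
# Data source: author is nested under an extra "authors" element, no price.
data = {"book": ["title", "authors"], "authors": ["author"]}
low, high = inclusion_interval(query, data)
```

    Under these assumptions, the gap between the two bounds reflects arcs that match only after structural softening, which loosely mirrors the hesitation margin of an intuitionistic fuzzy set.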

    Optimisation techniques for flexible SPARQL queries

    RDF datasets can be queried using the SPARQL language but are often irregularly structured and incomplete, which may make precise query formulation hard for users. The SPARQL^{AR} language extends SPARQL 1.1 with two operators, APPROX and RELAX, so as to allow flexible querying over property paths. These operators encapsulate different dimensions of query flexibility, namely approximation and generalisation, and they allow users to query complex, heterogeneous knowledge graphs without needing to know precisely how the data is structured. Earlier work has described the syntax, semantics and complexity of SPARQL^{AR} and has demonstrated its practical feasibility, but has also highlighted the need for improving the speed of query evaluation. In the present paper, we focus on the design of two optimisation techniques targeted at speeding up the execution of SPARQL^{AR} queries and on their empirical evaluation on three knowledge graphs: LUBM, DBpedia and YAGO. We show that applying these optimisations can result in substantial improvements in the execution times of longer-running queries (sometimes by one or more orders of magnitude) without incurring significant performance penalties for fast queries.
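
    To make the idea of generalisation concrete, the sketch below enumerates relaxed versions of a single property by walking up a sub-property hierarchy and charging a cost per step. This is only one plausible reading of generalisation, not the SPARQL^{AR} semantics or the paper's optimisations; the function name, the unit step cost, and the `ex:` property hierarchy are hypothetical.

```python
from typing import Dict, List, Tuple


def relax_property(
    prop: str, super_property: Dict[str, str], step_cost: int = 1
) -> List[Tuple[str, int]]:
    """List (property, cost) pairs from the original property upwards.

    `super_property` maps a property to its immediate super-property
    (an assumed, simplified ontology with at most one parent each).
    """
    relaxed = [(prop, 0)]  # the unrelaxed property costs nothing
    seen = {prop}
    current = prop
    cost = 0
    while current in super_property:
        current = super_property[current]
        if current in seen:  # guard against cycles in the hierarchy
            break
        seen.add(current)
        cost += step_cost
        relaxed.append((current, cost))
    return relaxed


# Hypothetical hierarchy: headOf is a sub-property of worksFor,
# which in turn is a sub-property of memberOf.
hierarchy = {"ex:headOf": "ex:worksFor", "ex:worksFor": "ex:memberOf"}
alternatives = relax_property("ex:headOf", hierarchy)
```

    Answers obtained through a more general property would carry a higher cost, so that exact answers can be returned first.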

    Adaptive relaxation for querying heterogeneous XML data sources

    Searching XML data with a structured XML query can improve the precision of results compared with a keyword search. However, the structural heterogeneity of the large number of XML data sources makes it difficult to answer a structured query exactly, so query relaxation is necessary. Previous work on XML query relaxation suffers from the unnecessary computation of a large number of unqualified relaxed queries. To address this issue, we propose an adaptive relaxation approach, which relaxes a query differently against different data sources based on the schemas they conform to. In this paper, we present a set of techniques that support this approach, including schema-aware relaxation rules for relaxing a query adaptively, a weighted model for ranking relaxed queries, and algorithms for adaptive relaxation of a query and top-k query processing. We discuss results from a comprehensive set of experiments that show the effectiveness and efficiency of our approach.
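
    The combination of schema-aware filtering and weighted ranking can be sketched as follows. This is a simplified stand-in rather than the paper's rules or algorithms: queries are reduced to label sequences, a relaxed query counts as qualified when it is a subsequence of some path in the source's schema, and the penalty weights are hypothetical.

```python
import heapq
from typing import Iterable, List, Set, Tuple

Pattern = Tuple[str, ...]  # a path query reduced to a sequence of labels


def is_subsequence(pattern: Pattern, path: Pattern) -> bool:
    """True if the labels of `pattern` occur in `path` in the same order."""
    remaining = iter(path)
    return all(label in remaining for label in pattern)


def top_k_relaxations(
    candidates: Iterable[Tuple[Pattern, float]],
    schema_paths: Set[Pattern],
    k: int,
) -> List[Tuple[Pattern, float]]:
    """Keep only relaxed queries the schema can answer, ranked by penalty."""
    qualified = [
        (penalty, pattern)
        for pattern, penalty in candidates
        if any(is_subsequence(pattern, path) for path in schema_paths)
    ]
    # The lowest total penalty means the least relaxation from the original.
    return [(pattern, penalty) for penalty, pattern in heapq.nsmallest(k, qualified)]


# Original query plus two relaxations, each with an assumed penalty weight.
candidates = [
    (("book", "author", "name"), 0.0),  # original query
    (("book", "name"), 0.2),            # intermediate step dropped
    (("name",), 0.6),                   # only the leaf kept
]
source_a = {("book", "author", "name")}
source_b = {("book", "editor", "name")}
best_a = top_k_relaxations(candidates, source_a, 2)
best_b = top_k_relaxations(candidates, source_b, 2)
```

    In this sketch the same query yields different ranked relaxations per source: the original query survives for `source_a`, while for `source_b` it is discarded before any evaluation because the schema has no `author` step.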