7 research outputs found

    Implementing flexible operators for regular path queries

    Given the heterogeneity of complex graph data on the web, such as RDF linked data, a user wishing to query such data may lack full knowledge of its structure and irregularities. Hence, providing users with flexible querying capabilities can be beneficial. The query language we adopt comprises conjunctions of regular path queries, thus including extensions proposed for SPARQL 1.1 to allow for querying paths using regular expressions. To this language we add two operators: APPROX, supporting standard notions of approximation based on edit distance, and RELAX, which performs query relaxation based on RDFS inference rules. We describe our techniques for implementing the extended language and present a performance study undertaken on two real-world data sets. Our baseline implementation performs competitively with other automaton-based approaches, and we demonstrate empirically how various optimisations can decrease execution times of queries containing APPROX and RELAX, sometimes by orders of magnitude.

    Approximate querying for the Property Graph Language Cypher

    Graph databases are well-suited to managing large, complex, dynamically evolving datasets. However, for data that is irregular and heterogeneous, it may be difficult to formulate queries that precisely capture a user's information seeking requirements. This points to the need for approximate query processing capabilities that can automatically make changes to a query so as to aid in the incremental discovery of relevant information. In this paper we motivate and explore techniques for providing such capabilities for the Cypher query language. This is the first time that query approximation has been investigated in the context of the property graph data model, which is becoming increasingly prevalent in research and industry.

    Hive open research network platform.


    Applications of flexible querying to graph data

    Graph data models provide flexibility and extensibility that make them well-suited to modelling data that may be irregular, complex, and evolving in structure and content. However, a consequence of this is that users may not be familiar with the full structure of the data, which itself may be changing over time, making it hard for users to formulate queries that precisely match the data graph and meet their information seeking requirements. There is a need therefore for flexible querying systems over graph data that can automatically make changes to the user's query so as to find additional or different answers, and so help the user to retrieve information of relevance to them. This chapter describes recent work in this area, looking at a variety of graph query languages, applications, flexible querying techniques and implementations.

    Optimisation techniques for flexible SPARQL queries

    RDF datasets can be queried using the SPARQL language but are often irregularly structured and incomplete, which may make precise query formulation hard for users. The SPARQL^{AR} language extends SPARQL 1.1 with two operators - APPROX and RELAX - so as to allow flexible querying over property paths. These operators encapsulate different dimensions of query flexibility, namely approximation and generalisation, and they allow users to query complex, heterogeneous knowledge graphs without needing to know precisely how the data is structured. Earlier work has described the syntax, semantics and complexity of SPARQL^{AR}, has demonstrated its practical feasibility, but has also highlighted the need for improving the speed of query evaluation. In the present paper, we focus on the design of two optimisation techniques targeted at speeding up the execution of SPARQL^{AR} queries and on their empirical evaluation on three knowledge graphs: LUBM, DBpedia and YAGO. We show that applying these optimisations can result in substantial improvements in the execution times of longer-running queries (sometimes by one or more orders of magnitude) without incurring significant performance penalties for fast queries.

    Link Prediction of Weighted Triples for Knowledge Graph Completion Within the Scholarly Domain

    Knowledge graphs (KGs) are widely used for modeling scholarly communication, performing scientometric analyses, and supporting a variety of intelligent services to explore the literature and predict research dynamics. However, they often suffer from incompleteness (e.g., missing affiliations, references, research topics), leading to a reduced scope and quality of the resulting analyses. This issue is usually tackled by computing knowledge graph embeddings (KGEs) and applying link prediction techniques. However, only a few KGE models are capable of taking weights of facts in the knowledge graph into account. Such weights can have different meanings, e.g. they may describe the degree of association or the degree of truth of a certain triple. In this paper, we propose the Weighted Triple Loss, a new loss function for KGE models that takes full advantage of the additional numerical weights on facts and is even tolerant of incorrect weights. We also extend the Rule Loss, a loss function that is able to exploit a set of logical rules, so that it works with weighted triples. The evaluation of our solutions on several knowledge graphs indicates significant performance improvements with respect to the state of the art. Our main use case is the large-scale AIDA knowledge graph, which describes 21 million research articles. Our approach makes it possible to complete information about affiliation types, countries, and research topics, greatly improving the scope of the resulting scientometric analyses and providing better support to systems for monitoring and predicting research dynamics.