SHARQL: Shape Analysis of Recursive SPARQL Queries

Authors: Bonifati Angela, Martens Wim, Timm Thomas
Publication venue: Association for Computing Machinery (ACM)
Publication date: 14/06/2020

We showcase SHARQL, a system that allows users to navigate SPARQL query logs, can inspect complex queries by visualizing their shape, and can serve as a back-end to flexibly produce statistics about the logs. Even though SPARQL query logs are increasingly available and have become public recently, their navigation and analysis are hampered by the lack of appropriate tools. SPARQL queries are sometimes hard to understand, and their inherent properties, such as their shape, their hypertree properties, and their property paths, are even more difficult to identify and properly render. In SHARQL, we show how the analysis and exploration of several hundred million queries is possible. We offer edge rendering which works with complex hyperedges, regular edges, and property paths of SPARQL queries. The underlying database stores more than one hundred attributes per query and is therefore extremely flexible for exploring the query logs and as a back-end to compute and display analytical properties of the entire logs or parts thereof.

Dagstuhl Research Online Publication Server

A Trichotomy for Regular Trail Queries

Authors: Martens Wim, Niewerth Matthias, Trautner Tina
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)
Publication date: 01/01/2020

Regular path queries (RPQs) are an essential component of graph query languages. Such queries consider a regular expression r and a directed edge-labeled graph G and search for paths in G for which the sequence of labels is in the language of r. In order to avoid having to consider infinitely many paths, some database engines restrict such paths to be trails, that is, they only consider paths without repeated edges. In this paper we consider the evaluation problem for RPQs under trail semantics, in the case where the expression is fixed. We show that, in this setting, there exists a trichotomy. More precisely, the complexity of RPQ evaluation divides the regular languages into the finite languages, the class T_tract (for which the problem is tractable), and the rest. Interestingly, the tractable class in the trichotomy is larger than for the trichotomy for simple paths, discovered by Bagan et al. [Bagan et al., 2013]. In addition to this trichotomy result, we also study characterizations of the tractable class, its expressivity, the recognition problem, and closure properties, and we show how the decision problem can be extended to the enumeration problem, which is relevant to practice.
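To make the trail restriction in the abstract above concrete, here is a minimal brute-force sketch in Python. It is not the algorithm from the paper and makes no claim about the tractable class: it simply enumerates trails by backtracking, which can take exponential time. The function name `has_matching_trail`, the edge-tuple encoding, and the representation of the expression as a hand-written DFA transition table are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source node, label, target node)


def has_matching_trail(
    edges: List[Edge],
    delta: Dict[Tuple[int, str], int],  # DFA transitions: (state, label) -> state
    start_state: int,
    accepting: Set[int],
    source: str,
    target: str,
) -> bool:
    """Return True if some trail (no repeated edge) from source to target
    spells a word accepted by the DFA. Brute force, illustrative only."""
    # Index outgoing edges by node; the list position serves as the edge identity.
    out: Dict[str, List[Tuple[int, str, str]]] = {}
    for idx, (u, label, v) in enumerate(edges):
        out.setdefault(u, []).append((idx, label, v))

    def dfs(node: str, state: int, used: Set[int]) -> bool:
        if node == target and state in accepting:
            return True
        for idx, label, nxt in out.get(node, []):
            if idx in used:
                continue  # trail semantics: each edge at most once
            next_state = delta.get((state, label))
            if next_state is None:
                continue  # the DFA has no transition on this label
            used.add(idx)
            if dfs(nxt, next_state, used):
                return True
            used.discard(idx)  # backtrack
        return False

    return dfs(source, start_state, set())


# Hypothetical example graph: a self-loop labeled "a" on x, and an edge x -b-> y.
example_edges: List[Edge] = [("x", "a", "x"), ("x", "b", "y")]

# DFA for the expression a*b.
dfa_a_star_b = {(0, "a"): 0, (0, "b"): 1}
# DFA for the expression aab (exactly two a's, then b).
dfa_aab = {(0, "a"): 1, (1, "a"): 2, (2, "b"): 3}

print(has_matching_trail(example_edges, dfa_a_star_b, 0, {1}, "x", "y"))  # True
print(has_matching_trail(example_edges, dfa_aab, 0, {3}, "x", "y"))       # False
```

In the example, the word aab would match under arbitrary path semantics by traversing the self-loop twice, but no trail spells it, because the single "a" edge may be used only once.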

Exploration of Large-Scale SPARQL Query Collections : Finding Structure and Regularity for Optimizing Database Systems

Author: Timm Thomas
Publication date: 01/01/2020

EPub Bayreuth

An Analytical Study of Large SPARQL Query Logs

Authors: Bonifati Angela, Martens Wim, Timm Thomas
Publication venue: Springer Science and Business Media LLC
Publication date: 01/06/2020

With the adoption of RDF as the data model for Linked Data and the Semantic Web, query specification from end-users has become more and more common in SPARQL endpoints. In this paper, we conduct an in-depth analytical study of the queries formulated by end-users and harvested from large and up-to-date structured query logs from a wide variety of RDF data sources. As opposed to previous studies, ours is the first assessment on a voluminous query corpus, spanning several years and covering many representative SPARQL endpoints. Apart from the syntactical structure of the queries, which already exhibits interesting results on this generalized corpus, we drill deeper into the structural characteristics related to the graph and hypergraph representation of queries. We outline the most common shapes of queries when visually displayed as undirected graphs, and characterize their tree width, length of their cycles, maximal degree of nodes, and more. For queries that cannot be adequately represented as graphs, we investigate their hypergraphs and hypertree width. Moreover, we analyze the evolution of queries over time, by introducing the novel concept of a streak, i.e., a sequence of queries that appear as subsequent modifications of

arXiv.org e-Print Archive

Large Language Models and Knowledge Graphs: Opportunities and Challenges

Authors: Biswas Russa, Bonifati Angela, Chen Jiaoyan, de Melo Gerard, Dietze Stefan, Dragoni Mauro, Graux Damien, Jabeen Hajira, Kalo Jan-Christoph, Lissandrini Matteo, Omeliyanenko Janna, Pan Jeff Z., Razniewski Simon, Singhania Sneha, Vakaj Edlira, Zhang Wen
Publication date: 11/08/2023

Large Language Models (LLMs) have taken Knowledge Representation -- and the world -- by storm. This inflection point marks a shift from explicit knowledge representation to a renewed focus on the hybrid representation of both explicit knowledge and parametric knowledge. In this position paper, we will discuss some of the common debate points within the community on LLMs (parametric knowledge) and Knowledge Graphs (explicit knowledge) and speculate on opportunities and visions that the renewed focus brings, as well as related research topics and challenges. Comment: 30 pages

Regular Path Query Evaluation on Streaming Graphs

Authors: Bonifati Angela, Pacaci Anil, Özsu M. Tamer
Publication date: 04/04/2020

We study persistent query evaluation over streaming graphs, which is becoming increasingly important. We focus on navigational queries that determine if there exists a path between two entities that satisfies a user-specified constraint. We adopt the Regular Path Query (RPQ) model that specifies navigational patterns with labeled constraints. We propose deterministic algorithms to efficiently evaluate persistent RPQs under both arbitrary and simple path semantics in a uniform manner. Experimental analysis on real and synthetic streaming graphs shows that the proposed algorithms can process up to tens of thousands of edges per second and efficiently answer RPQs that are commonly used in real-world workloads. Comment: A shorter version of this paper has been accepted for publication in 2020 International Conference on Management of Data (SIGMOD 2020)

arXiv.org e-Print Archive