Search CORE

77 research outputs found

An Empirical Study of Real-World SPARQL Queries

Author: Arias Mario
de la Fuente Pablo
Fernández Javier D.
Martínez-Prieto Miguel A.
Publication date: 01/01/2011

Understanding how users tailor their SPARQL queries is crucial when designing query evaluation engines or fine-tuning RDF stores with performance in mind. In this paper we analyze 3 million real-world SPARQL queries extracted from logs of the DBPedia and SWDF public endpoints. We aim to find the most used language elements from both syntactic and structural perspectives, paying special attention to triple patterns and joins, since they are among the most expensive SPARQL operations in the evaluation phase. We have determined that most of the queries are simple and include few triple patterns and joins, with Subject-Subject, Subject-Object and Object-Object being the most common join types. The graph patterns are usually star-shaped and, although triple pattern chains exist, they are generally short.

Comment: 1st International Workshop on Usage Analysis and the Web of Data (USEWOD2011) in the 20th International World Wide Web Conference (WWW2011), Hyderabad, India, March 28th, 201

arXiv.org e-Print Archive

CiteSeerX
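
The join types named in the abstract above (Subject-Subject, Subject-Object, Object-Object) can be illustrated with a minimal sketch. It assumes triple patterns are plain 3-tuples of strings and that variables start with "?"; the function names and the example pattern (with its dbo: terms) are hypothetical and are not taken from the paper.

```python
from itertools import combinations

POSITIONS = ("Subject", "Predicate", "Object")

def is_var(term):
    # Assumption: variables are written SPARQL-style, with a leading "?".
    return term.startswith("?")

def join_types(patterns):
    """List (i, j, variable, kind) for every variable shared by patterns i and j."""
    joins = []
    for (i, p), (j, q) in combinations(enumerate(patterns), 2):
        for a, term_a in enumerate(p):
            if not is_var(term_a):
                continue
            for b, term_b in enumerate(q):
                if term_a == term_b:
                    # Join types are unordered, so Object-Subject is
                    # reported as Subject-Object.
                    lo, hi = min(a, b), max(a, b)
                    joins.append((i, j, term_a, f"{POSITIONS[lo]}-{POSITIONS[hi]}"))
    return joins

# Hypothetical graph pattern: two patterns share ?film as subject (a small
# star), and ?dir links the second pattern to the third (a short chain).
example = [
    ("?film", "rdf:type", "dbo:Film"),
    ("?film", "dbo:director", "?dir"),
    ("?dir", "dbo:birthPlace", "?place"),
]

for i, j, var, kind in join_types(example):
    print(f"patterns {i} and {j} join on {var}: {kind}")
```

Counting the kinds returned over a whole query log would give a rough join-type distribution of the sort the abstract describes, although the paper's own methodology may differ.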

Ranking Archived Documents for Structured Queries on Semantic Layers

Author: Arikan Irem
Balog Krisztian
Fafalios P.
Feyznia Azam
Halpin Harry
Latifi Sara
Mulay Kunal
Ngonga Ngomo Axel-Cyrille
Tran Nam Khanh
Publication date: 23/10/2018

Archived collections of documents (like newspaper and web archives) serve as important information sources in a variety of disciplines, including Digital Humanities, Historical Science, and Journalism. However, the absence of efficient and meaningful exploration methods still remains a major hurdle in the way of turning them into usable sources of information. A semantic layer is an RDF graph that describes metadata and semantic information about a collection of archived documents, which in turn can be queried through a semantic query language (SPARQL). This allows running advanced queries by combining metadata of the documents (like publication date) and content-based semantic information (like entities mentioned in the documents). However, the results returned by such structured queries can be numerous and moreover they all equally match the query. In this paper, we deal with this problem and formalize the task of "ranking archived documents for structured queries on semantic layers". Then, we propose two ranking models for the problem at hand which jointly consider: i) the relativeness of documents to entities, ii) the timeliness of documents, and iii) the temporal relations among the entities. The experimental results on a new evaluation dataset show the effectiveness of the proposed models and allow us to understand their limitation

arXiv.org e-Print Archive

Crossref

Evaluating and benchmarking SPARQL query containment solvers

Author: Chekol Melisachew Wudage
Euzenat Jérôme
Genevès Pierre
Layaïda Nabil
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Query containment is the problem of deciding if the answers to a query are included in those of another query for any queried database. This problem is very important for query optimization purposes. In the SPARQL context, it can be equally useful. This problem has recently been investigated theoretically and some query containment solvers are available. Yet, there were no benchmarks to compare these systems and foster their improvement. In order to experimentally assess implementation strengths and limitations, we provide a first SPARQL containment test benchmark. It has been designed with respect to both the capabilities of existing solvers and the study of typical queries. Some solvers support optional constructs and cycles, while other solvers support projection, union of conjunctive queries and RDF Schemas. No solver currently supports all these features or OWL entailment regimes. The study of query demographics on DBPedia logs shows that the vast majority of queries are acyclic and a significant part of them uses UNION or projection. We thus test available solvers on their domain of applicability on three different benchmark suites. These experiments show that (i) tested solutions are overall functionally correct, (ii) in spite of its complexity, SPARQL query containment is practicable for acyclic queries, (iii) state-of-the-art solvers are at an early stage both in ter

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-Rennes 1
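
The containment definition in the abstract above (the answers of one query are included in those of another for any queried database) can be made concrete with a small sketch. It is not one of the benchmarked solvers: it evaluates conjunctions of triple patterns over a single in-memory graph, so it can only refute containment by exhibiting a counterexample graph and never prove it. All names and the toy data are hypothetical.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on failure."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):          # assumption: variables start with "?"
            if new.setdefault(term, value) != value:
                return None               # variable already bound to another value
        elif term != value:
            return None                   # constant mismatch
    return new

def evaluate(patterns, graph, projection):
    """Answers of a conjunction of triple patterns, projected (set semantics)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in graph
                    if (b2 := match(pattern, t, b)) is not None]
    return {tuple(b[v] for v in projection) for b in bindings}

def refutes_containment(q1, q2, graph, projection):
    """True if `graph` is a counterexample to 'q1 is contained in q2'."""
    return not evaluate(q1, graph, projection) <= evaluate(q2, graph, projection)

# Hypothetical toy graph and queries.
graph = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "type", "Person"),
}
q_narrow = [("?x", "knows", "?y"), ("?x", "type", "Person")]
q_broad = [("?x", "knows", "?y")]

print(refutes_containment(q_narrow, q_broad, graph, ("?x",)))
print(refutes_containment(q_broad, q_narrow, graph, ("?x",)))
```

Here the toy graph is a counterexample to containing q_broad in q_narrow, but not the other way round. A result of False says only that this particular graph is not a counterexample; deciding containment over every possible database is the harder problem the solvers in the abstract address.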

Federating Queries to RDF repositories

Author: Buil-Aranda C.
Corcho Oscar
Publication venue: Facultad de Informática (UPM)
Publication date: 11/06/2010

Currently, large amounts of RDF data are being published on the Web. These data are commonly accessed by means of SPARQL endpoints. However, querying a set of SPARQL endpoints requires new mechanisms, because neither the SPARQL protocol nor the language provides any norms or guidelines about how to proceed. In this paper we present an approach for federating queries to a set of SPARQL endpoints, using relational database distributed query processing techniques and part of the WS-DAI specification for web-service based access to relational and XML databases

Archivo Digital UPM

Recommended from our members

Formalizing Gremlin pattern matching traversals in an integrated graph Algebra

Author: Auer Sören
Thakkar Harsh
Vidal Maria-Esther
Publication venue: Aachen, Germany : RWTH Aachen
Publication date: 01/01/2019

Graph data management (also called NoSQL) has revealed beneficial characteristics in terms of flexibility and scalability by differently balancing between query expressivity and schema flexibility. This peculiar advantage has resulted in an unforeseen race of developing new task-specific graph systems, query languages and data models, such as property graphs, key-value, wide column, resource description framework (RDF), etc. Present-day graph query languages are focused towards flexible graph pattern matching (aka sub-graph matching), whereas graph computing frameworks aim towards providing fast parallel (distributed) execution of instructions. This rapid growth in the variety of graph-based data management systems has resulted in a lack of standardization. Gremlin, a graph traversal language and machine, provides a common platform for supporting any graph computing system (such as an OLTP graph database or OLAP graph processors). In this extended report, we present a formalization of graph pattern matching for Gremlin queries. We also study, discuss and consolidate various existing graph algebra operators into an integrated graph algebra

Repositorium für Naturwissenschaften und Technik

The tractability frontier of well-designed SPARQL queries

Author: Abiteboul S.
Flum J.
Kaminski M.
Kolaitis P.
Kostylev E.
Kroll M.
Pichler S. S. R.
Publication date: 27/03/2018

We study the complexity of query evaluation of SPARQL queries. We focus on the fundamental fragment of well-designed SPARQL restricted to the AND, OPTIONAL and UNION operators. Our main result is a structural characterisation of the classes of well-designed queries that can be evaluated in polynomial time. In particular, we introduce a new notion of width called domination width, which relies on the well-known notion of treewidth. We show that, under some complexity theoretic assumptions, the classes of well-designed queries that can be evaluated in polynomial time are precisely those of bounded domination width

arXiv.org e-Print Archive

Crossref