XML Reconstruction View Selection in XML Databases: Complexity Analysis and Approximation Scheme
Query evaluation in an XML database requires reconstructing XML subtrees
rooted at nodes found by an XML query. Since XML subtree reconstruction can be
expensive, one approach to improve query response time is to use reconstruction
views - materialized XML subtrees of an XML document, whose nodes are
frequently accessed by XML queries. For this approach to be efficient, the
principal requirement is a framework for view selection. In this work, we are
the first to formalize and study the problem of XML reconstruction view
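selection. The input is a tree $T$, in which every node $i$ has a size $s_i$
and profit $p_i$, and the size limitation $C$. The target is to find a subset
of subtrees rooted at nodes $i_1, \ldots, i_k$ respectively such that
$s_{i_1} + \cdots + s_{i_k} \le C$, and $p_{i_1} + \cdots + p_{i_k}$ is maximal.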
Furthermore, there is no overlap between any two subtrees selected in the
solution. We prove that this problem is NP-hard and present a fully
polynomial-time approximation scheme (FPTAS) as a solution.
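The selection problem stated in this abstract can be illustrated with a small exact dynamic program. This is only a sketch and not the FPTAS from the paper: it assumes integer sizes, assumes that size[v] already denotes the size of the whole subtree rooted at v, and runs in pseudo-polynomial time. The function name best_profit and the dictionary-based tree encoding are hypothetical choices made for illustration.

```python
from typing import Dict, List


def best_profit(children: Dict[int, List[int]], size: Dict[int, int],
                profit: Dict[int, int], root: int, capacity: int) -> int:
    """Max total profit of non-overlapping subtrees with total size <= capacity.

    Assumption: size[v] is the size of the entire subtree rooted at v.
    """
    def solve(v: int) -> List[int]:
        # table[c] = best profit obtainable inside v's subtree with capacity c
        table = [0] * (capacity + 1)
        # Option A: pick subtrees below v, splitting capacity among the children.
        for child in children.get(v, []):
            child_table = solve(child)
            merged = [0] * (capacity + 1)
            for c in range(capacity + 1):
                merged[c] = max(table[c - k] + child_table[k]
                                for k in range(c + 1))
            table = merged
        # Option B: pick the subtree rooted at v itself, which rules out
        # every descendant because selected subtrees must not overlap.
        for c in range(size[v], capacity + 1):
            table[c] = max(table[c], profit[v])
        return table

    return solve(root)[capacity]


# Toy instance: root 0 with two leaf children 1 and 2.
children = {0: [1, 2]}
size = {0: 5, 1: 2, 2: 2}
profit = {0: 10, 1: 4, 2: 3}
print(best_profit(children, size, profit, root=0, capacity=4))
print(best_profit(children, size, profit, root=0, capacity=5))
```

With capacity 4 the two leaves are chosen together; with capacity 5 the root subtree alone is more profitable. An FPTAS would typically scale and round the profits before running a table-based computation of this kind, but that step is not shown here.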
FunMap: Efficient Execution of Functional Mappings for Knowledge Graph Creation
Data has grown exponentially in recent years, and knowledge graphs
constitute powerful formalisms to integrate a myriad of existing data sources.
Transformation functions -- specified with function-based mapping languages
like FunUL and RML+FnO -- can be applied to overcome interoperability issues
across heterogeneous data sources. However, the absence of engines to
efficiently execute these mapping languages hinders their global adoption. We
propose FunMap, an interpreter of function-based mapping languages; it relies
on a set of lossless rewriting rules to push down and materialize the execution
of functions in initial steps of knowledge graph creation. Although applicable
to any function-based mapping language that supports joins between mapping
rules, FunMap feasibility is shown on RML+FnO. FunMap reduces data redundancy,
e.g., duplicates and unused attributes, and converts RML+FnO mappings into a
set of equivalent rules executable on RML-compliant engines. We evaluate FunMap
performance over real-world testbeds from the biomedical domain. The results
indicate that FunMap reduces the execution time of RML-compliant engines by up
to a factor of 18, thus providing a scalable solution for knowledge graph
creation.
Improving STEM Education in Research: Preliminary Report on the Development of a Computer-Assisted Student-Mentor Research Community
Research education in STEM disciplines currently suffers from 1) the inability to feasibly collect highly detailed data on both the student's and mentor's activities; 2) the lack of tools to assist students and mentors in organizing and managing their research activities and environments; and 3) the inability to correlate a student's assessment results with their actual research activities. Together these three problems impede both the improvement and the educational quality of student research experiences. We propose a computer-assisted student-mentor research community as a solution to these problems. Within this community setting, students and their mentors are provided tools to make their work easier, much like a word processor makes writing a letter easier. Through their use of these tools, details of student-mentor activities are automatically recorded in a relational database, without burdening users with the responsibility of archiving data. Equally important, student outcome assessments can be directly related to student activity, allowing educators to identify practices resulting in successful research experiences. Community tools also facilitate the use of labor-intensive teaching laboratories involving real inquiry-based research. The community structure has the added benefit of allowing students to see, communicate and interact more freely with other students and their projects, thus enriching the student's research experience. We provide herein a preliminary report on the development and testing of a prototype student-mentor research community, present its tools and an assessment of student interest in participating in the community, and discuss its further development into a nationally available student-mentor research community.
Answering SPARQL queries over databases under OWL 2 QL entailment regime
We present an extension of the ontology-based data access platform Ontop that supports answering SPARQL queries under the OWL 2 QL direct semantics entailment regime for data instances stored in relational databases. On the theoretical side, we show how any input SPARQL query, OWL 2 QL ontology and R2RML mappings can be rewritten to an equivalent SQL query solely over the data. On the practical side, we present initial experimental results demonstrating that by applying the Ontop technologies (the tree-witness query rewriting, T-mappings compiling R2RML mappings with ontology hierarchies, and T-mapping optimisations using SQL expressivity and database integrity constraints), the system produces scalable SQL queries.
SPARQL-to-SQL on Internet of Things Databases and Streams
To realise a semantic Web of Things, the challenge of achieving efficient Resource Description Framework (RDF) storage and SPARQL query performance on Internet of Things (IoT) devices with limited resources has to be addressed. State-of-the-art SPARQL-to-SQL engines have been shown to outperform RDF stores on some benchmarks. In this paper, we describe an optimisation to the SPARQL-to-SQL approach, based on a study of time-series IoT data structures, that employs metadata abstraction and efficient translation by reusing existing SPARQL engines to produce Linked Data "just-in-time". We evaluate our approach against RDF stores, state-of-the-art SPARQL-to-SQL engines and streaming SPARQL engines, in the context of IoT data and scenarios. We show that storage efficiency, with succinct row storage, and query performance can be improved by between a factor of 2 and 3 orders of magnitude.
Bridging the Semantic Web and NoSQL Worlds: Generic SPARQL Query Translation and Application to MongoDB
RDF-based data integration is often hampered by the lack of methods to translate data locked in heterogeneous silos into RDF representations. In this paper, we tackle the challenge of bridging the gap between the Semantic Web and NoSQL worlds, by fostering the development of SPARQL interfaces to heterogeneous databases. To avoid defining yet another SPARQL translation method for each and every database, we propose a two-phase method. Firstly, a SPARQL query is translated into a pivot abstract query. This phase achieves as much of the translation process as possible regardless of the database. We show how optimizations at this abstract level can save subsequent work at the level of a target database query language. Secondly, the abstract query is translated into the query language of a target database, taking into account the specific database capabilities and constraints. We demonstrate the effectiveness of our method with the MongoDB NoSQL document store, such that arbitrary MongoDB documents can be aligned on existing domain ontologies and accessed with SPARQL. Finally, we draw on a real-world use case to report experimental results with respect to the effectiveness and performance of our approach.
Efficient handling of SPARQL OPTIONAL for OBDA
OPTIONAL is a key feature in SPARQL for dealing with missing information. While this operator is used extensively, it is also known for its complexity, which can make efficient evaluation of queries with OPTIONAL challenging. We tackle this problem in the Ontology-Based Data Access (OBDA) setting, where the data is stored in a SQL relational database and exposed as a virtual RDF graph by means of an R2RML mapping. We start with a succinct translation of a SPARQL fragment into SQL. It fully respects bag semantics and three-valued logic and relies on the extensive use of the LEFT JOIN operator and COALESCE function. We then propose optimisation techniques for reducing the size and improving the structure of generated SQL queries. Our optimisations capture interactions between JOIN, LEFT JOIN, COALESCE and integrity constraints such as attribute nullability, uniqueness and foreign key constraints. Finally, we empirically verify the effectiveness of our techniques on the BSBM OBDA benchmark.
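To make the role of LEFT JOIN and COALESCE concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the translation defined in the paper. The tables person, work_email and home_email, the sample rows, and the function name optional_emails are all hypothetical. The SQL is a hand-written rendering of a SPARQL pattern of the shape { ?p :name ?n OPTIONAL { ?p :workEmail ?e } OPTIONAL { ?p :homeEmail ?e } }, where the second OPTIONAL may only bind ?e if the first left it unbound or bound it to the same value.

```python
import sqlite3


def optional_emails():
    """Return (name, email) rows for the hypothetical two-OPTIONAL pattern."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE work_email (person_id INTEGER, email TEXT NOT NULL);
        CREATE TABLE home_email (person_id INTEGER, email TEXT NOT NULL);
        INSERT INTO person VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO work_email VALUES (1, 'ada@work.example');
        INSERT INTO home_email VALUES (1, 'ada@home.example'),
                                      (2, 'bob@home.example');
    """)
    rows = conn.execute("""
        SELECT p.name,
               -- ?e comes from whichever OPTIONAL actually bound it
               COALESCE(w.email, h.email) AS email
        FROM person p
        -- first OPTIONAL: keep the person even without a work email
        LEFT JOIN work_email w ON w.person_id = p.id
        -- second OPTIONAL: compatible only if ?e is unbound or equal
        LEFT JOIN home_email h ON h.person_id = p.id
             AND (w.email IS NULL OR h.email = w.email)
        ORDER BY p.id
    """).fetchall()
    conn.close()
    return rows


for row in optional_emails():
    print(row)
```

Ada keeps her work address because the home address is incompatible with the existing binding, Bob falls through to his home address, and Cy is returned with a NULL email. Knowing that a column is NOT NULL or unique is the kind of integrity constraint that could let such joins and COALESCE calls be simplified.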