
    View Selection in Semantic Web Databases

    We consider the setting of a Semantic Web database, containing both explicit data encoded in RDF triples, and implicit data, implied by the RDF semantics. Based on a query workload, we address the problem of selecting a set of views to be materialized in the database, minimizing a combination of query processing, view storage, and view maintenance costs. Starting from an existing relational view selection method, we devise new algorithms for recommending view sets, and show that they scale significantly beyond the existing relational ones when adapted to the RDF context. To account for implicit triples in query answers, we propose a novel RDF query reformulation algorithm and an innovative way of incorporating it into view selection in order to avoid a combinatorial explosion in the complexity of the selection process. The interest of our techniques is demonstrated through a set of experiments. (Comment: VLDB201)
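    As a rough illustration of the kind of reformulation described above, the sketch below (our own toy example, not the paper's algorithm) expands a class-membership triple pattern into a union over all subclasses, so that instances implied by rdfs:subClassOf are returned without materializing the entailed triples; the schema, class names, and pattern encoding are hypothetical.

```python
# Hypothetical illustration: reformulate a class-membership triple pattern
# into a union over all subclasses, so implicit rdf:type triples are covered.

from collections import defaultdict

# Toy RDFS schema: explicit rdfs:subClassOf edges (child, parent).
SUBCLASS_OF = {
    ("Student", "Person"),
    ("PhDStudent", "Student"),
    ("Professor", "Person"),
}

def subclasses_of(cls):
    """All classes whose instances are implicitly instances of `cls`
    (reflexive-transitive closure of rdfs:subClassOf, read downwards)."""
    children = defaultdict(set)
    for sub, sup in SUBCLASS_OF:
        children[sup].add(sub)
    result, stack = {cls}, [cls]
    while stack:
        for child in children[stack.pop()]:
            if child not in result:
                result.add(child)
                stack.append(child)
    return result

def reformulate_type_pattern(var, cls):
    """Turn (?var rdf:type cls) into a union of patterns, one per subclass."""
    return [(var, "rdf:type", c) for c in sorted(subclasses_of(cls))]

# (?x rdf:type Person) becomes a union that also covers Student, PhDStudent, Professor.
print(reformulate_type_pattern("?x", "Person"))
```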

    Database Learning: Toward a Database that Becomes Smarter Every Time

    In today's databases, previous query answers rarely benefit answering future queries. For the first time, to the best of our knowledge, we change this paradigm in an approximate query processing (AQP) context. We make the following observation: the answer to each query reveals some degree of knowledge about the answer to another query, because their answers stem from the same underlying distribution that has produced the entire dataset. Exploiting and refining this knowledge should allow us to answer queries more analytically, rather than by reading enormous amounts of raw data. Also, processing more queries should continuously enhance our knowledge of the underlying distribution, and hence lead to increasingly faster response times for future queries. We call this novel idea, learning from past query answers, Database Learning. We exploit the principle of maximum entropy to produce answers, which are in expectation guaranteed to be more accurate than existing sample-based approximations. Empowered by this idea, we build a query engine on top of Spark SQL, called Verdict. We conduct extensive experiments on real-world query traces from a large customer of a major database vendor. Our results demonstrate that Verdict supports 73.7% of these queries, speeding them up by up to 23.0x for the same accuracy level compared to existing AQP systems. (Comment: This manuscript is an extended report of the work published at the ACM SIGMOD conference 201)
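    The reuse intuition can be made concrete with a toy estimator. The sketch below is not Verdict's maximum-entropy machinery; it merely combines a fresh sample-based estimate with a prior taken from an earlier, overlapping query answer via inverse-variance weighting, and all data and numbers in it are illustrative.

```python
# Toy illustration of "learning from past query answers": combine a fresh
# sample estimate of AVG(col) with a prior derived from an earlier, overlapping
# query, weighting each term by the inverse of its (estimated) variance.
# This is NOT Verdict's actual maximum-entropy estimator, only the intuition.

import random
import statistics

random.seed(0)
population = [random.gauss(100.0, 15.0) for _ in range(100_000)]

# Past query: AVG over (almost) the same data, answered earlier with low error.
past_mean, past_var = 100.1, 0.05 ** 2

# New query: estimate AVG from a small fresh sample.
sample = random.sample(population, 200)
sample_mean = statistics.fmean(sample)
sample_var = statistics.variance(sample) / len(sample)

# Inverse-variance (precision-weighted) combination of the two estimates.
w_past, w_sample = 1 / past_var, 1 / sample_var
combined = (w_past * past_mean + w_sample * sample_mean) / (w_past + w_sample)

print(f"sample-only estimate : {sample_mean:.2f}")
print(f"combined estimate    : {combined:.2f}  (true mean ~ {statistics.fmean(population):.2f})")
```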

    Indexing forecast models for matching and maintenance

    Forecasts are important to decision-making and risk assessment in many domains. There has been recent interest in integrating forecast queries inside a DBMS. Answering a forecast query requires the creation of forecast models. Creating a forecast model is an expensive process and may require several scans over the base data as well as expensive operations to estimate model parameters. However, if forecast queries are issued repeatedly, answer times can be reduced significantly if forecast models are reused. Due to the possibly high number of forecast queries, existing models need to be found quickly. Therefore, we propose a model index that stores forecast models and allows for the efficient reuse of existing ones. Our experiments illustrate that the model index incurs negligible overhead for update transactions while yielding significant improvements during query execution.
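    A minimal sketch of the reuse idea, assuming a simple key-value layout: fitted models are cached under a key derived from the forecast query, so a repeated query skips the expensive fitting step. The key structure, model class, and invalidation rule below are placeholders, not the paper's index design.

```python
# Minimal sketch of a forecast-model index: fitted models are cached under a
# key derived from the forecast query so repeated queries reuse them instead
# of re-estimating parameters. Key layout and model class are placeholders.

class MeanForecaster:
    """Trivial stand-in for an expensive forecast model."""
    def __init__(self, history):
        self.level = sum(history) / len(history)   # "expensive" parameter estimation
    def forecast(self, horizon):
        return [self.level] * horizon

class ModelIndex:
    def __init__(self):
        self._models = {}              # (table, column, granularity) -> fitted model
    def get_or_fit(self, key, history):
        model = self._models.get(key)
        if model is None:              # cache miss: fit and register the model
            model = MeanForecaster(history)
            self._models[key] = model
        return model
    def invalidate(self, key):
        self._models.pop(key, None)    # e.g. after enough base-data updates

index = ModelIndex()
sales = [10, 12, 11, 13, 12, 14]
key = ("orders", "amount", "daily")
print(index.get_or_fit(key, sales).forecast(3))   # fits the model once
print(index.get_or_fit(key, sales).forecast(5))   # reuses the cached model
```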

    The Inverse of a Schema Mapping

    The inversion of schema mappings has been identified as one of the fundamental operators for the development of a general framework for data exchange, data integration, and, more generally, for metadata management. Given a mapping M from a schema S to a schema T, an inverse of M is a new mapping that describes the reverse relationship from T to S, and that is semantically consistent with the relationship previously established by M. In practical scenarios, the inversion of a schema mapping can have several applications. For example, in a data exchange context, if a mapping M is used to exchange data from a source to a target schema, an inverse of M can be used to exchange the data back to the source, thus reversing the application of M. The formalization of a clear semantics for the inverse operator has proved to be a very difficult task. In fact, over the last years, several alternative notions of inversion for schema mappings have been proposed in the literature. This chapter provides a survey of the different formalizations of the inverse operator and the main theoretical and practical results obtained so far. In particular, we present and compare the main proposals for inverting schema mappings that have been considered in the literature. For each of them, we present its formal semantics and characterizations of its existence. We also present algorithms to compute inverses and study the language needed to express such inverses.
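    As a concrete, textbook-style illustration (ours, not taken from the chapter) of when an inverse exists and when it cannot, consider a copy mapping and a projection mapping:

```latex
% Copy mapping: every source tuple is transferred unchanged, so the reverse
% copy recovers the source exactly and acts as an inverse of M_1.
M_1:\ \forall x\,\forall y\;\big(R(x,y) \rightarrow T(x,y)\big),
\qquad
M_1^{-1}:\ \forall x\,\forall y\;\big(T(x,y) \rightarrow R(x,y)\big).

% Projection mapping: the second attribute is lost during the exchange, so no
% mapping from the target back to the source can reconstruct it, and M_2 admits
% no exact inverse; only relaxed notions (e.g. quasi-inverses, maximum
% recoveries) apply.
M_2:\ \forall x\,\forall y\;\big(S(x,y) \rightarrow T(x)\big).
```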

    RDF Querying

    Reactive Web systems, Web services, and Web-based publish/subscribe systems communicate events as XML messages, and in many cases require composite event detection: it is not sufficient to react to single event messages; events have to be considered in relation to other events that are received over time. Emphasizing language design and formal semantics, we describe the rule-based query language XChangeEQ for detecting composite events. XChangeEQ is designed to completely cover and integrate the four complementary querying dimensions: event data, event composition, temporal relationships, and event accumulation. Semantics are provided as model and fixpoint theories; while this is an established approach for rule languages, it has not been applied to event queries before.
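    To give a rough feel for composite event detection, independent of XChangeEQ's actual syntax (which the sketch does not reproduce), the following toy detector fires only when an event is followed by a matching second event within a time window, accumulating candidate events until they expire:

```python
# Generic composite event detection sketch (not XChangeEQ syntax): fire when
# an "order" event is followed by a "payment" event for the same id within
# WINDOW seconds. Pending "order" events are accumulated until they expire.

WINDOW = 60.0  # seconds

def detect(stream):
    """stream: iterable of (timestamp, event_type, entity_id) tuples in time order."""
    pending = {}                       # entity_id -> timestamp of the open "order"
    for ts, etype, eid in stream:
        # Drop accumulated events whose window has passed (event accumulation).
        pending = {k: t for k, t in pending.items() if ts - t <= WINDOW}
        if etype == "order":
            pending[eid] = ts          # remember the first event of the pattern
        elif etype == "payment" and eid in pending:
            yield (pending.pop(eid), ts, eid)   # composite event detected

events = [(0.0, "order", "a"), (10.0, "payment", "a"),
          (20.0, "order", "b"), (200.0, "payment", "b")]   # "b" arrives too late
print(list(detect(events)))   # -> [(0.0, 10.0, 'a')]
```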

    Query Containment for Highly Expressive Datalog Fragments

    The containment problem of Datalog queries is well known to be undecidable. There are, however, several Datalog fragments for which containment is known to be decidable, most notably monadic Datalog and several "regular" query languages on graphs. Monadically Defined Queries (MQs) have been introduced recently as a joint generalization of these query languages. In this paper, we study a wide range of Datalog fragments with decidable query containment and determine exact complexity results for this problem. We generalize MQs to (Frontier-)Guarded Queries (GQs), and show that the containment problem is 3ExpTime-complete in either case, even if we allow arbitrary Datalog in the sub-query. If we focus on graph query languages, i.e., fragments of linear Datalog, then this complexity is reduced to 2ExpSpace. We also consider nested queries, which gain further expressivity by using predicates that are defined by inner queries. We show that nesting leads to an exponentially increasing hierarchy for the complexity of query containment, both in the linear and in the general case. Our results settle open problems for (nested) MQs, and they paint a comprehensive picture of the state of the art in Datalog query containment. (Comment: 20 pages)
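    For readers unfamiliar with the containment problem itself, a small conjunctive-query instance (our own illustration; the fragments studied in the paper are far more expressive) shows what is being decided:

```latex
% Q_1 asks for nodes with an outgoing path of length two, Q_2 for nodes with
% any outgoing edge. Every answer to Q_1 is also an answer to Q_2 on every
% database, so Q_1 \sqsubseteq Q_2; for conjunctive queries this is witnessed
% by a homomorphism from the body of Q_2 into the body of Q_1 that fixes the
% answer variable x (here u \mapsto y).
Q_1(x) \leftarrow e(x,y) \wedge e(y,z),
\qquad
Q_2(x) \leftarrow e(x,u),
\qquad
Q_1 \sqsubseteq Q_2 .
```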

    Optimizing Queries Using a Materialized View in a Data Warehouse

    A data warehouse is a user-centered environment for data analysis and decision support. To support decision makers in making decisions quickly and accurately, materialized views can provide significant improvements in query processing time. The problem of answering queries using views is to find efficient methods of answering a query using a set of previously materialized views over the database, rather than accessing the database relations directly. Known algorithms such as the bucket algorithm and the inverse-rules algorithm have been used to rewrite queries using views before executing the queries. The bucket algorithm, predominantly used to rewrite queries, generates a candidate rewriting of a query using views and then checks that the rewriting is contained in the original query. However, we show some deficiencies in the bucket algorithm, then describe the containment bucket algorithm and give an optimal method to solve this problem. We present an experiment comparing the performance of both algorithms.
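    The bucket idea itself is easy to sketch, as below: for each query subgoal, collect the views that could cover it into a bucket, form candidate rewritings from the cross product of the buckets, and keep only those contained in the original query. The sketch is ours and deliberately stubs out the containment test, which is precisely where the containment bucket algorithm mentioned above differs.

```python
# Sketch of the bucket algorithm for answering queries using views.
# Queries and views are conjunctive queries given as lists of subgoals
# (relation name, argument tuple). The containment check is a stub.

from itertools import product

def build_buckets(query_subgoals, views):
    """One bucket per query subgoal, holding every view that uses the same relation."""
    buckets = []
    for rel, _args in query_subgoals:
        bucket = [name for name, subgoals in views.items()
                  if any(v_rel == rel for v_rel, _ in subgoals)]
        buckets.append(bucket)
    return buckets

def contained_in_query(combo, query_subgoals, views):
    # Placeholder: a real system runs a conjunctive-query containment test here.
    return True

def candidate_rewritings(query_subgoals, views):
    """Cross product of buckets; each combination is a candidate rewriting that
    must still pass the containment check against the original query."""
    for combo in product(*build_buckets(query_subgoals, views)):
        if contained_in_query(combo, query_subgoals, views):
            yield combo

# Query q(x,z) :- emp(x,y), dept(y,z); two views, one per relation.
query = [("emp", ("x", "y")), ("dept", ("y", "z"))]
views = {"v_emp": [("emp", ("a", "b"))], "v_dept": [("dept", ("c", "d"))]}
print(list(candidate_rewritings(query, views)))   # -> [('v_emp', 'v_dept')]
```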