Composition and Inversion of Schema Mappings
In recent years, much attention has been paid to the development of
solid foundations for the composition and inversion of schema mappings. In this
paper, we review the proposals for the semantics of these crucial operators.
For each of these proposals, we concentrate on the following three problems:
the definition of the semantics of the operator, the language needed to express
the operator, and the algorithmic issues associated with the problem of computing
the operator. It should be pointed out that we primarily consider the
formalization of schema mappings introduced in the work on data exchange. In
particular, when studying the problem of computing the composition and inverse
of a schema mapping, we will be mostly interested in computing these operators
for mappings specified by source-to-target tuple-generating dependencies.
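As an illustration of what a source-to-target tuple-generating dependency does in data exchange, here is a minimal sketch. The dependency `Emp(e) -> EXISTS m. Mgr(e, m)`, the relation names, and the function `chase_st_tgd` are hypothetical and chosen only for this example; they do not come from the paper. The existential variable is witnessed by a fresh labeled null for each employee.

```python
from itertools import count


def chase_st_tgd(source_facts):
    """Apply the hypothetical st-tgd  Emp(e) -> EXISTS m. Mgr(e, m).

    Facts are (relation_name, argument_tuple) pairs. Each existential
    witness is a fresh labeled null of the form "_N<k>".
    """
    fresh = count(1)
    target = set()
    # Sort the facts so that null numbering is deterministic.
    for relation, args in sorted(source_facts):
        if relation == "Emp":
            (employee,) = args
            target.add(("Mgr", (employee, f"_N{next(fresh)}")))
    return target


source = {("Emp", ("alice",)), ("Emp", ("bob",))}
target = chase_st_tgd(source)
```

The resulting target instance holds one `Mgr` fact per employee, each with its own unknown manager value.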
The Inverse of a Schema Mapping
The inversion of schema mappings has been identified as one of the fundamental operators for the development of a general framework for data exchange, data integration, and more generally, for metadata management. Given a mapping M from a schema S to a schema T,
an inverse of M is a new mapping that describes the reverse relationship from T to S, and that is semantically consistent with the relationship previously established by M. In practical scenarios, the inversion of a schema mapping can have several applications.
For example, in a data exchange context, if a mapping M is used to exchange data from a source to a target schema, an inverse of M can be used to exchange the data back to the source, thus reversing the application of M.
The formalization of a clear semantics for the inverse operator has proved to be a very difficult task. In fact, in recent years, several alternative notions of inversion for schema mappings have been proposed in the literature. This chapter provides a survey of the different formalizations of the inverse operator and the main theoretical and practical results obtained so far. In particular, we present and compare the main proposals for inverting schema mappings that have been considered in the literature. For each of them, we present its formal semantics and characterizations of its existence. We also present algorithms to compute inverses and study the language needed to express such inverses.
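The idea of exchanging data back to the source can be sketched with a toy example. The sketch below makes a strong simplifying assumption: each mapping is treated as a function that computes one canonical target instance, whereas the literature treats mappings as relations between instances. The relation names `P`, `Q`, `R` and all function names are hypothetical. A copy mapping can be reversed exactly, while a projection sends two different sources to the same target, so no reverse mapping can tell them apart.

```python
def copy_mapping(source):
    # Hypothetical mapping  P(x, y) -> Q(x, y)
    return {("Q", args) for rel, args in source if rel == "P"}


def copy_inverse(target):
    # Candidate reverse mapping  Q(x, y) -> P(x, y)
    return {("P", args) for rel, args in target if rel == "Q"}


def projection_mapping(source):
    # Hypothetical mapping  P(x, y) -> R(x); the second attribute is dropped
    return {("R", (args[0],)) for rel, args in source if rel == "P"}


s1 = {("P", ("a", "b"))}
s2 = {("P", ("a", "c"))}

# Exchanging with the copy mapping and back recovers the source.
round_trip = copy_inverse(copy_mapping(s1))

# Two distinct sources collapse to the same target under projection.
collides = projection_mapping(s1) == projection_mapping(s2)
```

The collision in the projection case is one intuition for why an exact inverse need not exist and why weaker notions of inversion have been studied.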
Semantic Query Reformulation in Social PDMS
We consider social peer-to-peer data management systems (PDMS), where each
peer maintains both semantic mappings between its schema and some
acquaintances, and social links with peer friends. In this context,
reformulating a query from a peer's schema into other peers' schemas is a hard
problem, as it may generate as many rewritings as there are mappings from that
peer to the outside and, transitively, beyond, eventually traversing the entire
network. However, not all the obtained rewritings are relevant to a given
query. In this paper, we address this problem by inspecting semantic mappings
and social links to find only relevant rewritings. We propose a new notion of
'relevance' of a query with respect to a mapping, and, based on this notion, a
new semantic query reformulation approach for social PDMS, which achieves great
accuracy and flexibility. To rapidly find the most interesting mappings, we
combine several techniques: (i) social links are expressed as FOAF (Friend of a
Friend) links to characterize peers' friendships, and compact mapping summaries
are used to obtain mapping descriptions; (ii) local semantic views are special
views that contain information about external mappings; and (iii) gossiping
techniques improve the search of relevant mappings. Our experimental
evaluation, based on a prototype on top of PeerSim and a simulated network,
demonstrates that our solution yields greater recall than traditional
query translation approaches proposed in the literature.
Comment: 29 pages, 8 figures, query rewriting in PDM
Use of Schema Associative Mapping for Synchronization of the Virtual Machine Audit Logs
Compute cloud interoperability across different domains represents a major challenge for the system administrator community. This work examines the issues involved in enabling heterogeneous synchronization of virtual disk log attributes by means of an associative mapping technique. We explore this concern as a function of providing secure log auditing for the virtual machine (VM) cloud. Our contribution provides novel theoretical foundations that can be used to establish these synchronized log audit requirements, supported by practical case study results.
METIS in PArADISE: Provenance Management in the Evaluation of Sensor Data Sets for the Development of Assistive Systems
This contribution presents a long-term research project in computer science and electrical engineering at the University of Rostock, in which scientific experiments in computer science, cell biology, and medicine (neurodegenerative diseases) are to be supported by efficient analysis methods on very large volumes of measurement or sensor data and made traceable in the sense of provenance management. In computer science, the experimental application area is the research and systematic development of assistive systems. Since the persons supported by assistive systems are observed by a large number of sensors, privacy aspects must also be taken into account as early as the model-building phase, so that they can be integrated automatically into the system design during the concrete construction of the assistive system.
The database aspects of this research area are examined in more detail in this contribution: besides the efficient evaluation of large volumes of measurement and sensor data, these are provenance management and the integration of privacy constraints. To link these problem areas, two extremely different database topics meet: (1) the derivation of inverse schema and instance mappings, which are usually needed in database integration, federation, and evolution, from the METIS project, and (2) the efficiency of analysis methods and the integration of privacy aspects through query transformations for the development of assistive systems in the PArADISE project. In this contribution, we identify the common core of both topics in the theoretical foundations of databases.
Composition with Target Constraints
It is known that the composition of schema mappings, each specified by
source-to-target tgds (st-tgds), can be specified by a second-order tgd (SO
tgd). We consider the question of what happens when target constraints are
allowed. Specifically, we consider the question of specifying the composition
of standard schema mappings (those specified by st-tgds, target egds, and a
weakly acyclic set of target tgds). We show that SO tgds, even with the
assistance of arbitrary source constraints and target constraints, cannot
specify in general the composition of two standard schema mappings. Therefore,
we introduce source-to-target second-order dependencies (st-SO dependencies),
which are similar to SO tgds, but allow equations in the conclusion. We show
that st-SO dependencies (along with target egds and target tgds) are sufficient
to express the composition of every finite sequence of standard schema
mappings, and further, every st-SO dependency specifies such a composition. In
addition to this expressive power, we show that st-SO dependencies enjoy other
desirable properties. In particular, they have a polynomial-time chase that
generates a universal solution. This universal solution can be used to find the
certain answers to unions of conjunctive queries in polynomial time. It is easy
to show that the composition of an arbitrary number of standard schema mappings
is equivalent to the composition of only two standard schema mappings. We show
that, surprisingly, the analogous result also holds for schema mappings
specified by just st-tgds (no target constraints). This is proven by showing
that every SO tgd is equivalent to an unnested SO tgd (one where there is no
nesting of function symbols). Similarly, we prove unnesting results for st-SO
dependencies, with the same types of consequences.
Comment: This paper is an extended version of: M. Arenas, R. Fagin, and A.
Nash. Composition with Target Constraints. In 13th International Conference
on Database Theory (ICDT), pages 129-142, 201
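To illustrate the statement that a composition of mappings specified by st-tgds may require a second-order tgd, the following is a standard example from the schema mapping literature. It is not taken from this abstract, and the relation names are illustrative. Let the first mapping and the second mapping be given by

```latex
\begin{align*}
\mathcal{M}_{12}:\quad & \forall e\,\bigl(\mathit{Emp}(e) \rightarrow \exists m\,\mathit{Mgr}_1(e,m)\bigr) \\
\mathcal{M}_{23}:\quad & \forall e\,\forall m\,\bigl(\mathit{Mgr}_1(e,m) \rightarrow \mathit{Mgr}(e,m)\bigr) \\
                       & \forall e\,\bigl(\mathit{Mgr}_1(e,e) \rightarrow \mathit{SelfMgr}(e)\bigr) \\
\mathcal{M}_{12}\circ\mathcal{M}_{23}:\quad
  & \exists f\,\Bigl(\forall e\,\bigl(\mathit{Emp}(e) \rightarrow \mathit{Mgr}(e,f(e))\bigr) \\
  & \qquad \wedge\ \forall e\,\bigl(\mathit{Emp}(e) \wedge e = f(e) \rightarrow \mathit{SelfMgr}(e)\bigr)\Bigr)
\end{align*}
```

The function symbol f stands for the manager chosen for each employee, and the equality e = f(e) in the premise of the second conjunct is what takes the composition beyond first-order st-tgds.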
Schema Independent Relational Learning
Learning novel concepts and relations from relational databases is an
important problem with many applications in database systems and machine
learning. Relational learning algorithms learn the definition of a new relation
in terms of existing relations in the database. Nevertheless, the same data set
may be represented under different schemas for various reasons, such as
efficiency, data quality, and usability. Unfortunately, the output of current
relational learning algorithms tends to vary quite substantially over the
choice of schema, both in terms of learning accuracy and efficiency. This
variation complicates their off-the-shelf application. In this paper, we
introduce and formalize the property of schema independence of relational
learning algorithms, and study both the theoretical and empirical dependence of
existing algorithms on the common class of (de)composition schema
transformations. We study both sample-based learning algorithms, which learn
from sets of labeled examples, and query-based algorithms, which learn by
asking queries to an oracle. We prove that current relational learning
algorithms are generally not schema independent. For query-based learning
algorithms we show that the (de)composition transformations influence their
query complexity. We propose Castor, a sample-based relational learning
algorithm that achieves schema independence by leveraging data dependencies. We
support the theoretical results with an empirical study that demonstrates the
schema dependence/independence of several algorithms on existing benchmark and
real-world datasets under (de)compositions.
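A (de)composition schema transformation can be pictured with a small sketch. The example assumes a hypothetical relation `student(id, name, dept)` in which `id` is a key; under that assumption, splitting the relation into two projections and joining them again on `id` loses no information. The relation, its attributes, and the functions `decompose` and `compose` are illustrative only and do not reproduce Castor or any algorithm from the paper.

```python
def decompose(rows):
    """Split student(id, name, dept) into two relations that share the key id."""
    names = {(sid, name) for sid, name, dept in rows}
    depts = {(sid, dept) for sid, name, dept in rows}
    return names, depts


def compose(names, depts):
    """Natural join of the two projections on the shared key id."""
    return {
        (sid, name, dept)
        for sid, name in names
        for other_sid, dept in depts
        if sid == other_sid
    }


rows = {(1, "ann", "cs"), (2, "bob", "math")}
names, depts = decompose(rows)
recomposed = compose(names, depts)
```

Both schemas carry the same data, so a schema-independent learner would be expected to return equivalent definitions over either representation.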