9,275 research outputs found
Comparative Analysis of Five XML Query Languages
XML is becoming the most relevant new standard for data representation and
exchange on the WWW. Novel languages for extracting and restructuring the XML
content have been proposed, some in the tradition of database query languages
(i.e. SQL, OQL), others more closely inspired by XML. No standard for XML query
language has yet been decided, but the discussion is ongoing within the World
Wide Web Consortium and within many academic institutions and Internet-related
major companies. We present a comparison of five, representative query
languages for XML, highlighting their common features and differences.Comment: TeX v3.1415, 17 pages, 6 figures, to be published in ACM Sigmod
Record, March 200
Using Ontologies for Semantic Data Integration
While big data analytics is considered as one of the most important paths to competitive advantage of today’s enterprises, data scientists spend a comparatively large amount of time in the data preparation and data integration phase of a big data project. This shows that data integration is still a major challenge in IT applications. Over the past two decades, the idea of using semantics for data integration has become increasingly crucial, and has received much attention in the AI, database, web, and data mining communities. Here, we focus on a specific paradigm for semantic data integration, called Ontology-Based Data Access (OBDA). The goal of this paper is to provide an overview of OBDA, pointing out both the techniques that are at the basis of the paradigm, and the main challenges that remain to be addressed
Temporal Stream Algebra
Data stream management systems (DSMS) so far focus on
event queries and hardly consider combined queries to both
data from event streams and from a database. However,
applications like emergency management require combined
data stream and database queries. Further requirements are
the simultaneous use of multiple timestamps after different
time lines and semantics, expressive temporal relations between multiple time-stamps and
exible negation, grouping
and aggregation which can be controlled, i. e. started and
stopped, by events and are not limited to fixed-size time
windows. Current DSMS hardly address these requirements.
This article proposes Temporal Stream Algebra (TSA) so
as to meet the afore mentioned requirements. Temporal
streams are a common abstraction of data streams and data-
base relations; the operators of TSA are generalizations of
the usual operators of Relational Algebra. A in-depth 'analysis of temporal relations guarantees that valid TSA expressions are non-blocking, i. e. can be evaluated incrementally.
In this respect TSA differs significantly from previous algebraic approaches which use specialized operators to prevent
blocking expressions on a "syntactical" level
Code Generation for Efficient Query Processing in Managed Runtimes
In this paper we examine opportunities arising from the conver-gence of two trends in data management: in-memory database sys-tems (IMDBs), which have received renewed attention following the availability of affordable, very large main memory systems; and language-integrated query, which transparently integrates database queries with programming languages (thus addressing the famous ‘impedance mismatch ’ problem). Language-integrated query not only gives application developers a more convenient way to query external data sources like IMDBs, but also to use the same querying language to query an application’s in-memory collections. The lat-ter offers further transparency to developers as the query language and all data is represented in the data model of the host program-ming language. However, compared to IMDBs, this additional free-dom comes at a higher cost for query evaluation. Our vision is to improve in-memory query processing of application objects by introducing database technologies to managed runtimes. We focus on querying and we leverage query compilation to im-prove query processing on application objects. We explore dif-ferent query compilation strategies and study how they improve the performance of query processing over application data. We take C] as the host programming language as it supports language-integrated query through the LINQ framework. Our techniques de-liver significant performance improvements over the default LINQ implementation. Our work makes important first steps towards a future where data processing applications will commonly run on machines that can store their entire datasets in-memory, and will be written in a single programming language employing language-integrated query and IMDB-inspired runtimes to provide transparent and highly efficient querying. 1
Ontology-Based Data Access and Integration
An ontology-based data integration (OBDI) system is an information management system consisting of three components: an ontology, a set of data sources, and the mapping between the two. The ontology is a conceptual, formal description of the domain of interest to a given organization (or a community of users), expressed in terms of relevant concepts, attributes of concepts, relationships between concepts, and logical assertions characterizing the domain knowledge. The data sources are the repositories accessible by the organization where data concerning the domain are stored. In the general case, such repositories are numerous, heterogeneous, each one managed and maintained independently from the others. The mapping is a precise specification of the correspondence between the data contained in the data sources and the elements of the ontology. The main purpose of an OBDI system is to allow information consumers to query the data using the elements in the ontology as predicates.
In the special case where the organization manages a single data source, the term ontology-based data access (ODBA) system is used
Combining Relational Algebra, SQL, Constraint Modelling, and Local Search
The goal of this paper is to provide a strong integration between constraint
modelling and relational DBMSs. To this end we propose extensions of standard
query languages such as relational algebra and SQL, by adding constraint
modelling capabilities to them. In particular, we propose non-deterministic
extensions of both languages, which are specially suited for combinatorial
problems. Non-determinism is introduced by means of a guessing operator, which
declares a set of relations to have an arbitrary extension. This new operator
results in languages with higher expressive power, able to express all problems
in the complexity class NP. Some syntactical restrictions which make data
complexity polynomial are shown. The effectiveness of both extensions is
demonstrated by means of several examples. The current implementation, written
in Java using local search techniques, is described. To appear in Theory and
Practice of Logic Programming (TPLP)Comment: 30 pages, 5 figure
- …