8,080 research outputs found
State-of-the-art on evolution and reactivity
This report starts by, in Chapter 1, outlining aspects of querying and updating resources on
the Web and on the Semantic Web, including the development of query and update languages
to be carried out within the Rewerse project.
From this outline, it becomes clear that several existing research areas and topics are of
interest for this work in Rewerse. In the remainder of this report we further present state of
the art surveys in a selection of such areas and topics. More precisely: in Chapter 2 we give
an overview of logics for reasoning about state change and updates; Chapter 3 is devoted to briefly describing existing update languages for the Web, and also for updating logic programs;
in Chapter 4 event-condition-action rules, both in the context of active database systems and
in the context of semistructured data, are surveyed; in Chapter 5 we give an overview of some relevant rule-based agents frameworks
Rumble: Data Independence for Large Messy Data Sets
This paper introduces Rumble, an engine that executes JSONiq queries on
large, heterogeneous and nested collections of JSON objects, leveraging the
parallel capabilities of Spark so as to provide a high degree of data
independence. The design is based on two key insights: (i) how to map JSONiq
expressions to Spark transformations on RDDs and (ii) how to map JSONiq FLWOR
clauses to Spark SQL on DataFrames. We have developed a working implementation
of these mappings showing that JSONiq can efficiently run on Spark to query
billions of objects into, at least, the TB range. The JSONiq code is concise in
comparison to Spark's host languages while seamlessly supporting the nested,
heterogeneous data sets that Spark SQL does not. The ability to process this
kind of input, commonly found, is paramount for data cleaning and curation. The
experimental analysis indicates that there is no excessive performance loss,
occasionally even a gain, over Spark SQL for structured data, and a performance
gain over PySpark. This demonstrates that a language such as JSONiq is a simple
and viable approach to large-scale querying of denormalized, heterogeneous,
arborescent data sets, in the same way as SQL can be leveraged for structured
data sets. The results also illustrate that Codd's concept of data independence
makes as much sense for heterogeneous, nested data sets as it does on highly
structured tables.Comment: Preprint, 9 page
A generic persistence model for CLP systems (and two useful implementations)
This paper describes a model of persistence in (C)LP languages and two different and practically very useful ways to implement this model in current systems. The fundamental idea is that persistence is a characteristic of certain dynamic predicates (Le., those which encapsulate
state). The main effect of declaring a predicate persistent is that the dynamic changes made to such predicates persist from one execution to the next one. After proposing a syntax for declaring persistent predicates, a simple, file-based implementation of the concept is presented and
some examples shown. An additional implementation is presented which stores persistent predicates in an external datĂĄbase. The abstraction of the concept of persistence from its implementation allows developing applications
which can store their persistent predicates alternatively in files or databases with only a few simple changes to a declaration stating the location and modality used for persistent storage. The paper presents the model, the implementation approach in both the cases of using files
and relational databases, a number of optimizations of the process (using information obtained from static global analysis and goal clustering), and performance results from an implementation of these ideas
A Symbolic Execution Algorithm for Constraint-Based Testing of Database Programs
In so-called constraint-based testing, symbolic execution is a common
technique used as a part of the process to generate test data for imperative
programs. Databases are ubiquitous in software and testing of programs
manipulating databases is thus essential to enhance the reliability of
software. This work proposes and evaluates experimentally a symbolic ex-
ecution algorithm for constraint-based testing of database programs. First, we
describe SimpleDB, a formal language which offers a minimal and well-defined
syntax and seman- tics, to model common interaction scenarios between pro-
grams and databases. Secondly, we detail the proposed al- gorithm for symbolic
execution of SimpleDB models. This algorithm considers a SimpleDB program as a
sequence of operations over a set of relational variables, modeling both the
database tables and the program variables. By inte- grating this relational
model of the program with classical static symbolic execution, the algorithm
can generate a set of path constraints for any finite path to test in the
control- flow graph of the program. Solutions of these constraints are test
inputs for the program, including an initial content for the database. When the
program is executed with respect to these inputs, it is guaranteed to follow
the path with re- spect to which the constraints were generated. Finally, the
algorithm is evaluated experimentally using representative SimpleDB models.Comment: 12 pages - preliminary wor
Using Ontologies for Semantic Data Integration
While big data analytics is considered as one of the most important paths to competitive advantage of todayâs enterprises, data scientists spend a comparatively large amount of time in the data preparation and data integration phase of a big data project. This shows that data integration is still a major challenge in IT applications. Over the past two decades, the idea of using semantics for data integration has become increasingly crucial, and has received much attention in the AI, database, web, and data mining communities. Here, we focus on a specific paradigm for semantic data integration, called Ontology-Based Data Access (OBDA). The goal of this paper is to provide an overview of OBDA, pointing out both the techniques that are at the basis of the paradigm, and the main challenges that remain to be addressed
- âŠ