Object-oriented querying of existing relational databases
In this paper, we present algorithms that allow object-oriented
querying of existing relational databases. Our goal is to provide an improved query
interface for relational systems with better query facilities than SQL. This
seems to be very important since, in real world applications, relational systems
are most commonly used and their dominance will remain in the near future. To
overcome the drawbacks of relational systems, especially the poor query facilities
of SQL, we propose a schema transformation and a query translation algorithm.
The schema transformation algorithm uses additional semantic information to enhance
the relational schema and transform it into a corresponding object-oriented
schema. If the additional semantic information can be deduced from an underlying
entity-relationship design schema, the schema transformation may be done
fully automatically. To query the created object-oriented schema, we use the
Structured Object Query Language (SOQL) which provides declarative query facilities
on objects. SOQL queries using the created object-oriented schema are
much shorter, easier to write and understand, and more intuitive than the corresponding
SQL queries, leading to enhanced usability and improved querying of
the database. The query translation algorithm automatically translates SOQL queries
into equivalent SQL queries for the original relational schema.
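The idea of translating an object-style query into SQL over the original relational schema can be illustrated with a minimal sketch. This is not SOQL syntax and not the paper's algorithm: the dotted path notation, the `REFERENCES` metadata table, the `translate_path` helper, and the `employee`/`department` tables are all hypothetical, chosen only to show how a path expression might expand into joins.

```python
import sqlite3

# Hypothetical semantic metadata (an assumption, not taken from the paper):
# (table, reference attribute) -> (foreign-key column, target table, target key)
REFERENCES = {
    ("employee", "department"): ("dept_id", "department", "id"),
}


def translate_path(root, path):
    """Expand a dotted path on `root` into a SQL query with one join per hop."""
    steps = path.split(".")
    current = root
    joins = []
    # Every step except the last follows a reference to another table.
    for step in steps[:-1]:
        fk, target, pk = REFERENCES[(current, step)]
        joins.append(f"JOIN {target} ON {current}.{fk} = {target}.{pk}")
        current = target
    # The last step is a plain column of the table reached so far.
    return " ".join([f"SELECT {current}.{steps[-1]} FROM {root}"] + joins)


sql = translate_path("employee", "department.name")

# Run the generated SQL against a small in-memory relational schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO department VALUES (1, 'Research');
    INSERT INTO employee VALUES (10, 'Ada', 1);
""")
rows = conn.execute(sql).fetchall()
```

The short path `department.name` stands in for the join the user would otherwise write by hand, which is the kind of brevity the abstract attributes to object-oriented querying.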
A Framework for XML-based Integration of Data, Visualization and Analysis in a Biomedical Domain
Biomedical data are becoming increasingly complex and heterogeneous in nature. The data are stored in distributed information systems, using a variety of data models, and are processed by increasingly more complex tools that analyze and visualize them. We present in this paper our framework for integrating biomedical research data and tools into a unique Web front end. Our framework is applied to the University of Washington's Human Brain Project. Specifically, we present solutions to four integration tasks: definition of complex mappings from relational sources to XML, distributed XQuery processing, generation of heterogeneous output formats, and the integration of heterogeneous data visualization and analysis tools.
Rumble: Data Independence for Large Messy Data Sets
This paper introduces Rumble, an engine that executes JSONiq queries on
large, heterogeneous and nested collections of JSON objects, leveraging the
parallel capabilities of Spark so as to provide a high degree of data
independence. The design is based on two key insights: (i) how to map JSONiq
expressions to Spark transformations on RDDs and (ii) how to map JSONiq FLWOR
clauses to Spark SQL on DataFrames. We have developed a working implementation
of these mappings showing that JSONiq can efficiently run on Spark to query
billions of objects into, at least, the TB range. The JSONiq code is concise in
comparison to Spark's host languages while seamlessly supporting the nested,
heterogeneous data sets that Spark SQL does not. The ability to process this
kind of input, commonly found, is paramount for data cleaning and curation. The
experimental analysis indicates that there is no excessive performance loss,
occasionally even a gain, over Spark SQL for structured data, and a performance
gain over PySpark. This demonstrates that a language such as JSONiq is a simple
and viable approach to large-scale querying of denormalized, heterogeneous,
arborescent data sets, in the same way as SQL can be leveraged for structured
data sets. The results also illustrate that Codd's concept of data independence
makes as much sense for heterogeneous, nested data sets as it does on highly
structured tables.
Comment: Preprint, 9 pages
XML for Domain Viewpoints
Within research institutions like CERN (European Organization for Nuclear
Research) there are often disparate databases (different in format, type and
structure) that users need to access in a domain-specific manner. Users may
want to access a simple unit of information without having to understand the details
of the underlying schema or they may want to access the same information from
several different sources. It is neither desirable nor feasible to require
users to have knowledge of these schemas. Instead it would be advantageous if a
user could query these sources using his or her own domain models and
abstractions of the data. This paper describes the basis of an XML (eXtensible
Markup Language) framework that provides this functionality and is currently
being developed at CERN. The goal of the first prototype was to explore the
possibilities of XML for data integration and model management. It shows how
XML can be used to integrate data sources. The framework is not only applicable
to CERN data sources but to other environments too.
Comment: 9 pages, 6 figures, conference report from SCI'2001 Multiconference
on Systemics & Informatics, Florida
Towards a Novel Cooperative Logistics Information System Framework
Supply Chains and Logistics have a growing importance in the global economy.
Supply Chain Information Systems around the world are heterogeneous and each one
can both produce and receive massive amounts of structured and unstructured
data in real-time, which are usually generated by information systems,
connected objects or manually by humans. This heterogeneity is due to Logistics
Information Systems components and processes that are developed by different
modelling methods and run on many platforms; hence, the decision-making process
is difficult in such a multi-actor environment. In this paper we identify some
current challenges and integration issues between separately designed Logistics
Information Systems (LIS), and we propose a Distributed Cooperative Logistics
Platform (DCLP) framework based on NoSQL, which facilitates real-time
cooperation between stakeholders and improves the decision-making process in a
multi-actor environment. We also include a case study of a Hospital Supply Chain
(HSC), and a brief discussion of perspectives and the future scope of work.
Constraint-based Query Distribution Framework for an Integrated Global Schema
Distributed heterogeneous data sources need to be queried uniformly using a
global schema. A query on the global schema is reformulated so that it can be
executed on local data sources. Constraints in the global schema and mappings are
used for source selection, query optimization, and querying partitioned and
replicated data sources. The provided system is entirely XML-based: it poses queries
in XML form, transforms them, and integrates local results into an XML document.
Contributions include the use of constraints in our existing global schema,
which help in source selection and query optimization, and a global query
distribution framework for querying distributed heterogeneous data sources.
Comment: The Proceedings of the 13th INMIC 2009, Dec. 14-15, 2009, Islamabad,
Pakistan. Pages 1 - 6 Print ISBN: 978-1-4244-4872-2 INSPEC Accession Number:
11072575 Date of Current Version : 15 January 201
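Constraint-based source selection, as described in the abstract above, can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' system: the `Source` class, the `select_sources` helper, the site names, and the use of a simple numeric range as the constraint are all hypothetical, and the paper's XML-based representation is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A local data source with a hypothetical range constraint on one attribute."""
    name: str
    low: int   # inclusive lower bound of the values this source holds
    high: int  # inclusive upper bound of the values this source holds


def select_sources(sources, q_low, q_high):
    """Keep only sources whose constraint range overlaps the query range."""
    return [s.name for s in sources if s.low <= q_high and q_low <= s.high]


# Three horizontally partitioned sources (assumed data, for illustration only).
sources = [
    Source("site_a", 0, 999),
    Source("site_b", 1000, 1999),
    Source("site_c", 2000, 2999),
]

# A global query restricted to values between 1500 and 2100 only needs
# to be sent to the sources whose constraints can be satisfied.
selected = select_sources(sources, 1500, 2100)
```

Pruning sources whose constraints cannot match the query predicate avoids contacting them at all, which is one way constraints can serve both source selection and optimization.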