AsterixDB: A Scalable, Open Source BDMS
AsterixDB is a new, full-function BDMS (Big Data Management System) with a
feature set that distinguishes it from other platforms in today's open source
Big Data ecosystem. Its features make it well-suited to applications like web
data warehousing, social data storage and analysis, and other use cases related
to Big Data. AsterixDB has a flexible NoSQL style data model; a query language
that supports a wide range of queries; a scalable runtime; partitioned,
LSM-based data storage and indexing (including B+-tree, R-tree, and text
indexes); support for external as well as natively stored data; a rich set of
built-in types; support for fuzzy, spatial, and temporal types and queries; a
built-in notion of data feeds for ingestion of data; and transaction support
akin to that of a NoSQL store.
Development of AsterixDB began in 2009 and led to a mid-2013 initial open
source release. This paper is the first complete description of the resulting
open source AsterixDB system. Covered herein are the system's data model, its
query language, and its software architecture. Also included are a summary of
the current status of the project and a first glimpse into how AsterixDB
performs when compared to alternative technologies, including a parallel
relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data
analytics platform, for things that both technologies can do. Also included is
a brief description of some initial trials that the system has undergone and
the lessons learned (and plans laid) based on those early "customer"
engagements.
Automatic mapping of XML documents into relational database
Extensible Markup Language (XML) is now one of the most important standard media for exchanging and representing data on the Internet. Storing, updating, and retrieving the huge amount of web services data held as XML is an attractive research area for researchers and database vendors. In this thesis, we propose and develop a new mapping model, called MAXDOR, for storing, rebuilding, updating, and querying XML documents using a relational database without making use of any XML schemas in the mapping process. The model addresses the structural gap between ordered, hierarchical XML and unordered, tabular relational databases, so that relational database systems can be used to store, update, and query XML data. A multiple linked list is used to maintain XML document structure, manage the process of updating document contents, and retrieve document contents efficiently. Experiments are conducted to evaluate the MAXDOR model. MAXDOR is compared with other well-known models in the literature (Tatarinov et al., 2002; Torsten et al., 2004) using the total expected value of the execution time for rebuilding an XML document and for inserting a token.
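The idea of keeping document order in a relational table through explicit links, with no XML schema involved, can be illustrated with a minimal sketch. This is not the MAXDOR schema, which the abstract does not specify; the `node` table, its columns, and the `shred` and `rebuild` functions are assumptions made for illustration. Each element becomes one row that points to its parent and to its previous sibling, so the sibling chain acts as a linked list that preserves order.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(xml_text, conn):
    """Store each element as one row linked to its parent and previous sibling.

    Hypothetical layout for illustration only; attributes and tail text
    are ignored to keep the sketch short.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node ("
        "id INTEGER PRIMARY KEY, parent INTEGER, prev_sibling INTEGER, "
        "tag TEXT, text TEXT)"
    )

    def visit(elem, parent_id, prev_id):
        cur = conn.execute(
            "INSERT INTO node (parent, prev_sibling, tag, text) "
            "VALUES (?, ?, ?, ?)",
            (parent_id, prev_id, elem.tag, (elem.text or "").strip()),
        )
        node_id = cur.lastrowid
        prev_child = None
        for child in elem:
            # Each child records the id of the sibling that precedes it.
            prev_child = visit(child, node_id, prev_child)
        return node_id

    return visit(ET.fromstring(xml_text), None, None)


def rebuild(conn, node_id):
    """Rebuild an element by walking the sibling links of its children."""
    tag, text = conn.execute(
        "SELECT tag, text FROM node WHERE id = ?", (node_id,)
    ).fetchone()
    elem = ET.Element(tag)
    elem.text = text or None
    # The first child is the one with no previous sibling.
    row = conn.execute(
        "SELECT id FROM node WHERE parent = ? AND prev_sibling IS NULL",
        (node_id,),
    ).fetchone()
    while row is not None:
        elem.append(rebuild(conn, row[0]))
        # Follow the link to the sibling that names this node as its predecessor.
        row = conn.execute(
            "SELECT id FROM node WHERE prev_sibling = ?", (row[0],)
        ).fetchone()
    return elem
```

With this layout, inserting a new element between two siblings only requires updating one `prev_sibling` link rather than renumbering positions, which is one plausible reason to prefer links over an ordinal column.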
Efficient Storage of XML - A Comparative Study
The purpose of this study is to predict the performance of XML storage in various real-time scenarios. The study is a survey and comparative analysis of storing and retrieving XML using databases, using Java objects that represent XML, and using other storage mechanisms that may not yet have been explored. It also gives a high-level overview of how to use XML with databases or Java objects; describes how the differences between data-centric and document-centric XML affect their usage with databases and objects; and covers how XML is used with relational and object-oriented databases and Java objects, as well as the role of native (standalone) XML databases. A detailed comparative study of XML storage using a relational DBMS, a native XML DBMS, and processing into Java objects using JAXB was conducted. Relational, hierarchical, and document-driven data models were used as inputs to the study. No single tool can manage all aspects of the XML data used in an application, and each technology provides unique features. A tremendous amount of research and development is in progress on tools and technologies for using XML. The study predicts that these technologies will eventually merge into one standard method of XML storage that, in a single tool, incorporates features such as faster searches, full-text searches, preservation of original document order, the ability to maintain a collection of documents, the ability to query and to store or retrieve over the network using protocols such as HTTP and SOAP, integral support for casting of elements, and support for processing valid and non-valid XML documents. The study concludes that the most efficient way to store XML data depends on the context of its usage.
Assessing the Flexibility of a Service Oriented Architecture to that of the Classic Data Warehouse
The flexibility of a service oriented architecture (SOA) is compared to that of the classic data warehouse across three categories: (1) source system access, (2) integration and transformation, and (3) end user access. The findings suggest that an SOA allows better upgrade and migration flexibility if back-end systems expose their source data via adapters. However, the providers of such adapters must deal with the complexity of maintaining consistent interfaces. An SOA also appears to provide more flexibility at the integration tier due to its ability to merge batch with real-time source system data. This has the potential to retain source system data semantics (e.g., code translations and business rules) without having to reproduce such logic in a transformation tier. Additionally, the tight coupling of operational metadata and source system data within XML in an SOA allows more flexibility in downstream analysis and auditing of output. SOA does lag behind the classic data warehouse at the end user level, mainly due to the latter's use of mature SQL and relational database technology. Users of all technical levels can easily work with these technologies in the classic data warehouse environment to query data in a number of ways. The SOA end user likely requires developer support for such activities.
A flexible approach for mapping between object-oriented databases and xml. A two way method based on an object graph.
One of the most popular challenges facing academia and industry is the development
of effective techniques and tools for maximizing the availability of data as the most
valuable source of knowledge. The internet has dominated as the core for
maximizing data availability and XML (eXtensible Markup Language) has emerged
and is being gradually accepted as the universal standard format for platform
independent publishing and exchanging data over the Internet. On the other hand,
there remains a large amount of data held in structured databases, and database
management systems have been traditionally used for the effective storage and
manipulation of large volumes of data. This raised the need for effective
methodologies capable of smoothly transforming data between different formats in
general and between XML and structured databases in particular. This dissertation
addresses the issue by proposing a two-way mapping approach between XML and
object-oriented databases. The basic steps of the proposed approach are applied in a
systematic way to produce a graph from the source and then transform the graph into
the destination format. In other words, the derived graph summarizes characteristics
of the source whether XML (elements and attributes) or object-oriented database
(classes, inheritance and nesting hierarchies). Then, the developed methodology
classifies nodes and links from the graph into the basic constructs of the destination,
i.e., elements and attributes for XML or classes, inheritance and nesting hierarchies
for object-oriented databases. The methodology has been successfully implemented
and illustrative case studies are presented in this document.
Querying and managing OPM-compliant scientific workflow provenance
Provenance, the metadata that records the derivation history of scientific results, is important in scientific workflows for interpreting, validating, and analyzing the results of scientific computing. Recently, to promote and facilitate interoperability among heterogeneous provenance systems, the Open Provenance Model (OPM) has been proposed and has played an important role in the community. In this dissertation, to efficiently query and manage OPM-compliant provenance, we first propose a provenance collection framework that collects both prospective provenance, which captures an abstract workflow specification as a recipe for future data derivation, and retrospective provenance, which captures past workflow execution and data derivation information. We then propose a relational database-based provenance system, called OPMPROV, that stores, reasons over, and queries OPM-compliant prospective and retrospective provenance. We finally propose OPQL, an OPM-level provenance query language that is defined directly over the OPM model. An OPQL query takes an OPM graph as input and produces an OPM graph as output; therefore, OPQL queries are not tightly coupled to the underlying provenance storage strategies. Our provenance store, provenance collection framework, and provenance query language all feature native support for the OPM model.
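The graph-in, graph-out property described above can be sketched in a few lines. This is not OPQL syntax or the OPMPROV implementation, neither of which is given in the abstract; the `Edge` tuple and the `lineage` and `only` functions are hypothetical names chosen for illustration. The relation labels `used` and `wasGeneratedBy` are standard OPM dependency types. Because every operation returns a graph of the same shape it accepts, operations compose freely and never refer to how the graph is stored.

```python
from collections import namedtuple

# One OPM-style dependency: `effect` depends on `cause` via `relation`.
Edge = namedtuple("Edge", ["effect", "relation", "cause"])


def lineage(graph, start):
    """Return the subgraph reachable from `start` by following effect -> cause."""
    seen, frontier, result = {start}, [start], set()
    while frontier:
        node = frontier.pop()
        for edge in graph:
            if edge.effect == node:
                result.add(edge)
                if edge.cause not in seen:
                    seen.add(edge.cause)
                    frontier.append(edge.cause)
    return frozenset(result)


def only(graph, relation):
    """Return the subgraph containing just the edges with the given relation."""
    return frozenset(e for e in graph if e.relation == relation)


# Illustrative data: artifacts "report", "table", "raw"; processes "plot", "clean".
example = frozenset({
    Edge("report", "wasGeneratedBy", "plot"),
    Edge("plot", "used", "table"),
    Edge("table", "wasGeneratedBy", "clean"),
    Edge("clean", "used", "raw"),
    Edge("other", "wasGeneratedBy", "unrelated"),
})

# Queries compose because each one maps a graph to a graph.
inputs_behind_report = only(lineage(example, "report"), "used")
```

Here `inputs_behind_report` holds the two `used` edges in the derivation history of `report`; the unrelated edge is excluded by `lineage` before `only` is applied.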
XFormsDB - An XForms-Based Framework for Simplifying Web Application Development
The nature of the World Wide Web is constantly changing to meet the increasing demands of its users. While this trend towards more useful interactive services and applications has improved the utility and the user experience of the Web, it has also made the development of Web applications much more complex.
The main objective of this thesis was to study how Web application development could be simplified by means of declarative programming. An extension that seamlessly integrates common server-side functionalities into the XForms markup language is proposed, and its feasibility and capabilities are validated with a proof-of-concept implementation, called the XFormsDB framework, and two sample Web applications.
The results show that useful, highly interactive multi-user Web applications can be authored quickly and easily in a single document and under a single programming model using the XFormsDB framework.
Reasoning & Querying – State of the Art
Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the Internet, where keyword search is used in many applications (e.g., search engines), has familiarized casual users with using keyword queries to retrieve information. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, for example, in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.