AsterixDB: A Scalable, Open Source BDMS

Authors: Alsubaiee Sattam; Altowim Yasser; Altwaijry Hotham; Behm Alexander; Borkar Vinayak; Bu Yingyi; Carey Michael; Cetindil Inci; Cheelangi Madhusudan; Faraaz Khurram; Gabrielova Eugenia; Grover Raman; Heilbron Zachary; Kim Young-Seok; Li Chen; Li Guangqiang; Ok Ji Mahn; Onose Nicola; Pirzadeh Pouria; Tsotras Vassilis; Vernica Rares; Wen Jian; Westmann Till
Publication date: 02/07/2014

AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that distinguishes it from other platforms in today's open source Big Data ecosystem. Its features make it well-suited to applications like web data warehousing, social data storage and analysis, and other use cases related to Big Data. AsterixDB has a flexible NoSQL-style data model; a query language that supports a wide range of queries; a scalable runtime; partitioned, LSM-based data storage and indexing (including B+-tree, R-tree, and text indexes); support for external as well as natively stored data; a rich set of built-in types; support for fuzzy, spatial, and temporal types and queries; a built-in notion of data feeds for ingestion of data; and transaction support akin to that of a NoSQL store. Development of AsterixDB began in 2009 and led to a mid-2013 initial open source release. This paper is the first complete description of the resulting open source AsterixDB system. It covers the system's data model, its query language, and its software architecture. It also includes a summary of the current status of the project and a first glimpse into how AsterixDB performs when compared to alternative technologies, including a parallel relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data analytics platform, on tasks that both AsterixDB and the compared technology can perform. Finally, it briefly describes some initial trials that the system has undergone and the lessons learned (and plans laid) based on those early "customer" engagements.

arXiv.org e-Print Archive

CiteSeerX

Automatic mapping of XML documents into relational database

Author: Dweib Ibrahim Mohammad
Publication date: 01/01/2010

Extensible Markup Language (XML) is now one of the most important standard media for exchanging and representing data over the Internet. Storing, updating, and retrieving the huge amount of web services data such as XML is an attractive area of research for researchers and database vendors. In this thesis, we propose and develop a new mapping model, called MAXDOR, for storing, rebuilding, updating, and querying XML documents using a relational database without making use of any XML schemas in the mapping process. The model addresses the structural gap between ordered hierarchical XML and unordered tabular relational databases, so that relational database systems can be used for storing, updating, and querying XML data. A multiple linked list is used to maintain XML document structure, manage the process of updating document contents, and retrieve document contents efficiently. Experiments are conducted to evaluate the MAXDOR model. MAXDOR is compared with other well-known models from the literature (Tatarinov et al., 2002; Torsten et al., 2004) using the total expected value of the execution time for rebuilding an XML document and the execution time for inserting a token.

OpenGrey Repository

Efficient Storage of XML - A Comparative Study

Author: Nirmal Uma Chinmayee
Publication venue: 'Oklahoma State University Library'
Publication date: 01/05/2005

The purpose of this study is to predict the performance of XML storage in various real-time scenarios. The study is a survey and comparative analysis of storing and retrieving XML using databases, Java objects representing XML, and other storage mechanisms that may not yet have been explored. It also gives a high-level overview of how to use XML with databases or Java objects, describes how the differences between data-centric and document-centric XML affect their usage with databases and objects, and explains how XML is used with relational and object-oriented databases, Java objects, and native (standalone) XML databases. A detailed comparative study of XML storage using a relational DBMS, a native XML DBMS, and processing into Java objects using JAXB was conducted. Data models such as relational, hierarchical, and document-driven were used as inputs to the study. There is no single tool that can manage all aspects of the XML data used in an application; each technology provides its own unique features. A tremendous amount of research and development is in progress on tools and technologies for using XML. The study predicts that these technologies will eventually merge into one standard method of XML storage that incorporates, in a single tool, features such as faster searches, full-text searches, preservation of original document order, the ability to maintain a collection of documents, the ability to query and store or retrieve over the network using protocols such as HTTP and SOAP, integral support for casting of elements, and support for processing valid and non-valid XML documents. The study concludes that the most efficient way to store XML data depends on the context of its usage.

SHAREOK repository

Assessing the Flexibility of a Service Oriented Architecture to that of the Classic Data Warehouse

Author: Pastore Michael
Publication venue: ePublications at Regis University
Publication date: 19/05/2010

The flexibility of a service oriented architecture (SOA) is compared to that of the classic data warehouse across three categories: (1) source system access, (2) integration and transformation, and (3) end user access. The findings suggest that an SOA allows better upgrade and migration flexibility if back-end systems expose their source data via adapters. However, the providers of such adapters must deal with the complexity of maintaining consistent interfaces. An SOA also appears to provide more flexibility at the integration tier due to its ability to merge batch with real-time source system data. This has the potential to retain source system data semantics (e.g., code translations and business rules) without having to reproduce such logic in a transformation tier. Additionally, the tight coupling of operational metadata and source system data within XML in an SOA allows more flexibility in downstream analysis and auditing of output. SOA does lag behind the classic data warehouse at the end user level, mainly due to the latter's use of mature SQL and relational database technology. Users of all technical levels can easily work with these technologies in the classic data warehouse environment to query data in a number of ways. The SOA end user likely requires developer support for such activities.

ePublications at Regis University

A flexible approach for mapping between object-oriented databases and xml. A two way method based on an object graph.

Author: Naser Taher A.J.
Publication venue: School of Computing, Informatics and Media
Publication date: 01/01/2011

One of the most prominent challenges facing academia and industry is the development of effective techniques and tools for maximizing the availability of data as the most valuable source of knowledge. The Internet has become the core means of maximizing data availability, and XML (eXtensible Markup Language) has emerged and is gradually being accepted as the universal standard format for platform-independent publishing and exchange of data over the Internet. On the other hand, a large amount of data is still held in structured databases, and database management systems have traditionally been used for the effective storage and manipulation of large volumes of data. This raises the need for effective methodologies capable of smoothly transforming data between different formats in general, and between XML and structured databases in particular. This dissertation addresses the issue by proposing a two-way mapping approach between XML and object-oriented databases. The basic steps of the proposed approach are applied in a systematic way to produce a graph from the source and then transform the graph into the destination format. In other words, the derived graph summarizes the characteristics of the source, whether XML (elements and attributes) or an object-oriented database (classes, inheritance, and nesting hierarchies). The developed methodology then classifies nodes and links from the graph into the basic constructs of the destination, i.e., elements and attributes for XML, or classes, inheritance, and nesting hierarchies for object-oriented databases. The methodology has been successfully implemented, and illustrative case studies are presented in this document.

Bradford Scholars

Querying and managing opm-compliant scientific workflow provenance

Author: Lim Chunhyeok
Publication venue: DigitalCommons@WayneState
Publication date: 01/01/2012

Provenance, the metadata that records the derivation history of scientific results, is important in scientific workflows for interpreting, validating, and analyzing the results of scientific computing. Recently, to promote and facilitate interoperability among heterogeneous provenance systems, the Open Provenance Model (OPM) has been proposed and has played an important role in the community. In this dissertation, to efficiently query and manage OPM-compliant provenance, we first propose a provenance collection framework that collects both prospective provenance, which captures an abstract workflow specification as a recipe for future data derivation, and retrospective provenance, which captures past workflow execution and data derivation information. We then propose a relational database-based provenance system, called OPMPROV, that stores, reasons over, and queries OPM-compliant prospective and retrospective provenance. We finally propose OPQL, an OPM-level provenance query language that is defined directly over the OPM model. An OPQL query takes an OPM graph as input and produces an OPM graph as output; therefore, OPQL queries are not tightly coupled to the underlying provenance storage strategies. Our provenance store, provenance collection framework, and provenance query language all feature native support for the OPM model.

Digital Commons@Wayne State University

XFormsDB - An XForms-Based Framework for Simplifying Web Application Development

Author: Laine Markku Pekka Mikael
Publication venue: Aalto-yliopisto
Publication date: 01/01/2010

The nature of the World Wide Web is constantly changing to meet the increasing demands of its users. While this trend towards more useful interactive services and applications has improved the utility and the user experience of the Web, it has also made the development of Web applications much more complex. The main objective of this thesis was to study how Web application development could be simplified by means of declarative programming. An extension that seamlessly integrates common server-side functionalities into the XForms markup language is proposed, and its feasibility and capabilities are validated with a proof-of-concept implementation, called the XFormsDB framework, and two sample Web applications. The results show that useful, highly interactive multi-user Web applications can be authored quickly and easily in a single document and under a single programming model using the XFormsDB framework.

Aaltodoc Publication Archive

Reasoning & Querying – State of the Art

Authors: Bry François; Furche Tim; Weiand Klara
Publication date: 31/08/2008

Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the Internet, where keyword search is used in many applications (e.g., search engines), has familiarized casual users with using keyword queries to retrieve information. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, for example, in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.

Open Access LMU

Accelerating data retrieval steps in XML documents

Author: Shen Yun
Publication date: 01/01/2005

Repository@Hull - Worktribe