80 research outputs found

    AsterixDB: A Scalable, Open Source BDMS

    AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that distinguishes it from other platforms in today's open source Big Data ecosystem. Its features make it well-suited to applications like web data warehousing, social data storage and analysis, and other use cases related to Big Data. AsterixDB has a flexible NoSQL-style data model; a query language that supports a wide range of queries; a scalable runtime; partitioned, LSM-based data storage and indexing (including B+-tree, R-tree, and text indexes); support for external as well as natively stored data; a rich set of built-in types; support for fuzzy, spatial, and temporal types and queries; a built-in notion of data feeds for ingestion of data; and transaction support akin to that of a NoSQL store. Development of AsterixDB began in 2009 and led to a mid-2013 initial open source release. This paper is the first complete description of the resulting open source AsterixDB system. Covered herein are the system's data model, its query language, and its software architecture. Also included are a summary of the current status of the project and a first glimpse into how AsterixDB performs when compared to alternative technologies, including a parallel relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data analytics platform, for things that both technologies can do. Also included is a brief description of some initial trials that the system has undergone and the lessons learned (and plans laid) based on those early "customer" engagements.

    Automatic mapping of XML documents into relational database

    Extensible Markup Language (XML) is now one of the most important standard media for exchanging and representing data over the Internet. Storing, updating, and retrieving the huge amount of web services data, such as XML, is an attractive area of research for researchers and database vendors. In this thesis, we propose and develop a new mapping model, called MAXDOR, for storing, rebuilding, updating, and querying XML documents using a relational database without making use of any XML schemas in the mapping process. The model addresses the structural gap between ordered, hierarchical XML and the unordered, tabular relational database, enabling the use of relational database systems for storing, updating, and querying XML data. A multiple linked list is used to maintain the XML document structure, manage the process of updating document contents, and retrieve document contents efficiently. Experiments are done to evaluate the MAXDOR model. MAXDOR is compared with other well-known models available in the literature (Tatarinov et al., 2002; Torsten et al., 2004) using the total expected value of the execution time for rebuilding an XML document and for inserting a token.

    Efficient Storage of XML - A Comparative Study

    The purpose of this study is to predict the performance of XML storage in various real-time scenarios. The study is a survey and comparative analysis of approaches to storing and retrieving XML: databases, Java objects representing XML, and other storage mechanisms that may not yet have been explored. It also gives a high-level overview of how to use XML with databases or Java objects, and describes how the differences between data-centric and document-centric XML affect their usage with databases and objects, how XML is used with relational and object-oriented databases and Java objects, and the role of native (stand-alone) XML databases. A detailed comparative study was conducted on storing XML using a relational DBMS, a native XML DBMS, and processing into Java objects using JAXB. Relational, hierarchical, and document-driven data models were used as inputs to the study. There is no single tool that can manage all the aspects of XML data used in an application. Each technology provides its own unique features. A tremendous amount of research and development is in progress on tools and technologies for using XML. The study predicts that all the technologies will finally merge into one standard method of XML storage that incorporates, in a single tool, features such as faster searches, full-text searches, maintaining original document order, the ability to maintain a collection of documents, the ability to query and to store or retrieve over the network using protocols such as HTTP and SOAP, integral support for casting of elements, and support for processing valid and non-valid XML documents. The study concludes that the most efficient way to store XML data depends on the context of its usage.

    Assessing the Flexibility of a Service Oriented Architecture to that of the Classic Data Warehouse

    The flexibility of a service oriented architecture (SOA) is compared to that of the classic data warehouse across three categories: (1) source system access, (2) integration and transformation, and (3) end user access. The findings suggest that an SOA allows better upgrade and migration flexibility if back-end systems expose their source data via adapters. However, the providers of such adapters must deal with the complexity of maintaining consistent interfaces. An SOA also appears to provide more flexibility at the integration tier due to its ability to merge batch with real-time source system data. This has the potential to retain source system data semantics (e.g., code translations and business rules) without having to reproduce such logic in a transformation tier. Additionally, the tight coupling of operational metadata and source system data within XML in an SOA allows more flexibility in downstream analysis and auditing of output. SOA does lag behind the classic data warehouse at the end user level, mainly due to the latter's use of mature SQL and relational database technology. Users of all technical levels can easily work with these technologies in the classic data warehouse environment to query data in a number of ways. The SOA end user likely requires developer support for such activities.

    Querying and managing OPM-compliant scientific workflow provenance

    Provenance, the metadata that records the derivation history of scientific results, is important in scientific workflows for interpreting, validating, and analyzing the results of scientific computing. Recently, to promote and facilitate interoperability among heterogeneous provenance systems, the Open Provenance Model (OPM) has been proposed and has played an important role in the community. In this dissertation, to efficiently query and manage OPM-compliant provenance, we first propose a provenance collection framework that collects both prospective provenance, which captures an abstract workflow specification as a recipe for future data derivation, and retrospective provenance, which captures past workflow execution and data derivation information. We then propose a relational database-based provenance system, called OPMPROV, that stores, reasons about, and queries OPM-compliant prospective and retrospective provenance. We finally propose OPQL, an OPM-level provenance query language that is defined directly over the OPM model. An OPQL query takes an OPM graph as input and produces an OPM graph as output; therefore, OPQL queries are not tightly coupled to the underlying provenance storage strategies. Our provenance store, provenance collection framework, and provenance query language all feature native support for the OPM model.

    XFormsDB - An XForms-Based Framework for Simplifying Web Application Development

    The nature of the World Wide Web is constantly changing to meet the increasing demands of its users. While this trend towards more useful interactive services and applications has improved the utility and the user experience of the Web, it has also made the development of Web applications much more complex. The main objective of this Thesis was to study how Web application development could be simplified by means of declarative programming. An extension that seamlessly integrates common server-side functionalities into the XForms markup language is proposed, and its feasibility and capabilities are validated with a proof-of-concept implementation, called the XFormsDB framework, and two sample Web applications. The results show that useful, highly interactive multi-user Web applications can be authored quickly and easily in a single document and under a single programming model using the XFormsDB framework.

    Reasoning & Querying – State of the Art

    Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet, where keyword search is used in many applications (e.g., search engines), has familiarized casual users with using keyword queries to retrieve information. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, for example, in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.

    Accelerating data retrieval steps in XML documents
