MonetDB/XQuery: a fast XQuery processor powered by a relational engine

Author: Boncz P.
Grust T.
Keulen M. van
Manegold S.
Rittinger J.
Teubner J.
Publication venue: ACM Press
Publication date: 01/01/2006

Relational XQuery systems try to re-use mature relational data management infrastructures to create fast and scalable XML database technology. This paper describes the main features, key contributions, and lessons learned while implementing such a system. Its architecture consists of (i) a range-based encoding of XML documents into relational tables, (ii) a compilation technique that translates XQuery into a basic relational algebra, (iii) a restricted (order) property-aware peephole relational query optimization strategy, and (iv) a mapping from XML update statements into relational updates. Thus, this system implements all essential XML database functionalities (rather than a single feature) such that we can learn from the full consequences of our architectural decisions. While implementing this system, we had to extend the state-of-the-art with a number of new technical contributions, such as loop-lifted staircase join and efficient relational query evaluation strategies for XQuery theta-joins with existential semantics. These contributions as well as the architectural lessons learned are also deemed valuable for other relational back-end engines. The performance and scalability of the resulting system are evaluated on the XMark benchmark up to data sizes of 11GB. The performance section also provides an extensive benchmark comparison of all major XMark results published previously, which confirms that the goal of purely relational XQuery processing, namely speed and scalability, was met.
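
To make item (i) concrete, here is a minimal sketch of one possible range-based encoding. The abstract does not state the table layout, so the `(pre, size, level, tag)` columns and the helper names `encode` and `descendants` are assumptions chosen for illustration, not the system's actual schema. Each element gets its preorder rank and the size of its subtree, so the descendants of a node occupy a contiguous range of `pre` values.

```python
import xml.etree.ElementTree as ET


def encode(xml_text):
    """Return one row (pre, size, level, tag) per element, in document order.

    Assumed layout for illustration only: pre is the preorder rank,
    size is the number of descendant elements, level is the depth.
    """
    root = ET.fromstring(xml_text)
    rows = []

    def visit(node, level):
        pre = len(rows)
        rows.append(None)  # reserve the slot; size is known after the subtree
        for child in node:
            visit(child, level + 1)
        size = len(rows) - pre - 1
        rows[pre] = (pre, size, level, node.tag)

    visit(root, 0)
    return rows


def descendants(rows, pre):
    """Descendant axis as a range predicate: pre < r.pre <= pre + size."""
    size = rows[pre][1]
    return [r for r in rows if pre < r[0] <= pre + size]


rows = encode("<a><b><c/><d/></b><e/></a>")
for row in rows:
    print(row)
print([r[3] for r in descendants(rows, 1)])
```

With such a layout, an axis step becomes a range selection over an ordinary table, which is the kind of operation a relational engine already handles well.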

CiteSeerX

Crossref

CWI's Institutional Repository

University of Twente Research Information

A Generalization of Band Joins and the Merge-Purge Problem

Author: Hernandez Mauricio A.
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/1995

The problem of merging multiple databases of information about common entities is frequently encountered in large commercial and government organizations. The problem we study is often called the Merge/Purge problem and is difficult to solve both in scale and accuracy. Large repositories of data always have numerous duplicate information entries about the same entities that are difficult to cull together without an intelligent "equational theory" that identifies equivalent items by a complex, domain-dependent matching process. We have developed a system for accomplishing this task for lists of names of potential customers in a direct marketing-type application. Our results for statistically generated data are shown to be accurate and effective when processing the data multiple times using different keys for sorting. The system provides a rule programming module that is easy to program and quite good at finding duplicates, especially in an environment with massive amounts of data.
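
The idea of processing the data several times with different sort keys can be sketched as follows. This is an illustration under stated assumptions, not the system described in the abstract: the record fields, the toy rule `same_entity` (a stand-in for the "equational theory"), the sliding window, and the union-find grouping are all hypothetical choices made for the example.

```python
def same_entity(a, b):
    """Toy matching rule (hypothetical): same last name, same first initial."""
    return (a["last"].lower() == b["last"].lower()
            and a["first"][0].lower() == b["first"][0].lower())


def find(parent, i):
    """Union-find lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def merge_purge(records, sort_keys, window=2):
    """Sort once per key, compare each record with its window neighbours,
    and group matches transitively across all passes."""
    parent = list(range(len(records)))
    for key in sort_keys:
        order = sorted(range(len(records)), key=lambda i: key(records[i]))
        for pos, i in enumerate(order):
            for j in order[max(0, pos - window + 1):pos]:
                if same_entity(records[i], records[j]):
                    parent[find(parent, i)] = find(parent, j)
    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(parent, i), []).append(i)
    return sorted(groups.values())


records = [
    {"first": "Maria", "last": "Lopez", "zip": "10027"},
    {"first": "M.", "last": "lopez", "zip": "10027"},
    {"first": "Ann", "last": "Smith", "zip": "94110"},
    {"first": "Anne", "last": "Smith", "zip": "94110"},
    {"first": "Bob", "last": "Jones", "zip": "60601"},
]
keys = [lambda r: r["last"].lower(), lambda r: r["zip"]]
print(merge_purge(records, keys))
```

Each pass only compares records that end up close together under one sort key, so a duplicate missed under one key can still be caught under another, and the union-find structure merges the results of all passes.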

Columbia University Academic Commons

AsterixDB: A Scalable, Open Source BDMS

Author: Alsubaiee Sattam
Altowim Yasser
Altwaijry Hotham
Behm Alexander
Borkar Vinayak
Bu Yingyi
Carey Michael
Cetindil Inci
Cheelangi Madhusudan
Faraaz Khurram
Gabrielova Eugenia
Grover Raman
Heilbron Zachary
Kim Young-Seok
Li Chen
Li Guangqiang
Ok Ji Mahn
Onose Nicola
Pirzadeh Pouria
Tsotras Vassilis
Vernica Rares
Wen Jian
Westmann Till
Publication date: 02/07/2014

AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that distinguishes it from other platforms in today's open source Big Data ecosystem. Its features make it well-suited to applications like web data warehousing, social data storage and analysis, and other use cases related to Big Data. AsterixDB has a flexible NoSQL style data model; a query language that supports a wide range of queries; a scalable runtime; partitioned, LSM-based data storage and indexing (including B+-tree, R-tree, and text indexes); support for external as well as natively stored data; a rich set of built-in types; support for fuzzy, spatial, and temporal types and queries; a built-in notion of data feeds for ingestion of data; and transaction support akin to that of a NoSQL store. Development of AsterixDB began in 2009 and led to a mid-2013 initial open source release. This paper is the first complete description of the resulting open source AsterixDB system. Covered herein are the system's data model, its query language, and its software architecture. Also included are a summary of the current status of the project and a first glimpse into how AsterixDB performs when compared to alternative technologies, including a parallel relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data analytics platform, for things that both technologies can do. Also included is a brief description of some initial trials that the system has undergone and the lessons learned (and plans laid) based on those early "customer" engagements.

arXiv.org e-Print Archive

CiteSeerX

Research Reports: 1984 NASA/ASEE Summer Faculty Fellowship Program

Author: Dozier J. B.
Freeman L. M.
Karr G. R.
Osborn T. L.

A NASA/ASEE Summer Faculty Fellowship Program was conducted at the Marshall Space Flight Center (MSFC). The basic objectives of the program are: (1) to further the professional knowledge of qualified engineering and science faculty members; (2) to stimulate an exchange of ideas between participants and NASA; (3) to enrich and refresh the research and teaching activities of the participants' institutions; and (4) to contribute to the research objectives of the NASA Centers. The Faculty Fellows spent ten weeks at MSFC engaged in a research project compatible with their interests and background and worked in collaboration with a NASA/MSFC colleague. This document is a compilation of Fellows' reports on their research during the summer of 1984. Topics covered include: (1) data base management; (2) computational fluid dynamics; (3) space debris; (4) X-ray gratings; (5) atomic oxygen exposure; (6) protective coatings for SSME; (7) cryogenics; (8) thermal analysis measurements; (9) solar wind modelling; and (10) binary systems.

NASA Technical Reports Server

Optimiser-based recommendations of physical database design

Author: Thiem Alexander
Publication date: 15/10/2008

Today's relational database management systems are made up of many complex components, and managing these presents a growing challenge for database administrators. Every runtime environment can require different configurations to deliver adequate performance. Even within the same environment, demands can shift over time when workloads change. Keeping up with these demands requires continuous effort from the DBA. The goal of a modern DBMS must be to support the DBA in his work with automated processes and workflows that allow him to make quick and precise decisions. This work aims at describing and partially implementing a supportive system that will analyse the current DBMS configuration together with its workload to give recommendations on how to improve its performance and efficiency. Ilmenau, Techn. Univ., Diplomarbeit, 200

Digitale Bibliothek Thüringen

Exploiting CAFS-ISP

Author: Haworth Guy McCrossan
ICL CUA supported by ICL represented by Guy Haworth
Publication venue: ICL CUA
Publication date: 01/07/1984

In the summer of 1982, the ICL CUA CAFS Special Interest Group defined three subject areas for working party activity. These were: 1) interfaces with compilers and databases, 2) end-user language facilities and display methods, and 3) text-handling and office automation. The CAFS SIG convened one working party to address the first subject with the following terms of reference: 1) review facilities and map requirements onto them, 2) "Database or CAFS" or "Database on CAFS", 3) training needs for users to bridge to new techniques, and 4) repair specifications to cover gaps in software. The working party interpreted the topic broadly as the data processing professional's, rather than the end-user's, view of and relationship with CAFS. This report is the result of the working party's activities. The report content, for good reasons, exceeds the terms of reference in their strictest sense. For example, we examine QUERYMASTER, which is deemed to be an end-user tool by ICL, from both the DP and end-user perspectives. First, this is the only interface to CAFS in the current SV201. Secondly, it is necessary for the DP department to understand the end-user's interface to CAFS. Thirdly, the other subjects have not yet been addressed by other active working parties.

Central Archive at the University of Reading

Working Together Toward Better Health Outcomes

Author: Elise Miller
Laura Line
Trishna Nath
Publication venue: Nonprofit Finance Fund
Publication date: 06/06/2017

Healthcare organizations and community-based organizations (CBOs) that provide human services are partnering in shared pursuit of better health outcomes. The Partnership for Healthy Outcomes – Nonprofit Finance Fund (NFF), the Center for Health Care Strategies (CHCS), and the Alliance for Strong Families and Communities (Alliance), with support from the Robert Wood Johnson Foundation (RWJF) – set out to capture and analyze the lessons emerging in this dynamic space. Information from more than 200 partnerships serving all 50 US states provides important lessons from, and for, partnerships that hope to improve access to care, address health inequities, and make progress on social issues like food, education, and housing.

IssueLab

Designing algorithms for big graph datasets : a study of computing bisimulation and joins

Author: Luo Y.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2015

Repository TU/e

Pure OAI Repository