5 research outputs found
Data Vaults: A Symbiosis between Database Technology and Scientific File Repositories
In this short paper we outline the Data Vault, a database-attached external file repository.
It provides a true symbiosis between a DBMS and existing file-based repositories.
Data is kept in its original format while scalable processing functionality is provided through the DBMS facilities.
In particular, it provides transparent access to all data kept in the repository through an (array-based)
query language using the file-type specific scientific libraries.
The design space for data vaults is characterized by requirements coming from various fields.
We present a reference architecture for their realization in (commercial) DBMSs and a concrete
implementation in MonetDB for remote sensing data geared at content-based image retrieval.
Instant-on scientific data warehouses: Lazy ETL for data-intensive research
In the dawning era of data-intensive research, scientific discovery deploys data analysis techniques similar to those that drive business intelligence. As in classical Extract, Transform and Load (ETL) processes, data is loaded entirely from external data sources (repositories) into a scientific data warehouse before it can be analyzed. This process is both time- and resource-intensive, and may not be entirely necessary if only a subset of the data is of interest to a particular user. To overcome this problem, we propose a novel technique to lower the costs of data loading: Lazy ETL. Data is extracted and loaded transparently, on the fly, and only for the data items a query requires. Extensive experiments demonstrate a significant reduction in the time from source data availability to query answer compared to state-of-the-art solutions. In addition to reducing the costs of bootstrapping a scientific data warehouse, our approach also reduces the costs of loading new incoming data.
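The lazy-loading idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `LazyWarehouse`, the helper `make_demo_files`, and the CSV layout (`station`, `value`) are all hypothetical, chosen only to show that registering a source file is cheap and that a file's contents are extracted the first time a query touches it.

```python
import csv
import os
import tempfile


class LazyWarehouse:
    """Toy warehouse (hypothetical): registers files up front, loads rows on demand."""

    def __init__(self):
        self.catalog = {}  # file name -> path; cheap metadata, recorded eagerly
        self.cache = {}    # file name -> parsed rows; filled lazily
        self.loads = 0     # how many files were actually extracted

    def register(self, path):
        # Only the file's location is recorded; its contents are not read.
        self.catalog[os.path.basename(path)] = path

    def _load(self, name):
        # Extract and transform a file the first time a query needs it.
        if name not in self.cache:
            with open(self.catalog[name], newline="") as f:
                self.cache[name] = [
                    {"station": r["station"], "value": float(r["value"])}
                    for r in csv.DictReader(f)
                ]
            self.loads += 1
        return self.cache[name]

    def query(self, names, predicate):
        # Touch only the files the query names; all others stay unloaded.
        rows = []
        for name in names:
            rows.extend(r for r in self._load(name) if predicate(r))
        return rows


def make_demo_files(directory):
    """Write three small CSV files standing in for an external repository."""
    paths = []
    for i in range(3):
        path = os.path.join(directory, f"day{i}.csv")
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["station", "value"])
            writer.writerow(["A", i * 1.0])
            writer.writerow(["B", i * 2.0])
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        wh = LazyWarehouse()
        for p in make_demo_files(d):
            wh.register(p)
        print(wh.query(["day1.csv"], lambda r: r["station"] == "B"))
        print("files loaded:", wh.loads)
```

In this sketch, a query over one of three registered files extracts only that file, and repeating the query reuses the cached rows. A real system would presumably push this decision into the query plan rather than take file names as an explicit argument.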
Data Vaults: Database Technology for Scientific File Repositories
Current data-management systems and analysis tools fail to meet scientists’ data-intensive needs. A "data vault" approach lets researchers effectively and efficiently explore and analyze information.
Data Vaults: A Symbiosis between Database Technology and Scientific File Repositories
In this short paper we outline the Data Vault, a database-attached external file repository.
It provides a true symbiosis between a DBMS and existing file-based repositories.
Data is kept in its original format while scalable processing functionality is provided through the DBMS facilities.
In particular, it provides transparent access to all data kept in the repository through an (array-based)
query language using the file-type specific scientific libraries.
The design space for data vaults is characterized by requirements coming from various fields.
We present a reference architecture for their realization in (commercial) DBMSs and a concrete
implementation in MonetDB for remote sensing data geared at content-based image retrieval.