Search CORE

28,215 research outputs found

LogBase: A Scalable Log-structured Database System in the Cloud

Author: Agrawal Divyakant
Chen Gang
Ooi Beng Chin
Vo Hoang Tam
Wang Sheng
Publication venue
Publication date: 01/01/2012
Field of study

Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. The shift from reads to writes in web applications has also been accelerating in recent years. Write-ahead-logging is a common approach for providing recovery capability while improving performance in most storage systems. However, the separation of log and application data incurs write overheads observed in write-heavy environments and hence adversely affects the write throughput and recovery time in the system. In this paper, we introduce LogBase - a scalable log-structured database system that adopts log-only storage for removing the write bottleneck and supporting fast system recovery. LogBase is designed to be dynamically deployed on commodity clusters to take advantage of elastic scaling property of cloud environments. LogBase provides in-memory multiversion indexes for supporting efficient access to data maintained in the log. LogBase also supports transactions that bundle read and write operations spanning across multiple records. We implemented the proposed system and compared it with HBase and a disk-based log-structured record-oriented system modeled after RAMCloud. The experimental results show that LogBase is able to provide sustained write throughput, efficient data access out of the cache, and effective system recovery.Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX

ScholarBank@NUS

Knowledge and Metadata Integration for Warehousing Complex Data

Author: Darmont Jérôme
Ralaivao Jean-Christian
Publication venue
Publication date: 01/01/2007
Field of study

With the ever-growing availability of so-called complex data, especially on the Web, decision-support systems such as data warehouses must store and process data that are not only numerical or symbolic. Warehousing and analyzing such data requires the joint exploitation of metadata and domain-related knowledge, which must thereby be integrated. In this paper, we survey the types of knowledge and metadata that are needed for managing complex data, discuss the issue of knowledge and metadata integration, and propose a CWM-compliant integration solution that we incorporate into an XML complex data warehousing framework we previously designed.Comment: 6th International Conference on Information Systems Technology and its Applications (ISTA 07), Kharkiv : Ukraine (2007

arXiv.org e-Print Archive

HAL Descartes

Search and Result Presentation in Scientific Workflow Repositories

Author: Davidson Susan B.
Huang Xiaocheng
Stoyanovich Julia
Yuan Xiaojie
Publication venue
Publication date: 01/01/2013
Field of study

We study the problem of searching a repository of complex hierarchical workflows whose component modules, both composite and atomic, have been annotated with keywords. Since keyword search does not use the graph structure of a workflow, we develop a model of workflows using context-free bag grammars. We then give efficient polynomial-time algorithms that, given a workflow and a keyword query, determine whether some execution of the workflow matches the query. Based on these algorithms we develop a search and ranking solution that efficiently retrieves the top-k grammars from a repository. Finally, we propose a novel result presentation method for grammars matching a keyword query, based on representative parse-trees. The effectiveness of our approach is validated through an extensive experimental evaluation

arXiv.org e-Print Archive

Crossref

ScholarlyCommons@Penn