28,215 research outputs found
LogBase: A Scalable Log-structured Database System in the Cloud
Numerous applications such as financial transactions (e.g., stock trading)
are write-heavy in nature. The shift from reads to writes in web applications
has also been accelerating in recent years. Write-ahead-logging is a common
approach for providing recovery capability while improving performance in most
storage systems. However, the separation of log and application data incurs
write overheads observed in write-heavy environments and hence adversely
affects the write throughput and recovery time in the system. In this paper, we
introduce LogBase - a scalable log-structured database system that adopts
log-only storage for removing the write bottleneck and supporting fast system
recovery. LogBase is designed to be dynamically deployed on commodity clusters
to take advantage of elastic scaling property of cloud environments. LogBase
provides in-memory multiversion indexes for supporting efficient access to data
maintained in the log. LogBase also supports transactions that bundle read and
write operations spanning across multiple records. We implemented the proposed
system and compared it with HBase and a disk-based log-structured
record-oriented system modeled after RAMCloud. The experimental results show
that LogBase is able to provide sustained write throughput, efficient data
access out of the cache, and effective system recovery.Comment: VLDB201
Knowledge and Metadata Integration for Warehousing Complex Data
With the ever-growing availability of so-called complex data, especially on
the Web, decision-support systems such as data warehouses must store and
process data that are not only numerical or symbolic. Warehousing and analyzing
such data requires the joint exploitation of metadata and domain-related
knowledge, which must thereby be integrated. In this paper, we survey the types
of knowledge and metadata that are needed for managing complex data, discuss
the issue of knowledge and metadata integration, and propose a CWM-compliant
integration solution that we incorporate into an XML complex data warehousing
framework we previously designed.Comment: 6th International Conference on Information Systems Technology and
its Applications (ISTA 07), Kharkiv : Ukraine (2007
Search and Result Presentation in Scientific Workflow Repositories
We study the problem of searching a repository of complex hierarchical
workflows whose component modules, both composite and atomic, have been
annotated with keywords. Since keyword search does not use the graph structure
of a workflow, we develop a model of workflows using context-free bag grammars.
We then give efficient polynomial-time algorithms that, given a workflow and a
keyword query, determine whether some execution of the workflow matches the
query. Based on these algorithms we develop a search and ranking solution that
efficiently retrieves the top-k grammars from a repository. Finally, we propose
a novel result presentation method for grammars matching a keyword query, based
on representative parse-trees. The effectiveness of our approach is validated
through an extensive experimental evaluation
- …