LogBase: A Scalable Log-structured Database System in the Cloud
Numerous applications such as financial transactions (e.g., stock trading)
are write-heavy in nature. The shift from reads to writes in web applications
has also been accelerating in recent years. Write-ahead-logging is a common
approach for providing recovery capability while improving performance in most
storage systems. However, the separation of log and application data incurs
write overheads observed in write-heavy environments and hence adversely
affects the write throughput and recovery time in the system. In this paper, we
introduce LogBase - a scalable log-structured database system that adopts
log-only storage for removing the write bottleneck and supporting fast system
recovery. LogBase is designed to be dynamically deployed on commodity clusters
to take advantage of the elastic scaling property of cloud environments. LogBase
provides in-memory multiversion indexes for supporting efficient access to data
maintained in the log. LogBase also supports transactions that bundle read and
write operations spanning across multiple records. We implemented the proposed
system and compared it with HBase and a disk-based log-structured
record-oriented system modeled after RAMCloud. The experimental results show
that LogBase is able to provide sustained write throughput, efficient data
access out of the cache, and effective system recovery.
Comment: VLDB201
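To make the log-only idea in the LogBase abstract concrete, here is a minimal single-process sketch, not the authors' implementation. It assumes an in-memory list stands in for the on-disk log, and the class name `LogOnlyStore` and its methods are hypothetical. The log is the only copy of the data; an in-memory multiversion index maps each key to the log offsets of its versions, and recovery rebuilds that index by scanning the log.

```python
from bisect import bisect_right, insort
from collections import defaultdict


class LogOnlyStore:
    """Toy log-only store (hypothetical names, illustrative only)."""

    def __init__(self):
        # The log is the sole repository of data: (key, timestamp, value).
        self.log = []
        # Multiversion index: key -> sorted list of (timestamp, log offset).
        self.index = defaultdict(list)

    def put(self, key, value, timestamp):
        # A write is one append to the log plus an index update;
        # there is no separate data file to write afterwards.
        offset = len(self.log)
        self.log.append((key, timestamp, value))
        insort(self.index[key], (timestamp, offset))

    def get(self, key, as_of=None):
        # Return the newest version, or the newest one with
        # timestamp <= as_of when a read timestamp is given.
        versions = self.index.get(key)
        if not versions:
            return None
        if as_of is None:
            _, offset = versions[-1]
        else:
            pos = bisect_right(versions, (as_of, float("inf")))
            if pos == 0:
                return None
            _, offset = versions[pos - 1]
        return self.log[offset][2]

    def recover(self):
        # After a crash the index is lost; rebuild it from the log alone.
        self.index = defaultdict(list)
        for offset, (key, timestamp, _value) in enumerate(self.log):
            insort(self.index[key], (timestamp, offset))


store = LogOnlyStore()
store.put("acct-1", "balance=10", timestamp=1)
store.put("acct-1", "balance=25", timestamp=5)
latest = store.get("acct-1")
older = store.get("acct-1", as_of=3)
```

Keeping only offsets in the index means a read costs one index lookup plus one log access, which is the trade-off the abstract describes when it pairs log-only storage with in-memory indexes.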
ElfStore: A Resilient Data Storage Service for Federated Edge and Fog Resources
Edge and fog computing have grown popular as IoT deployments become
widespread. While application composition and scheduling on such resources are
being explored, a distributed data storage service for the edge and fog layer
is still missing, and applications instead depend solely on the cloud for data
persistence.
Such a service should reliably store and manage data on fog and edge devices,
even in the presence of failures, and offer transparent discovery and access to
data for use by edge computing applications. Here, we present ElfStore, a
first-of-its-kind edge-local federated store for streams of data blocks. It
uses reliable fog devices as a super-peer overlay to monitor the edge
resources, offers federated metadata indexing using Bloom filters, locates data
within 2 hops, and maintains approximate global statistics about the
reliability and storage capacity of edges. Edges host the actual data blocks,
and we use a unique differential replication scheme to select edges on which to
replicate blocks, to guarantee a minimum reliability and to balance storage
utilization. Our experiments on two IoT virtual deployments with 20 and 272
devices show that ElfStore has low overheads, is bound only by the network
bandwidth, has scalable performance, and offers tunable resilience.
Comment: 24 pages, 14 figures, To appear in IEEE International Conference on
Web Services (ICWS), Milan, Italy, 201
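The ElfStore abstract mentions federated metadata indexing with Bloom filters. The sketch below shows the general technique only, under assumptions that are not from the paper: each fog keeps one Bloom filter over the metadata strings of the blocks stored on its edges, and a lookup asks every filter which fogs might hold a match. The names `BloomFilter`, `candidate_fogs`, and the `"sensor=temp"` metadata format are hypothetical.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def candidate_fogs(fog_filters, metadata):
    """Return the fogs whose filter says the metadata may be present."""
    return [fog for fog, bf in fog_filters.items() if bf.might_contain(metadata)]


fog_a = BloomFilter()
fog_a.add("sensor=temp")   # a block with this metadata lives under fog-a
fog_b = BloomFilter()      # fog-b has indexed nothing yet

matches = candidate_fogs({"fog-a": fog_a, "fog-b": fog_b}, "sensor=temp")
```

Because a Bloom filter can report false positives, a caller would still confirm with the candidate fog before fetching a block; what the filter buys is a compact summary that is cheap to exchange between fogs.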
Partout: A Distributed Engine for Efficient RDF Processing
The increasing interest in Semantic Web technologies has led not only to a
rapid growth of semantic data on the Web but also to an increasing number of
backend applications with already more than a trillion triples in some cases.
Confronted with such huge amounts of data and their continued growth, existing
state-of-the-art systems for storing RDF and processing SPARQL queries are no
longer sufficient. In this paper, we introduce Partout, a distributed engine
for efficient RDF processing in a cluster of machines. We propose an effective
approach for fragmenting RDF data sets based on a query log, allocating the
fragments to nodes in a cluster, and finding the optimal configuration. Partout
can efficiently handle updates, and its query optimizer produces efficient query
execution plans for ad-hoc SPARQL queries. Our experiments show the superiority
of our approach to state-of-the-art approaches for partitioning and distributed
SPARQL query processing.
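As a rough illustration of workload-driven fragmentation and allocation, the sketch below uses a deliberately simplified stand-in, not Partout's actual algorithm: triples are grouped by predicate when that predicate appears in the query log, everything else goes to a remainder fragment, and fragments are placed greedily on the least-loaded node. The function names, the `"__rest__"` fragment, and the assumption that the query log has already been reduced to a list of predicates are all hypothetical.

```python
import heapq
from collections import defaultdict


def fragment_by_query_log(triples, logged_predicates):
    """Group triples by predicate if the predicate was queried, else '__rest__'."""
    hot = set(logged_predicates)
    fragments = defaultdict(list)
    for s, p, o in triples:
        fragments[p if p in hot else "__rest__"].append((s, p, o))
    return dict(fragments)


def allocate(fragments, num_nodes):
    """Greedy balancing: largest fragment first, onto the least-loaded node."""
    heap = [(0, node) for node in range(num_nodes)]  # (load, node id)
    heapq.heapify(heap)
    placement = {}
    for name, frag in sorted(fragments.items(), key=lambda kv: -len(kv[1])):
        load, node = heapq.heappop(heap)
        placement[name] = node
        heapq.heappush(heap, (load + len(frag), node))
    return placement


triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "name", "Alice"),
    ("alice", "age", "30"),
]
fragments = fragment_by_query_log(triples, ["knows"])
placement = allocate(fragments, num_nodes=2)
```

A real system would also weigh which fragments are queried together, so that co-accessed data lands on the same node; this sketch balances only by fragment size.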