LogBase: A Scalable Log-structured Database System in the Cloud
Numerous applications such as financial transactions (e.g., stock trading)
are write-heavy in nature. The shift from reads to writes in web applications
has also been accelerating in recent years. Write-ahead-logging is a common
approach for providing recovery capability while improving performance in most
storage systems. However, the separation of log and application data incurs
write overheads observed in write-heavy environments and hence adversely
affects the write throughput and recovery time in the system. In this paper, we
introduce LogBase - a scalable log-structured database system that adopts
log-only storage for removing the write bottleneck and supporting fast system
recovery. LogBase is designed to be dynamically deployed on commodity clusters
to take advantage of the elastic scaling property of cloud environments. LogBase
provides in-memory multiversion indexes for supporting efficient access to data
maintained in the log. LogBase also supports transactions that bundle read and
write operations spanning across multiple records. We implemented the proposed
system and compared it with HBase and a disk-based log-structured
record-oriented system modeled after RAMCloud. The experimental results show
that LogBase is able to provide sustained write throughput, efficient data
access out of the cache, and effective system recovery. Comment: VLDB201
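The log-only idea in this abstract can be illustrated with a toy sketch. It is not LogBase's implementation: the class `LogOnlyStore`, its method names, and the in-process list standing in for a durable log are all assumptions made for illustration. It shows only the two ideas the abstract names: writes go to a single append-only log with no separate data file, and an in-memory multiversion index maps each key to the log offsets of its versions. Recovery here is a plain log scan that rebuilds the index.

```python
from bisect import bisect_right
from collections import defaultdict


class LogOnlyStore:
    """Toy log-only store (hypothetical): the append-only log is the only copy of the data."""

    def __init__(self):
        self.log = []                    # append-only records: (key, version, value)
        self.index = defaultdict(list)   # key -> [(version, log_offset)], ascending by version
        self.next_version = 1

    def put(self, key, value):
        """Append one record to the log and register the new version in the index."""
        version = self.next_version
        self.next_version += 1
        offset = len(self.log)
        self.log.append((key, version, value))
        self.index[key].append((version, offset))
        return version

    def get(self, key, as_of=None):
        """Return the latest value, or the newest value with version <= as_of."""
        versions = self.index.get(key)
        if not versions:
            return None
        if as_of is None:
            offset = versions[-1][1]
        else:
            pos = bisect_right(versions, (as_of, float("inf")))
            if pos == 0:
                return None              # key did not exist yet at that version
            offset = versions[pos - 1][1]
        return self.log[offset][2]

    def recover(self):
        """Rebuild the in-memory index by scanning the log from the start."""
        self.index = defaultdict(list)
        self.next_version = 1
        for offset, (key, version, _value) in enumerate(self.log):
            self.index[key].append((version, offset))
            self.next_version = max(self.next_version, version + 1)


store = LogOnlyStore()
v1 = store.put("acct:1", 100)
v2 = store.put("acct:1", 250)
```

In this sketch a write costs one append and one index update, which is the intuition behind removing the separate log-then-data write path; a real system would also persist the log and checkpoint the index.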
ElasTraS: An Elastic Transactional Data Store in the Cloud
Over the last couple of years, "Cloud Computing" or "Elastic Computing" has
emerged as a compelling and successful paradigm for internet scale computing.
One of the major contributing factors to this success is the elasticity of
resources. In spite of the elasticity provided by the infrastructure and the
scalable design of the applications, the elephant (or the underlying database),
which drives most of these web-based applications, is not very elastic and
scalable, and hence limits scalability. In this paper, we propose ElasTraS,
which addresses the scalability and elasticity of the data store in a
cloud computing environment, leveraging the elastic nature of the
underlying infrastructure, while providing scalable transactional data access.
This paper aims at providing the design of a system in progress, highlighting
the major design choices, analyzing the different guarantees provided by the
system, and identifying several important challenges for the research community
striving for computing in the cloud. Comment: 5 Pages, In Proc. of USENIX HotCloud 200
SAFIUS - A secure and accountable filesystem over untrusted storage
We describe SAFIUS, a secure accountable file system that resides over
untrusted storage. SAFIUS provides strong security guarantees such as
confidentiality, integrity, prevention of rollback attacks, and
accountability. SAFIUS also enables read/write sharing of data and provides the
standard UNIX-like interface for applications. To achieve accountability with
good performance, it uses asynchronous signatures; to reduce the space required
for storing these signatures, a novel signature pruning mechanism is used.
SAFIUS has been implemented on a GNU/Linux based system modifying OpenGFS.
Preliminary performance studies show that SAFIUS has a tolerable overhead for
providing secure storage: while it has an overhead of about 50% of OpenGFS in
data intensive workloads (due to the overhead of performing
encryption/decryption in software), it is comparable (or better in some cases)
to OpenGFS in metadata intensive workloads. Comment: 11pt, 12 pages, 16 figure
Platforms for Teaching Distributed Computing Concepts to Undergraduate Students
Over the last two decades, information technology has been moving towards distributed computing to host applications and services. Distributed systems can process more data more reliably than their centralized counterparts; however, distributed applications are more complex to design and develop because they require additional properties like replication and fault tolerance to work effectively. These complexities carry over to the educational setting, where schools need to invest in additional infrastructure, knowledge, and technologies to teach distributed concepts to students.
This project presents the design and implementation of a complete educational framework for the teaching of distributed computing concepts at Cal Poly. The framework consists of three components: a Raspberry Pi cluster, a custom distributed file system (DecaFS), and a set of labs that can be used to support coursework in a distributed computing class. Each cluster is composed of five networked Raspberry Pi computers. The DecaFS distributed file system runs on the Raspberry Pi cluster. DecaFS provides the base functionality of a distributed file system with a design that allows for easy modification of sections of the implementation. The lab exercises focus on important distributed computing concepts that represent a variety of problems encountered in distributed systems, including distribution, replication, fault tolerance, recovery, rebalancing, and efficiency. Isolation of the lab-related modules allows students to focus on the learning objectives of the labs without needing to set up network and file system infrastructure to support the distributed aspects.
The complexities of teaching distributed computing concepts in a classroom setting at Cal Poly have been addressed with this project's framework. The solution overcomes key educational challenges as it is portable, modular, scalable and affordable. The framework provides the ability to offer courses in distributed computing to better prepare students for the challenges presented in industry today. Through the use of a modular distributed file system and computing cluster that were created for this project, students are able to solve complex distributed problems, in the form of labs, in an isolated environment that is conducive to quarter-long learning objectives. This work is a major step toward bringing distributed computing into the classrooms at Cal Poly, and classes are currently being designed around this curriculum. Cal Poly can evolve the framework to keep pace with the ever-advancing information technology world so that it may continue to serve the needs of the faculty and students of Cal Poly.