3,232 research outputs found

    LogBase: A Scalable Log-structured Database System in the Cloud

    Full text link
    Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. The shift from reads to writes in web applications has also been accelerating in recent years. Write-ahead-logging is a common approach for providing recovery capability while improving performance in most storage systems. However, the separation of log and application data incurs write overheads observed in write-heavy environments and hence adversely affects the write throughput and recovery time in the system. In this paper, we introduce LogBase - a scalable log-structured database system that adopts log-only storage for removing the write bottleneck and supporting fast system recovery. LogBase is designed to be dynamically deployed on commodity clusters to take advantage of elastic scaling property of cloud environments. LogBase provides in-memory multiversion indexes for supporting efficient access to data maintained in the log. LogBase also supports transactions that bundle read and write operations spanning across multiple records. We implemented the proposed system and compared it with HBase and a disk-based log-structured record-oriented system modeled after RAMCloud. The experimental results show that LogBase is able to provide sustained write throughput, efficient data access out of the cache, and effective system recovery.Comment: VLDB201

    ElasTraS: An Elastic Transactional Data Store in the Cloud

    Full text link
    Over the last couple of years, "Cloud Computing" or "Elastic Computing" has emerged as a compelling and successful paradigm for internet scale computing. One of the major contributing factors to this success is the elasticity of resources. In spite of the elasticity provided by the infrastructure and the scalable design of the applications, the elephant (or the underlying database), which drives most of these web-based applications, is not very elastic and scalable, and hence limits scalability. In this paper, we propose ElasTraS which addresses this issue of scalability and elasticity of the data store in a cloud computing environment to leverage from the elastic nature of the underlying infrastructure, while providing scalable transactional data access. This paper aims at providing the design of a system in progress, highlighting the major design choices, analyzing the different guarantees provided by the system, and identifying several important challenges for the research community striving for computing in the cloud.Comment: 5 Pages, In Proc. of USENIX HotCloud 200

    SAFIUS - A secure and accountable filesystem over untrusted storage

    Get PDF
    We describe SAFIUS, a secure accountable file system that resides over an untrusted storage. SAFIUS provides strong security guarantees like confidentiality, integrity, prevention from rollback attacks, and accountability. SAFIUS also enables read/write sharing of data and provides the standard UNIX-like interface for applications. To achieve accountability with good performance, it uses asynchronous signatures; to reduce the space required for storing these signatures, a novel signature pruning mechanism is used. SAFIUS has been implemented on a GNU/Linux based system modifying OpenGFS. Preliminary performance studies show that SAFIUS has a tolerable overhead for providing secure storage: while it has an overhead of about 50% of OpenGFS in data intensive workloads (due to the overhead of performing encryption/decryption in software), it is comparable (or better in some cases) to OpenGFS in metadata intensive workloads.Comment: 11pt, 12 pages, 16 figure

    Platforms for Teaching Distributed Computing Concepts to Undergraduate Students

    Get PDF
    Over the last two decades, information technology has been moving towards distributed computing to host their applications and services. These systems can process more data more reliably than their central processing counterparts; however, distributed applications are more complex to design and develop because they require additional properties like replication and fault tolerance to work effectively. These complexities translate to the educational setting, where schools need to invest in additional infrastructure, knowledge, and technologies to teach distributed concepts to students. This project presents the design and implementation of a complete educational framework for the teaching of distributed computing concepts at Cal Poly. The framework consists of three components: a Raspberry Pi cluster, a custom distributed file system (DecaFS), and a set of labs that can be used to support coursework in a distributed computing class. Each cluster is composed of five networked Raspberry Pi computers. The DecaFS distributed file system runs on the Raspberry Pi cluster. DecaFS provides the base functionality of a distributed file system with a design that allows for easy modification of sections of the implementation. The lab exercises focus on important distributed computing concepts that represent a variety of problems encountered in distributed systems including distribution, replication, fault tolerance, recovery, rebalancing, and efficiency. Isolation of the lab related modules allows students to focus on the learning objectives of the labs without needing to set up network and file system infrastructure to support the distributed aspects. The complexities of teaching distributed computing concepts in a classroom setting at Cal Poly have been addressed with this project\u27s framework. The solution overcomes key educational challenges as it is portable, modular, scalable and affordable. The framework provides the ability to offer courses in distributed computing to better prepare students for the challenges presented in industry today. Through the use of a modular distributed file system and computing cluster that were created for this project, students are able to solve complex distributed problems, in the form of labs, in an isolated environment that is conducive to quarter long learning objectives. This work is a major step to bringing distributed computing into the classrooms at Cal Poly and classes are currently being designed around this curriculum. Cal Poly can evolve the framework to keep pace with the ever advancing information technology world so that it may continue to serve the needs of the faculty and students of Cal Poly
    corecore