197 research outputs found

    LogBase: A Scalable Log-structured Database System in the Cloud

    Full text link
    Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. The shift from reads to writes in web applications has also been accelerating in recent years. Write-ahead-logging is a common approach for providing recovery capability while improving performance in most storage systems. However, the separation of log and application data incurs write overheads observed in write-heavy environments and hence adversely affects the write throughput and recovery time in the system. In this paper, we introduce LogBase - a scalable log-structured database system that adopts log-only storage for removing the write bottleneck and supporting fast system recovery. LogBase is designed to be dynamically deployed on commodity clusters to take advantage of elastic scaling property of cloud environments. LogBase provides in-memory multiversion indexes for supporting efficient access to data maintained in the log. LogBase also supports transactions that bundle read and write operations spanning across multiple records. We implemented the proposed system and compared it with HBase and a disk-based log-structured record-oriented system modeled after RAMCloud. The experimental results show that LogBase is able to provide sustained write throughput, efficient data access out of the cache, and effective system recovery.Comment: VLDB201

    Privacy-Enhanced Dependable and Searchable Storage in a Cloud-of-Clouds

    Get PDF
    In this dissertation we will propose a solution for a trustable and privacy-enhanced storage architecture based on a multi-cloud approach. The solution provides the necessary support for multi modal on-line searching operation on data that is always maintained encrypted on used cloud-services. We implemented a system prototype, conducting an experimental evaluation. Our results show that the proposal offers security and privacy guarantees, and provides efficient information retrieval capabilities without sacrificing precision and recall properties on the supported search operations. There is a constant increase in the demand of cloud services, particularly cloud-based storage services. These services are currently used by different applications as outsourced storage services, with some interesting advantages. Most personal and mobile applications also offer the user the choice to use the cloud to store their data, transparently and sometimes without entire user awareness and privacy-conditions, to overcome local storage limitations. Companies might also find that it is cheaper to outsource databases and keyvalue stores, instead of relying on storage solutions in private data-centers. This raises the concern about data privacy guarantees and data leakage danger. A cloud system administrator can easily access unprotected data and she/he could also forge, modify or delete data, violating privacy, integrity, availability and authenticity conditions. A possible solution to solve those problems would be to encrypt and add authenticity and integrity proofs in all data, before being sent to the cloud, and decrypting and verifying authenticity or integrity on data downloads. However this solution can be used only for backup purposes or when big data is not involved, and might not be very practical for online searching requirements over large amounts of cloud stored data that must be searched, accessed and retrieved in a dynamic way. Those solutions also impose high-latency and high amount of cloud inbound/outbound traffic, increasing the operational costs. Moreover, in the case of mobile or embedded devices, the power, computation and communication constraints cannot be ignored, since indexing, encrypting/decrypting and signing/verifying all data will be computationally expensive. To overcome the previous drawbacks, in this dissertation we propose a solution for a trustable and privacy-enhanced storage architecture based on a multi-cloud approach, providing privacy-enhanced, dependable and searchable support. Our solution provides the necessary support for dependable cloud storage and multi modal on-line searching operations over always-encrypted data in a cloud-of-clouds. We implemented a system prototype, conducting an experimental evaluation of the proposed solution, involving the use of conventional storage clouds, as well as, a high-speed in-memory cloud-storage backend. Our results show that the proposal offers the required dependability properties and privacy guarantees, providing efficient information retrieval capabilities without sacrificing precision and recall properties in the supported indexing and search operations

    Master of Science

    Get PDF
    thesisEfficient movement of massive amounts of data over high-speed networks at high throughput is essential for a modern-day in-memory storage system. In response to the growing needs of throughput and latency demands at scale, a new class of database systems was developed in recent years. The development of these systems was guided by increased access to high throughput, low latency network fabrics, and declining cost of Dynamic Random Access Memory (DRAM). These systems were designed with On-Line Transactional Processing (OLTP) workloads in mind, and, as a result, are optimized for fast dispatch and perform well under small request-response scenarios. However, massive server responses such as those for range queries and data migration for load balancing poses challenges for this design. This thesis analyzes the effects of large transfers on scale-out systems through the lens of a modern Network Interface Card (NIC). The present-day NIC offers new and exciting opportunities and challenges for large transfers, but using them efficiently requires smart data layout and concurrency control. We evaluated the impact of modern NICs in designing data layout by measuring transmit performance and full system impact by observing the effects of Direct Memory Access (DMA), Remote Direct Memory Access (RDMA), and caching improvements such as Intel® Data Direct I/O (DDIO). We discovered that use of techniques such as Zero Copy yield around 25% savings in CPU cycles and a 50% reduction in the memory bandwidth utilization on a server by using a client-assisted design with records that are not updated in place. We also set up experiments that underlined the bottlenecks in the current approach to data migration in RAMCloud and propose guidelines for a fast and efficient migration protocol for RAMCloud

    HAPPY: Hybrid Address-based Page Policy in DRAMs

    Full text link
    Memory controllers have used static page closure policies to decide whether a row should be left open, open-page policy, or closed immediately, close-page policy, after the row has been accessed. The appropriate choice for a particular access can reduce the average memory latency. However, since application access patterns change at run time, static page policies cannot guarantee to deliver optimum execution time. Hybrid page policies have been investigated as a means of covering these dynamic scenarios and are now implemented in state-of-the-art processors. Hybrid page policies switch between open-page and close-page policies while the application is running, by monitoring the access pattern of row hits/conflicts and predicting future behavior. Unfortunately, as the size of DRAM memory increases, fine-grain tracking and analysis of memory access patterns does not remain practical. We propose a compact memory address-based encoding technique which can improve or maintain the performance of DRAMs page closure predictors while reducing the hardware overhead in comparison with state-of-the-art techniques. As a case study, we integrate our technique, HAPPY, with a state-of-the-art monitor, the Intel-adaptive open-page policy predictor employed by the Intel Xeon X5650, and a traditional Hybrid page policy. We evaluate them across 70 memory intensive workload mixes consisting of single-thread and multi-thread applications. The experimental results show that using the HAPPY encoding applied to the Intel-adaptive page closure policy can reduce the hardware overhead by 5X for the evaluated 64 GB memory (up to 40X for a 512 GB memory) while maintaining the prediction accuracy
    corecore