29,708 research outputs found

    CloudTPS: Scalable Transactions for Web Applications in the Cloud

    Get PDF
    NoSQL Cloud data services provide scalability and high availability properties for web applications but at the same time they sacrifice data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager to allow cloud database services to execute the ACID transactions of web applications, even in the presence of server failures and network partitions. We implement this approach on top of the two main families of scalable data layers: Bigtable and SimpleDB. Performance evaluation on top of HBase (an open-source version of Bigtable) in our local cluster and Amazon SimpleDB in the Amazon cloud shows that our system scales linearly at least up to 40 nodes in our local cluster and 80 nodes in the Amazon cloud

    Efficient Transaction Processing for Short-Lived Transactions in the Cloud

    Get PDF
    The cloud, in the past few years, has become the preferred platform for hosting web applications. Many of these web applications store their data in a distributed cloud storage system, which greatly simplifies application development and provides increased availability and reliability. However, with increasing user demand for web applications, these cloud storage systems often become the performance bottleneck. To address the cloud's performance demands, many storage system features, such as strong consistency and transactional support, are often omitted in favour of performance. Nonetheless, transactions remain necessary to ensure data integrity and application correctness. In this thesis, we introduce CrossStitch, which is an efficient transaction processing framework for distributed key-value storage systems. CrossStitch supports general transactions, where transactions include both computation and key accesses. It is specifically optimized for short-lived transactions that are typical of cloud-deployed web applications. In CrossStitch, a transaction is partitioned into a series of components that form a transaction chain. These components are executed and the transaction is propagated along the storage servers instead of being executed on the application server. This chained structure, in which servers only communicate with their immediate neighbours, enables CrossStitch to implement a pipelined version of two-phase commit to ensure transactional atomicity. CrossStitch is able to eliminate a significant amount of setup overhead using this structure by executing the transaction and the atomic commit protocol concurrently. Therefore, CrossStitch provides low latency and efficient transactional support for cloud storage systems. Our evaluation demonstrates that CrossStitch is a scalable and efficient transaction processing framework for web transactions

    LogBase: A Scalable Log-structured Database System in the Cloud

    Full text link
    Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. The shift from reads to writes in web applications has also been accelerating in recent years. Write-ahead-logging is a common approach for providing recovery capability while improving performance in most storage systems. However, the separation of log and application data incurs write overheads observed in write-heavy environments and hence adversely affects the write throughput and recovery time in the system. In this paper, we introduce LogBase - a scalable log-structured database system that adopts log-only storage for removing the write bottleneck and supporting fast system recovery. LogBase is designed to be dynamically deployed on commodity clusters to take advantage of elastic scaling property of cloud environments. LogBase provides in-memory multiversion indexes for supporting efficient access to data maintained in the log. LogBase also supports transactions that bundle read and write operations spanning across multiple records. We implemented the proposed system and compared it with HBase and a disk-based log-structured record-oriented system modeled after RAMCloud. The experimental results show that LogBase is able to provide sustained write throughput, efficient data access out of the cache, and effective system recovery.Comment: VLDB201

    ElasTraS: An Elastic Transactional Data Store in the Cloud

    Full text link
    Over the last couple of years, "Cloud Computing" or "Elastic Computing" has emerged as a compelling and successful paradigm for internet scale computing. One of the major contributing factors to this success is the elasticity of resources. In spite of the elasticity provided by the infrastructure and the scalable design of the applications, the elephant (or the underlying database), which drives most of these web-based applications, is not very elastic and scalable, and hence limits scalability. In this paper, we propose ElasTraS which addresses this issue of scalability and elasticity of the data store in a cloud computing environment to leverage from the elastic nature of the underlying infrastructure, while providing scalable transactional data access. This paper aims at providing the design of a system in progress, highlighting the major design choices, analyzing the different guarantees provided by the system, and identifying several important challenges for the research community striving for computing in the cloud.Comment: 5 Pages, In Proc. of USENIX HotCloud 200

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Full text link
    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for optimized solution to a specific real world problem, big data system are not an exception to any such rule. As far as the storage aspect of any big data system is concerned, the primary facet in this regard is a storage infrastructure and NoSQL seems to be the right technology that fulfills its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents feature and use case analysis and comparison of the four main data models namely document oriented, key value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings forth second facet of big data storage, big data file formats, into picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage and its challenges and future prospects have also been discussed
    corecore