6,568 research outputs found

    Asynchronous replication of metadata across multi-master servers in distributed data storage systems

    Get PDF
    In recent years, scientific applications have become increasingly data intensive. The increase in the size of data generated by scientific applications necessitates collaboration and sharing data among the nation\u27s education and research institutions. To address this, distributed storage systems spanning multiple institutions over wide area networks have been developed. One of the important features of distributed storage systems is providing global unified name space across all participating institutions, which enables easy data sharing without the knowledge of actual physical location of data. This feature depends on the ``location metadata\u27\u27 of all data sets in the system being available to all participating institutions. This introduces new challenges. In this thesis, we study different metadata server layouts in terms of high availability, scalability and performance. A central metadata server is a single point of failure leading to low availability. Ensuring high availability requires replication of metadata servers. A synchronously replicated metadata servers layout introduces synchronization overhead which degrades the performance of data operations. We propose an asynchronously replicated multi-master metadata servers layout which ensures high availability, scalability and provides better performance. We discuss the implications of asynchronously replicated multi-master metadata servers on metadata consistency and conflict resolution. Further, we design and implement our own asynchronous multi-master replication tool, deploy it in the state-wide distributed data storage system called PetaShare, and compare performance of all three metadata server layouts: central metadata server, synchronously replicated multi-master metadata servers and asynchronously replicated multi-master metadata servers

    Towards Transaction as a Service

    Full text link
    This paper argues for decoupling transaction processing from existing two-layer cloud-native databases and making transaction processing as an independent service. By building a transaction as a service (TaaS) layer, the transaction processing can be independently scaled for high resource utilization and can be independently upgraded for development agility. Accordingly, we architect an execution-transaction-storage three-layer cloud-native database. By connecting to TaaS, 1) the AP engines can be empowered with ACID TP capability, 2) multiple standalone TP engine instances can be incorporated to support multi-master distributed TP for horizontal scalability, 3) multiple execution engines with different data models can be integrated to support multi-model transactions, and 4) high performance TP is achieved through extensive TaaS optimizations and consistent evolution. Cloud-native databases deserve better architecture: we believe that TaaS provides a path forward to better cloud-native databases
    • …
    corecore