PaRiS: Causally Consistent Transactions with Non-blocking Reads and Partial Replication
Geo-replicated data platforms form the backbone of several large-scale
online services. Transactional Causal Consistency (TCC) is an attractive
consistency level for building such platforms. TCC avoids many anomalies of
eventual consistency, eschews the synchronization costs of strong consistency,
and supports interactive read-write transactions. Partial replication is
another attractive design choice for building geo-replicated platforms, as it
increases the storage capacity and reduces update propagation costs. This paper
presents PaRiS, the first TCC system that supports partial replication and
implements non-blocking parallel read operations, whose latency is paramount
for the performance of read-intensive applications. PaRiS relies on a novel
protocol to track dependencies, called Universal Stable Time (UST). By means of
a lightweight background gossip process, UST identifies a snapshot of the data
that has been installed by every DC in the system. Hence, transactions can
consistently read from such a snapshot on any server in any replication site
without having to block. Moreover, PaRiS requires only one timestamp to track
dependencies and define transactional snapshots, thereby achieving resource
efficiency and scalability. We evaluate PaRiS on a large-scale AWS deployment
composed of up to 10 replication sites. We show that PaRiS scales well with the
number of DCs and partitions, while being able to handle larger data-sets than
existing solutions that assume full replication. We also demonstrate a
performance gain of non-blocking reads vs. a blocking alternative (up to 1.47x
higher throughput with 5.91x lower latency for read-dominated workloads and up
to 1.46x higher throughput with 20.56x lower latency for write-heavy
workloads).
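The idea behind UST described in the abstract above can be illustrated with a minimal sketch. It is not the PaRiS implementation: the names Server, installed_up_to, and universal_stable_time are hypothetical, timestamps are plain integers, and the gossip exchange is reduced to a single min over the values that servers report. The sketch only shows why a snapshot at or below that minimum can be read on any server without blocking.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical server in some DC; not a PaRiS class."""
    dc: str
    # Assumption: every update with timestamp <= installed_up_to is installed here.
    installed_up_to: int
    # key -> list of (timestamp, value) versions stored on this server.
    versions: dict = field(default_factory=dict)

    def read(self, key, snapshot_ts):
        """Return the newest version of key with timestamp <= snapshot_ts, or None."""
        visible = [v for v in self.versions.get(key, []) if v[0] <= snapshot_ts]
        if not visible:
            return None
        return max(visible, key=lambda v: v[0])[1]


def universal_stable_time(servers):
    """Stand-in for the gossip result: the lowest timestamp installed everywhere."""
    return min(s.installed_up_to for s in servers)


servers = [
    Server("dc1", 12),
    Server("dc2", 9),
    Server("dc3", 15, {"x": [(5, "a"), (9, "b"), (14, "c")]}),
]
ust = universal_stable_time(servers)
# Reading at the UST never needs to wait: every server has installed that prefix.
value = servers[2].read("x", ust)
```

Because the snapshot is identified by one integer, the sketch also mirrors the abstract's point that a single timestamp suffices to define a transactional snapshot.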
MDCC: Multi-Data Center Consistency
Replicating data across multiple data centers not only allows moving the data
closer to the user and, thus, reduces latency for applications, but also
increases the availability in the event of a data center failure. Therefore, it
is not surprising that companies like Google, Yahoo, and Netflix already
replicate user data across geographically different regions.
However, replication across data centers is expensive. Inter-data center
network delays are in the hundreds of milliseconds and vary significantly.
Synchronous wide-area replication is therefore considered to be infeasible with
strong consistency, and current solutions either settle for asynchronous
replication, which implies the risk of losing data in the event of failures,
restrict consistency to small partitions, or give up consistency entirely. With
MDCC (Multi-Data Center Consistency), we describe the first optimistic commit
protocol that does not require a master or partitioning and is strongly
consistent at a cost similar to eventually consistent protocols. MDCC can
commit transactions in a single round-trip across data centers in the normal
operational case. We further propose a new programming model which empowers the
application developer to handle longer and unpredictable latencies caused by
inter-data center communication. Our evaluation using the TPC-W benchmark with
MDCC deployed across 5 geographically diverse data centers shows that MDCC is
able to achieve throughput and latency similar to eventually consistent quorum
protocols and that MDCC is able to sustain a data center outage without a
significant impact on response times while guaranteeing strong consistency.
CloudTPS: Scalable Transactions for Web Applications in the Cloud
NoSQL cloud data services provide scalability and high availability for web applications, but they sacrifice data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager that allows cloud database services to execute the ACID transactions of web applications, even in the presence of server failures and network partitions. We implement this approach on top of the two main families of scalable data layers: Bigtable and SimpleDB. Performance evaluation on top of HBase (an open-source version of Bigtable) in our local cluster and Amazon SimpleDB in the Amazon cloud shows that our system scales linearly at least up to 40 nodes in our local cluster and 80 nodes in the Amazon cloud.
ElasTraS: An Elastic Transactional Data Store in the Cloud
Over the last couple of years, "Cloud Computing" or "Elastic Computing" has
emerged as a compelling and successful paradigm for internet scale computing.
One of the major contributing factors to this success is the elasticity of
resources. In spite of the elasticity provided by the infrastructure and the
scalable design of the applications, the elephant (or the underlying database),
which drives most of these web-based applications, is not very elastic and
scalable, and hence limits scalability. In this paper, we propose ElasTraS,
which addresses the scalability and elasticity of the data store in a
cloud computing environment, leveraging the elastic nature of the
underlying infrastructure while providing scalable transactional data access.
This paper aims at providing the design of a system in progress, highlighting
the major design choices, analyzing the different guarantees provided by the
system, and identifying several important challenges for the research community
striving for computing in the cloud. Comment: 5 Pages, In Proc. of USENIX HotCloud 200
Transaction Processing over Geo-Partitioned Data
Databases are a fundamental component of any web service, storing and managing all the
service data. In large-scale web services, it is essential that the data storage systems used
consider techniques such as partial replication, geo-replication, and weaker consistency
models so that users' expectations of these systems regarding availability and latency can
be met as well as possible.
In this dissertation, we address the problem of executing transactions on data that is
partially replicated. In this sense, we adopt the transactional causal consistency semantics,
the consistency model where a transaction accesses a causally consistent snapshot of the
database. However, implementing this consistency model in a partially replicated setting
raises several challenges regarding handling transactions that access data items replicated
in different nodes.
Our work aims to design and implement a novel algorithm for executing transactions
over geo-partitioned data with transactional causal consistency semantics. We discuss
the problems and design choices for executing transactions over partially replicated data
and present a design to implement the proposed algorithm by extending a weakly consistent
geo-replicated key-value store with partial replication, adding support for executing
transactions involving geo-partitioned data items. In this context, we also addressed the
problem of deciding the best strategy for searching data in replicas that hold only a part
of the total data of a service and where the state of each replica might diverge.
We evaluate our solution using microbenchmarks based on the TPC-H database. Our
results show that the overhead of the system is low for the expected scenario of a low
ratio of remote transactions.
From Controlled Data-Center Environments to Open Distributed Environments: Scalable, Efficient, and Robust Systems with Extended Functionality
The past two decades have witnessed several paradigm shifts in computing environments, starting with cloud computing, which offers on-demand allocation of storage, network, compute, and memory resources, as well as other services, in a pay-as-you-go billing model, and ending with the rise of permissionless blockchain technology, a decentralized computing paradigm with lower trust assumptions and a limitless number of participants. Unlike in the cloud, where all the computing resources are owned by some trusted cloud provider, permissionless blockchains allow computing resources owned by possibly malicious parties to join and leave their network without obtaining permission from some centralized trusted authority. Still, in the presence of malicious parties, permissionless blockchain networks can perform general computations and make progress. Cloud computing is powered by geographically distributed data centers controlled and managed by trusted cloud service providers and promises theoretically infinite computing resources. On the other hand, permissionless blockchains are powered by open networks of geographically distributed computing nodes owned by entities that are not necessarily known or trusted. This paradigm shift requires a reconsideration of distributed data management protocols and distributed system designs that assume low latency across system components, inelastic computing resources, or fully trusted computing resources. In this dissertation, we propose new system designs and optimizations that address scalability and efficiency of distributed data management systems in cloud environments. We also propose several protocols and new programming paradigms to extend the functionality and enhance the robustness of permissionless blockchains.
The work presented spans global-scale transaction processing, large-scale stream processing, atomic transaction processing across permissionless blockchains, and extending the functionality and the use cases of permissionless blockchains. In all these directions, the focus is on rethinking system and protocol designs to account for novel cloud and permissionless blockchain assumptions. For global-scale transaction processing, we propose GPlacer, a placement optimization framework that decides replica placement of fully and partially geo-replicated databases. For large-scale stream processing, we propose Cache-on-Track (CoT), an adaptive and elastic client-side cache that addresses server-side load imbalances that occur in large-scale distributed storage layers. In permissionless blockchain transaction processing, we propose AC3WN, the first correct cross-chain commitment protocol that guarantees atomicity of cross-chain transactions. Also, we propose TXSC, a transactional smart contract programming framework. TXSC provides smart contract developers with transaction primitives. These primitives allow developers to write smart contracts without the need to reason about the anomalies that can arise due to concurrent smart contract function executions. In addition, we propose a forward-looking architecture that unifies both permissioned and permissionless blockchains and exploits the running infrastructure of permissionless blockchains to build global asset management systems.
Robust data storage in a network of computer systems
PhD Thesis. Robustness of data in this thesis is taken to mean reliable
storage of data and also high availability of data objects in spite
of the occurrence of faults. Algorithms and data structures which
can be used to provide such robustness in the presence of various
disk, processor and communication network failures are described.
Reliable storage of data at individual nodes in a network of
computer systems is based on the use of a stable storage mechanism
combined with strategies which are used to help ensure crash resistance
of file operations in spite of the use of buffering mechanisms
by operating systems. High availability of data in the network
is maintained by replicating data on different computers and
mutual consistency between replicas is ensured in spite of network
partitioning.
A stable storage system which provides atomicity for more complex data structures instead of the usual fixed size page has been
designed and implemented and its performance evaluated. A crash
resistant file system has also been implemented and evaluated.
Many of the techniques presented here are used in the design
of what we call CRES (Crash-resistant, Replicated and Stable)
storage. CRES storage provides fault tolerance facilities for
various disk and processor faults. It also provides fault tolerance facilities for network partitioning through the provision of an algorithm for the update and merge of a partitioned data storage system.