Eventual Consistency: Origin and Support
Eventual consistency is now in demand for geo-replicated services that need to be highly scalable and available. According to the CAP constraints, when network partitions may arise, a distributed service must choose between being strongly consistent and being highly available. Since scalable services should be available, a relaxed consistency (while the network is partitioned) is the preferred choice. Eventual consistency is not a common data-centric consistency model, but only a state convergence condition to be added to a relaxed consistency model. Several aspects of eventual consistency have not been analysed in depth in previous works: (1) which are the oldest replication proposals providing eventual consistency, (2) which replica consistency models provide the best basis for building eventually consistent services, (3) which mechanisms should be considered for implementing an eventually consistent service, and (4) which are the best combinations of those mechanisms for achieving different concrete goals. This paper provides some notes on these important topics.
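The "state convergence condition" mentioned in the abstract can be illustrated with a small sketch. The example below uses a grow-only counter with an element-wise-max merge, which is one well-known way to obtain convergence and is not claimed to be a mechanism the paper itself analyses. The class name `GCounter` and its methods are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter (illustrative): one slot per replica, merged by max."""
    replica_id: str
    counts: dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # Each replica only ever increases its own slot.
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative and idempotent, so
        # replicas that exchange states in any order end up in the same state.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


# Two replicas accept updates independently (for instance during a partition).
a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)

# Once they can communicate again, exchanging states makes them converge.
a.merge(b)
b.merge(a)
print(a.value(), b.value())
```

While partitioned, the two replicas report different values (a relaxed consistency); after the exchange, both hold the same state, which is the convergence condition the abstract refers to.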
High performance data processing
Master's dissertation in Informatics Engineering
As applications reach a wider audience than ever before, they must process larger and larger numbers of requests. In addition, they often must be able to serve users all over the globe, where network latencies have a significant negative impact on
monolithic deployments. Therefore, distribution is a sought-after solution to
improve the performance of both the application and database layers. However, distributing
data is not an easy task if we want to ensure strong consistency guarantees. This leads
many database systems to rely on expensive synchronization protocols such
as two-phase commit, distributed consensus, and distributed locking, while
other systems rely on weak consistency, which is unsuitable for some use cases.
This thesis presents the design, implementation and evaluation of two solutions
aimed at reducing the impact of ensuring strong consistency guarantees on database
systems, especially geo-distributed ones. The first is Primary Semi-Primary, a fully replicated distributed database architecture that allows different replicas to evolve
independently, so that clients need not wait for preceding non-conflicting updates to be propagated. Although replicas can process both reads and writes, improving scalability, the system
still ensures strong consistency guarantees by relaying transaction certification
to a central node. Its design is independent of the underlying data model, but its
implementation can take advantage of the native concurrency control offered by some
systems, as is exemplified by an implementation using PostgreSQL and its Snapshot
Isolation. The results show several advantages in both throughput and response time
compared with alternative architectures, in both local and geo-distributed
environments. The second solution is Multi-Record Values, a technique that dynamically partitions numeric values into multiple records, allowing concurrent writes to
execute with low conflict probability, reducing the abort rate and/or lock contention.
Lower-bound guarantees, required by objects such as balances or stocks, are ensured by
this strategy, unlike many similar alternatives. Its design is also data model
agnostic: its advantages can be found in both SQL and NoSQL systems, as well
as in both centralized and distributed databases, as presented in the evaluation section.
Robust data storage in a network of computer systems
PhD Thesis
Robustness of data in this thesis is taken to mean reliable
storage of data and also high availability of data objects in spite
of the occurrence of faults. Algorithms and data structures which
can be used to provide such robustness in the presence of various
disk, processor and communication network failures are described.
Reliable storage of data at individual nodes in a network of
computer systems is based on the use of a stable storage mechanism
combined with strategies which are used to help ensure crash resistance
of file operations in spite of the use of buffering mechanisms
by operating systems. High availability of data in the network
is maintained by replicating data on different computers and
mutual consistency between replicas is ensured in spite of network
partitioning.
A stable storage system which provides atomicity for more complex data structures instead of the usual fixed size page has been
designed and implemented and its performance evaluated. A crash
resistant file system has also been implemented and evaluated.
Many of the techniques presented here are used in the design
of what we call CRES (Crash-resistant, Replicated and Stable)
storage. CRES storage provides fault tolerance facilities for
various disk and processor faults. It also provides fault tolerance facilities for network partitioning through the provision of an algorithm for the update and merge of a partitioned data storage
system.
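The concern about operating-system buffering undermining crash resistance can be made concrete with a common present-day idiom, shown here as a sketch. It is not the stable storage mechanism or the CRES design from the thesis: it writes to a temporary file, forces it to disk, and then atomically renames it over the target. The function name `atomic_write` is hypothetical, and the directory `fsync` step assumes a POSIX system.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to push its buffers to the disk
        os.replace(tmp, path)     # atomic rename over the old version
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    # Assumption: POSIX. Syncing the directory makes the rename itself durable.
    if hasattr(os, "O_DIRECTORY"):
        dfd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dfd)
        finally:
            os.close(dfd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "record.bin")
        atomic_write(target, b"version 1")
        atomic_write(target, b"version 2")
        with open(target, "rb") as f:
            print(f.read())
```

Without the explicit `fsync` calls, the data may still sit in operating-system buffers when a crash occurs, which is the kind of hazard the abstract says crash-resistant file operations must account for.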
Fault tolerance via diversity for off-the-shelf products: A study with SQL database servers
If an off-the-shelf software product exhibits poor dependability due to design faults, then software fault tolerance is often the only way available to users and system integrators to alleviate the problem. Thanks to low acquisition costs, even using multiple versions of software in a parallel architecture, a scheme formerly reserved for a few highly critical applications, may become viable for many applications. We have studied the potential dependability gains from these solutions for off-the-shelf database servers. We based the study on the bug reports available for four off-the-shelf SQL servers plus later releases of two of them. We found that many of these faults cause systematic non-crash failures, a category ignored by most studies and standard implementations of fault tolerance for databases. Our observations suggest that diverse redundancy would be effective for tolerating design faults in this category of products. Only in very few cases would demands that triggered a bug in one server cause failures in another one, and there were no coincident failures in more than two of the servers. Use of different releases of the same product would also tolerate a significant fraction of the faults. We report our results and discuss their implications, the architectural options available for exploiting them, and the difficulties that they may present.
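One way diverse redundancy could mask non-crash failures is to send the same query to several different servers and accept the answer a majority agrees on. The sketch below illustrates that idea only; it is not the architecture studied in the paper. The function names `vote` and `normalise`, the representation of servers as plain callables, and the rule of comparing results as order-insensitive row sets are all assumptions.

```python
from collections import Counter
from typing import Callable, Iterable, Sequence

Row = Sequence[object]
Server = Callable[[str], Iterable[Row]]  # hypothetical: runs a query, yields rows


def normalise(rows: Iterable[Row]) -> tuple:
    # Assumption: row order is not significant, so sort before comparing.
    return tuple(sorted((tuple(r) for r in rows), key=repr))


def vote(query: str, servers: dict[str, Server]) -> tuple[list, list[str]]:
    """Return the majority answer and the names of the servers that disagreed."""
    answers = {}
    for name, run in servers.items():
        try:
            answers[name] = normalise(run(query))
        except Exception:
            continue  # a crash failure: this server simply casts no vote
    if not answers:
        raise RuntimeError("every server failed")
    winner, votes = Counter(answers.values()).most_common(1)[0]
    if votes <= len(servers) // 2:
        raise RuntimeError("no majority: the failure cannot be masked")
    dissenters = sorted(n for n in servers if answers.get(n) != winner)
    return list(winner), dissenters


# Three stand-in "servers"; the third silently returns an incomplete result.
servers = {
    "server_a": lambda q: [(1, "x"), (2, "y")],
    "server_b": lambda q: [(2, "y"), (1, "x")],
    "server_c": lambda q: [(1, "x")],
}
print(vote("SELECT id, name FROM t", servers))
```

The third server neither crashes nor reports an error, so only the comparison with the other two reveals the failure. In practice, differences between SQL dialects and result formats would make the comparison considerably harder than this sketch suggests.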
Optimistic replication
Data replication is a key technology in distributed data sharing systems, enabling higher availability and performance. This paper surveys optimistic replication algorithms that allow replica contents to diverge in the short term, in order to support concurrent work practices and to tolerate failures in low-quality communication links. The importance of such techniques is increasing as collaboration through wide-area and mobile networks becomes popular. Optimistic replication techniques are different from traditional “pessimistic” ones. Instead of synchronous replica coordination, an optimistic algorithm propagates changes in the background, discovers conflicts after they happen and reaches agreement on the final contents incrementally. We explore the solution space for optimistic replication algorithms. This paper identifies key challenges facing optimistic replication systems (ordering operations, detecting and resolving conflicts, propagating changes efficiently, and bounding replica divergence) and provides a comprehensive survey of techniques developed for addressing these challenges.
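Discovering conflicts after they happen requires some way to tell whether two replica versions are ordered or concurrent. A version vector comparison is one widely used technique for this, sketched below under the assumption that each replica keeps a per-replica update counter. The function name `compare` and the string results are hypothetical.

```python
def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Compare two version vectors (replica id -> number of updates seen)."""
    keys = a.keys() | b.keys()
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # b already includes everything in a
    if b_le_a:
        return "after"       # a already includes everything in b
    return "concurrent"      # neither dominates: a conflict to resolve


# Replica r1 and replica r2 each updated the object without seeing the other.
print(compare({"r1": 2, "r2": 1}, {"r1": 1, "r2": 2}))
# Here the second version has simply seen one more update from r2.
print(compare({"r1": 1, "r2": 1}, {"r1": 1, "r2": 2}))
```

Only the "concurrent" case needs conflict resolution; in the "before" and "after" cases the newer version can replace the older one when changes are propagated in the background.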