SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine
Client-side logic and storage are increasingly used in web and mobile
applications to improve response time and availability. Current approaches tend
to be ad-hoc and poorly integrated with the server-side logic. We present a
principled approach to integrate client- and server-side storage. We support
mergeable and strongly consistent transactions that target either client or
server replicas and provide access to causally-consistent snapshots
efficiently. In the presence of infrastructure faults, a client-assisted
failover solution allows client execution to resume immediately and seamlessly
access consistent snapshots without waiting. We implement this approach in
SwiftCloud, the first transactional system to bring geo-replication all the way
to the client machine. Example applications show that our programming model is
useful across a range of application areas. Our experimental evaluation shows
that SwiftCloud provides better fault tolerance and at the same time can
improve both latency and throughput by up to an order of magnitude, compared to
classical geo-replication techniques.
Benchmarking Eventually Consistent Distributed Storage Systems
Cloud storage services and NoSQL systems typically offer only "Eventual Consistency", a rather weak guarantee covering a broad range of potential data consistency behavior. The degree of actual (in-)consistency, however, is unknown. This work presents novel solutions for determining the degree of (in-)consistency via simulation and benchmarking, as well as the necessary means to resolve inconsistencies leveraging this information.
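A common way to quantify eventual consistency in such benchmarks is to measure the inconsistency window: write a fresh value through one endpoint, then poll a second endpoint until the value becomes visible. A minimal sketch, where `write_fn`, `read_fn`, and the in-memory "replica pair" with a simulated propagation delay are hypothetical stand-ins for a real store:

```python
import time

# Hypothetical staleness probe in the spirit of such benchmarks: write a
# fresh value through one endpoint, then poll a second endpoint until the
# value becomes visible, and report the elapsed inconsistency window.

def measure_window(write_fn, read_fn, key, value, timeout=5.0):
    start = time.monotonic()
    write_fn(key, value)
    while time.monotonic() - start < timeout:
        if read_fn(key) == value:
            return time.monotonic() - start
        time.sleep(0.005)  # poll every 5 ms
    return None  # value never became visible within the timeout

# Toy replica pair with a simulated 50 ms propagation delay:
primary, replica = {}, {}

def write_fn(k, v):
    primary[k] = (v, time.monotonic())

def read_fn(k):
    if k in primary:
        v, written_at = primary[k]
        if time.monotonic() - written_at > 0.05:  # propagation lag
            replica[k] = v
    return replica.get(k)

window = measure_window(write_fn, read_fn, "x", "v1")
print(f"inconsistency window ~ {window:.3f}s")
```

Against a real store, the same probe runs with the client libraries of two replicas in place of the toy dictionaries.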
New Production System for Finnish Meteorological Institute
This thesis presents the plans for replacing the production system of Finnish Meteorological Institute (FMI). It begins with a review of the state of the art in distributed systems research, and ends with a design for the replacement production system that is reliable, scalable, and maintainable.
The subject production system is a framework for managing the production of different weather predictions and models. We use this framework to abstract away the actual execution of work from its description. This way the different production processes become easily monitored and configured through the production system.
Since the amount of data processed by this system is too much for a single computer to handle, the production system is distributed. Thus we are dealing not just with a framework for production but with a distributed system, and hence a solid understanding of distributed systems theory is required in order to replace it.
The first part of this thesis lays the groundwork for replacing the distributed production system: a review of the state of the art in distributed systems research. It is a concise document of its own which presents the essentials of distributed systems in a clear manner. This part can be used separately from the rest of this thesis as a short introduction to distributed systems.
The second part of this thesis presents the subject production system, the need for its replacement, and our design for a new production system that is maintainable, performant, available, reliable, and scalable. We go further than simply giving a design for this replacement: we present a practical plan to implement the new production system with Kubernetes, Brigade, and Riak CS.
Multi-value distributed key-value stores
Doctoral thesis in Informatics.

Many large-scale distributed data stores rely on optimistic replication to
scale and remain highly available in the face of network partitions. Managing
data without strong coordination results in eventually consistent
data stores that allow for concurrent data updates. To allow writing applications
in the absence of linearizability or transactions, the seminal
Dynamo data store proposed a multi-value API in which a get returns
the set of concurrently written values. In this scenario, it is important to
be able to accurately and efficiently identify updates executed concurrently.
Logical clocks are often used to track data causality, necessary
to distinguish concurrent from causally related writes on the same key.
However, in traditional mechanisms there is a non-negligible metadata
overhead per key, which also keeps growing with time, proportional to
the node churn rate. Another challenge is deleting keys while respecting
causality: while the values can be deleted, per-key metadata cannot
be permanently removed in current data stores.
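The causality tracking described above is typically built on version vectors: one write happens-after another only if its vector dominates the other's entry-wise, and two writes are concurrent when neither dominates. A minimal sketch (illustrative only, not the thesis's mechanism; the replica names are made up):

```python
# Classify two writes to the same key by comparing their version vectors,
# represented as {replica_id: counter} dicts.

def dominates(a, b):
    """True if vector a has seen every event recorded in vector b."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())

def classify(a, b):
    if dominates(a, b) and dominates(b, a):
        return "equal"
    if dominates(a, b):
        return "a happens-after b"
    if dominates(b, a):
        return "b happens-after a"
    return "concurrent"

# Two writes observed at different replicas:
v1 = {"nodeA": 2, "nodeB": 1}
v2 = {"nodeA": 1, "nodeB": 3}
print(classify(v1, v2))  # concurrent: neither vector dominates the other
```

The per-key metadata overhead the text mentions is visible here: every key stores one counter per replica that ever wrote it, so churn inflates the vectors over time.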
These systems often use anti-entropy mechanisms (like Merkle Trees)
to detect and repair divergent data versions across nodes. However,
in practice hash-based data structures are ill-suited to a store using
consistent hashing, and they create too many false positives.
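For context, a Merkle-tree exchange compares digests over key ranges and descends only where they differ. A deliberately flat sketch (real stores build a recursive tree over key-range subtrees; the names here are illustrative):

```python
import hashlib

def h(x):
    """SHA-256 hex digest of a string."""
    return hashlib.sha256(x.encode()).hexdigest()

def merkle_root(items):
    """Digest over the sorted key/value leaves of one replica's key range."""
    leaves = [h(f"{k}={v}") for k, v in sorted(items.items())]
    return h("".join(leaves))

replica1 = {"k1": "a", "k2": "b"}
replica2 = {"k1": "a", "k2": "c"}
# Differing roots tell the replicas to exchange subtree hashes and descend
# until the divergent keys (here, k2) are located and repaired.
print(merkle_root(replica1) == merkle_root(replica2))  # False
```

The false positives the text refers to arise when replicas hash byte-level representations that differ (e.g., in causal metadata) even though the logical values already agree, triggering needless repair traffic.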
Also, highly available systems usually provide eventual consistency,
which is the weakest form of consistency. This results in a programming
model that is difficult to use and to reason about. It has been proven that
causal consistency is the strongest consistency model achievable if we
want highly available services. It provides better programming semantics
such as session guarantees. However, classical causal consistency
is a memory model that is problematic for concurrent updates in
the absence of concurrency-control primitives. Used in eventually consistent
data stores, it forces arbitration between concurrent updates,
which leads to data loss.

We propose three novel techniques in this thesis. The first is Dotted
Version Vectors: a solution that combines a new logical clock mechanism
and a request handling workflow that together support the traditional
Dynamo key-value store API while capturing causality in an
accurate and scalable way, avoiding false conflicts. It maintains concise
information per version, linear only in the number of replicas, and includes
a container data structure that allows sets of concurrent versions
to be merged efficiently, with time complexity linear in the number of
replicas plus versions.
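The write path of such a dotted-clock design can be sketched as follows: each stored version carries a dot (replica, counter), and a new write supplies the causal context (a version vector) that the client last read; siblings covered by that context are obsolete, while the rest remain as concurrent values. This is an illustrative simplification, not the thesis's exact algorithm:

```python
# Each sibling is (dot, value), where dot = (replica_id, counter).
# The causal context is a version vector {replica_id: counter}.

def covered(dot, context):
    """True if the context has already seen the event identified by dot."""
    replica, counter = dot
    return context.get(replica, 0) >= counter

def write(siblings, context, new_value, replica, counter):
    """Apply a write: drop siblings the client's context supersedes,
    keep the truly concurrent ones, and add the new dotted version."""
    kept = [(d, v) for d, v in siblings if not covered(d, context)]
    kept.append(((replica, counter), new_value))
    return kept

siblings = [(("r1", 1), "old"), (("r2", 1), "concurrent")]
ctx = {"r1": 1}  # the client had read only version ("r1", 1)
print(write(siblings, ctx, "new", "r1", 2))
# [(('r2', 1), 'concurrent'), (('r1', 2), 'new')]
```

The old value is overwritten because the client had seen it, while the write from r2, which the client had not seen, survives as a concurrent sibling rather than being silently lost.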
The second is DottedDB: a Dynamo-like key-value store, which uses
a novel node-wide logical clock framework, overcoming three fundamental
limitations of the state of the art: (1) it minimizes the metadata per
key necessary to track causality, avoiding its growth even in the face
of node churn; (2) it correctly and durably deletes keys, with no need for
tombstones; and (3) it offers a lightweight anti-entropy mechanism to converge
replicated data, avoiding the need for Merkle Trees.
The third and final contribution is Causal Multi-Value Consistency: a
novel consistency model that respects the causality of client operations
while properly supporting concurrent updates without arbitration, by
having the same Dynamo-like multi-value nature. In addition, we extend
this model to provide the same semantics with read and write
transactions. For both models, we define an efficient implementation
on top of a distributed key-value store.

Fundação para a Ciência e Tecnologia (FCT) - with the research grant SFRH/BD/86735/201
Write Fast, Read in the Past: Causal Consistency for Client-Side Applications
Client-side apps (e.g., mobile or in-browser) need cloud data to be available in a cache, for both reads and updates. The cache should use resources sparingly, and be consistent and fault-tolerant. The system needs to scale to high numbers of unreliable, resource-poor clients, and to a large database. The SwiftCloud distributed object database is the first to provide fast reads and writes via a causally-consistent client-side local cache. It is thrifty in resources and scales well, thanks to consistent versioning provided by the cloud, using small and bounded metadata. It remains available during faults, switching to a different data centre when the current one is not responsive, while maintaining its consistency guarantees. This paper presents the SwiftCloud algorithms, design, and experimental evaluation. It shows that client-side apps enjoy high performance and availability, under the same guarantees as a remote cloud data store, at a small cost.