3,881 research outputs found
A Wait-free Multi-word Atomic (1,N) Register for Large-scale Data Sharing on Multi-core Machines
We present a multi-word atomic (1,N) register for multi-core machines
exploiting Read-Modify-Write (RMW) instructions to coordinate the writer and
the readers in a wait-free manner. Our proposal, called Anonymous Readers
Counting (ARC), enables large-scale data sharing by admitting up to
concurrent readers on off-the-shelf 64-bits machines, as opposed to the most
advanced RMW-based approach which is limited to 58 readers. Further, ARC avoids
multiple copies of the register content when accessing it---this affects
classical register's algorithms based on atomic read/write operations on single
words. Thus it allows for higher scalability with respect to the register size.
Moreover, ARC explicitly reduces improves performance via a proper limitation
of RMW instructions in case of read operations, and by supporting constant time
for read operations and amortized constant time for write operations. A proof
of correctness of our register algorithm is also provided, together with
experimental data for a comparison with literature proposals. Beyond assessing
ARC on physical platforms, we carry out as well an experimentation on
virtualized infrastructures, which shows the resilience of wait-free
synchronization as provided by ARC with respect to CPU-steal times, proper of
more modern paradigms such as cloud computing.Comment: non
On the Space Complexity of Set Agreement
The -set agreement problem is a generalization of the classical consensus
problem in which processes are permitted to output up to different input
values. In a system of processes, an -obstruction-free solution to the
problem requires termination only in executions where the number of processes
taking steps is eventually bounded by . This family of progress conditions
generalizes wait-freedom () and obstruction-freedom (). In this
paper, we prove upper and lower bounds on the number of registers required to
solve -obstruction-free -set agreement, considering both one-shot and
repeated formulations. In particular, we show that repeated set agreement
can be solved using registers and establish a nearly matching lower
bound of
High performance data processing
Dissertação de mestrado em Informatics EngeneeringÀ medida que as aplicações atingem uma maior quantidade de utilizadores, precisam de processar uma crescente quantidade de pedidos. Para além disso, precisam de muitas vezes satisfazer pedidos de utilizadores de diferentes partes do globo, onde
as latências de rede têm um impacto significativo no desempenho em instalações
monolÃticas. Portanto, distribuição é uma solução muito procurada para melhorar a
performance das camadas aplicacional e de dados. Contudo, distribuir dados não é
uma tarefa simples se pretendemos assegurar uma forte consistência. Isto leva a que
muitos sistemas de base de dados dependam de protocolos de sincronização pesados,
como two-phase commit, consenso distribuÃdo, bloqueamento distribuÃdo, entre outros,
enquanto que outros sistemas dependem em consistência fraca, não viável para alguns
casos de uso.
Esta tese apresenta o design, implementação e avaliação de duas soluções que
têm como objetivo reduzir o impacto de assegurar garantias de forte consistência
em sistemas de base de dados, especialmente aqueles distribuÃdos pelo globo. A
primeira é o Primary Semi-Primary, uma arquitetura de base de dados distribuÃda
com total replicação que permite que as réplicas evoluam independentemente, para
evitar que os clientes precisem de esperar que escritas precedentes que não geram
conflitos sejam propagadas. Apesar das réplicas poderem processar tanto leituras
como escritas, melhorando a escalabilidade, o sistema continua a oferecer garantias de
consistência forte, através do envio da certificação de transações para um nó central.
O seu design é independente de modelos de dados, mas a sua implementação pode
tirar partido do controlo de concorrência nativo oferecido por algumas base de dados,
como é mostrado na implementação usando PostgreSQL e o seu Snapshot Isolation.
Os resultados apresentam várias vantagens tanto em ambientes locais como globais. A
segunda solução são os Multi-Record Values, uma técnica que particiona dinâmicamente
valores numéricos em múltiplos registros, permitindo que escritas concorrentes possam
executar com uma baixa probabilidade de colisão, reduzindo a taxa de abortos e/ou
contenção na adquirição de locks. Garantias de limites inferiores, exigido por objetos
como saldos bancários ou inventários, são assegurados por esta estratégia, ao contrário
de muitas outras alternativas. O seu design é também indiferente do modelo de dados,
sendo que as suas vantagens podem ser encontradas em sistemas SQL e NoSQL, bem
como distribuÃdos ou centralizados, tal como apresentado na secção de avaliação.As applications reach an wider audience that ever before, they must process larger and larger amounts of requests. In addition, they often must be able to serve users all over the globe, where network latencies have a significant negative impact on
monolithic deployments. Therefore, distribution is a well sought-after solution to
improve performance of both applicational and database layers. However, distributing
data is not an easy task if we want to ensure strong consistency guarantees. This leads
many databases systems to rely on expensive synchronization controls protocols such
as two-phase commit, distributed consensus, distributed locking, among others, while
other systems rely on weak consistency, unfeasible for some use cases.
This thesis presents the design, implementation and evaluation of two solutions
aimed at reducing the impact of ensuring strong consistency guarantees on database
systems, especially geo-distributed ones. The first is the Primary Semi-Primary, a full replication distributed database architecture that allows different replicas to evolve
independently, to avoid that clients wait for preceding non-conflicting updates. Al though replicas can process both reads and writes, improving scalability, the system
still ensures strong consistency guarantees, by relaying transactions’ certifications
to a central node. Its design is independent of the underlying data model, but its
implementation can take advantage of the native concurrency control offered by some
systems, as is exemplified by an implementation using PostgreSQL and its Snapshot
Isolation. The results present several advantages in both throughput and response time,
when comparing to other alternative architectures, in both local and geo-distributed
environments. The second solution is the Multi-Record Values, a technique that dynami cally partitions numeric values into multiple records, allowing concurrent writes to
execute with low conflict probability, reducing abort rate and/or locking contention.
Lower limit guarantees, required by objects such as balances or stocks, are ensure by
this strategy, unlike many other similar alternatives. Its design is also data model
agnostic, given its advantages can be found in both SQL and NoSQL systems, as well
as both centralized and distributed database, as presented in the evaluation section
Intermediate Value Linearizability: A Quantitative Correctness Criterion
Big data processing systems often employ batched updates and data sketches to
estimate certain properties of large data. For example, a CountMin sketch
approximates the frequencies at which elements occur in a data stream, and a
batched counter counts events in batches. This paper focuses on the correctness
of concurrent implementations of such objects. Specifically, we consider
quantitative objects, whose return values are from a totally ordered domain,
with an emphasis on -bounded objects that estimate a quantity with an
error of at most with probability at least .
The de facto correctness criterion for concurrent objects is linearizability.
Under linearizability, when a read overlaps an update, it must return the
object's value either before the update or after it. Consider, for example, a
single batched increment operation that counts three new events, bumping a
batched counter's value from to . In a linearizable implementation of
the counter, an overlapping read must return one of these. We observe, however,
that in typical use cases, any intermediate value would also be acceptable. To
capture this degree of freedom, we propose Intermediate Value Linearizability
(IVL), a new correctness criterion that relaxes linearizability to allow
returning intermediate values, for instance in the example above. Roughly
speaking, IVL allows reads to return any value that is bounded between two
return values that are legal under linearizability. A key feature of IVL is
that concurrent IVL implementations of -bounded objects remain
-bounded. To illustrate the power of this result, we give a
straightforward and efficient concurrent implementation of an -bounded
CountMin sketch, which is IVL (albeit not linearizable). Finally, we show that
IVL allows for inherently cheaper implementations than linearizable ones
Extending Eventually Consistent Cloud Databases for Enforcing Numeric Invariants
Geo-replicated databases often operate under the principle of eventual
consistency to offer high-availability with low latency on a simple key/value
store abstraction. Recently, some have adopted commutative data types to
provide seamless reconciliation for special purpose data types, such as
counters. Despite this, the inability to enforce numeric invariants across all
replicas still remains a key shortcoming of relying on the limited guarantees
of eventual consistency storage. We present a new replicated data type, called
bounded counter, which adds support for numeric invariants to eventually
consistent geo-replicated databases. We describe how this can be implemented on
top of existing cloud stores without modifying them, using Riak as an example.
Our approach adapts ideas from escrow transactions to devise a solution that is
decentralized, fault-tolerant and fast. Our evaluation shows much lower latency
and better scalability than the traditional approach of using strong
consistency to enforce numeric invariants, thus alleviating the tension between
consistency and availability
- …