41,427 research outputs found
Fisheye Consistency: Keeping Data in Synch in a Georeplicated World
Over the last thirty years, numerous consistency conditions for replicated
data have been proposed and implemented. Popular examples of such conditions
include linearizability (or atomicity), sequential consistency, causal
consistency, and eventual consistency. These consistency conditions are usually
defined independently from the computing entities (nodes) that manipulate the
replicated data; i.e., they do not take into account how computing entities
might be linked to one another, or geographically distributed. To address this
lack, as a first contribution, this paper introduces the notion of proximity
graph between computing nodes. If two nodes are connected in this graph, their
operations must satisfy a strong consistency condition, while the operations
invoked by other nodes are allowed to satisfy a weaker condition. The second
contribution is the use of such a graph to provide a generic approach to the
hybridization of data consistency conditions into the same system. We
illustrate this approach on sequential consistency and causal consistency, and
present a model in which all data operations are causally consistent, while
operations by neighboring processes in the proximity graph are sequentially
consistent. The third contribution of the paper is the design and the proof of
a distributed algorithm based on this proximity graph, which combines
sequential consistency and causal consistency (the resulting condition is
called fisheye consistency). In doing so the paper not only extends the domain
of consistency conditions, but provides a generic provably correct solution of
direct relevance to modern georeplicated systems
Rigorous Design of Fault-Tolerant Transactions for Replicated Database Systems using Event B
System availability is improved by the replication of data objects in a distributed database system. However, during updates, the complexity of keeping replicas identical arises due to failures of sites and race conditions among conflicting transactions. Fault tolerance and reliability are key issues to be addressed in the design and architecture of these systems. Event B is a formal technique which provides a framework for developing mathematical models of distributed systems by rigorous description of the problem, gradually introducing solutions in refinement steps, and verification of solutions by discharge of proof obligations. In this paper, we present a formal development of a distributed system using Event B that ensures atomic commitment of distributed transactions consisting of communicating transaction components at participating sites. This formal approach carries the development of the system from an initial abstract specification of transactional updates on a one copy database to a detailed design containing replicated databases in refinement. Through refinement we verify that the design of the replicated database confirms to the one copy database abstraction
Distributed Machine Learning via Sufficient Factor Broadcasting
Matrix-parametrized models, including multiclass logistic regression and
sparse coding, are used in machine learning (ML) applications ranging from
computer vision to computational biology. When these models are applied to
large-scale ML problems starting at millions of samples and tens of thousands
of classes, their parameter matrix can grow at an unexpected rate, resulting in
high parameter synchronization costs that greatly slow down distributed
learning. To address this issue, we propose a Sufficient Factor Broadcasting
(SFB) computation model for efficient distributed learning of a large family of
matrix-parameterized models, which share the following property: the parameter
update computed on each data sample is a rank-1 matrix, i.e., the outer product
of two "sufficient factors" (SFs). By broadcasting the SFs among worker
machines and reconstructing the update matrices locally at each worker, SFB
improves communication efficiency --- communication costs are linear in the
parameter matrix's dimensions, rather than quadratic --- without affecting
computational correctness. We present a theoretical convergence analysis of
SFB, and empirically corroborate its efficiency on four different
matrix-parametrized ML models
- …