
    The impossibility of boosting distributed service resilience

    We study f-resilient services, which are guaranteed to operate as long as no more than f of the associated processes fail. We prove three theorems asserting the impossibility of boosting the resilience of such services. Our first theorem allows any connection pattern between processes and services but assumes these services to be atomic (linearizable) objects. This theorem says that no distributed system in which processes coordinate using f-resilient atomic objects and reliable registers can solve the consensus problem in the presence of f + 1 undetectable process stopping failures. In contrast, we show that it is possible to boost the resilience of some systems solving problems easier than consensus: for example, the 2-set consensus problem is solvable for 2n processes and 2n − 1 failures (i.e., wait-free) using n-process consensus services resilient to n − 1 failures (wait-free). Our proof is short and self-contained. We then introduce the larger class of failure-oblivious services. These are services that cannot use information about failures, although they may behave more flexibly than atomic objects. An example of such a service is totally ordered broadcast. Our second theorem generalizes the first theorem and its proof to failure-oblivious services. Our third theorem allows the system to contain failure-aware services, such as failure detectors, in addition to failure-oblivious services. This theorem requires that each failure-aware service be connected to all processes; thus, f + 1 process failures overall can disable all the failure-aware services. In contrast, it is possible to boost the resilience of a system solving consensus using failure-aware services if arbitrary connection patterns between processes and services are allowed: consensus is solvable for any number of failures using only 1-resilient 2-process perfect failure detectors. As far as we know, this is the first time a unified framework has been used to describe both atomic and non-atomic objects, and the first time boosting analysis has been performed for services more general than atomic objects.
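The 2-set consensus example in the abstract has a very short construction: split the 2n processes into two groups of n and let each group decide through its own n-process consensus service, so at most two distinct values are ever decided. A minimal single-threaded sketch of that idea (all names are ours; failures are not simulated):

```python
class ConsensusService:
    """Toy n-process consensus object: the first proposed value wins."""
    def __init__(self):
        self._decided = None

    def propose(self, value):
        if self._decided is None:
            self._decided = value
        return self._decided


def two_set_consensus(proposals):
    """proposals: one value per process (2n entries). Returns all decisions."""
    n = len(proposals) // 2
    left, right = ConsensusService(), ConsensusService()
    decisions = []
    for i, value in enumerate(proposals):
        service = left if i < n else right  # process i uses its group's service
        decisions.append(service.propose(value))
    return decisions


decisions = two_set_consensus([10, 20, 30, 40])  # 2n = 4 processes
assert len(set(decisions)) <= 2  # at most two distinct decided values
```

Each group's service is used by only n processes, so its n − 1 resilience suffices within the group, while the system as a whole tolerates any 2n − 1 failures.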

    ACE: Abstract Consensus Encapsulation for Liveness Boosting of State Machine Replication

    With the emergence of attack-prone cross-organization systems, providing asynchronous state machine replication (SMR) solutions is no longer a theoretical concern. This paper presents ACE, a framework for the design of such fault tolerant systems. Leveraging a known paradigm for randomized consensus solutions, ACE wraps existing practical solutions and real-life systems, boosting their liveness under adversarial conditions and, at the same time, promoting load balancing and fairness. Boosting is achieved without modifying the overall design or the engineering of these solutions. ACE is aimed at boosting the prevailing approach for practical fault tolerance. This approach, often named partial synchrony, is based on a leader-based paradigm: a good leader makes progress and a bad leader does no harm. The partial synchrony approach focuses on safety and forgoes liveness under targeted and dynamic attacks. Specifically, an attacker might block specific leaders, e.g., through a denial of service, to prevent progress. ACE provides boosting by running waves of parallel leaders and selecting a winning leader only retroactively, achieving boosting at a linear communication cost increase. ACE is agnostic to the fault model, inheriting its failure model from the wrapped solution's assumptions. As our evaluation shows, an asynchronous Byzantine fault tolerance (BFT) replication system built with ACE around an existing partially synchronous BFT protocol demonstrates reasonable slow-down compared with the base BFT protocol during faultless synchronous scenarios, yet exhibits significant speedup while the system is under attack.
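The wave mechanism can be pictured with a toy, crash-only simulation: several leaders run in parallel each wave, and the winner is picked only retroactively among those that made progress, so blocking any single predetermined leader cannot stall a wave. All names here are hypothetical; real ACE wraps full protocol instances, not single calls:

```python
import random


def run_wave(leaders, blocked, rng):
    """Each leader tries to produce a proposal; blocked leaders produce none.
    The winner is chosen only after the wave, among leaders that progressed."""
    proposals = {l: f"block-from-{l}" for l in leaders if l not in blocked}
    if not proposals:
        return None  # adversary blocked every leader this wave
    winner = rng.choice(sorted(proposals))  # retroactive, unpredictable choice
    return proposals[winner]


rng = random.Random(7)
leaders = ["p1", "p2", "p3", "p4"]
# An adversary blocks one targeted leader per wave, yet every wave commits:
for wave in range(5):
    blocked = {rng.choice(leaders)}
    assert run_wave(leaders, blocked, rng) is not None
```

Because the winner is unknown until after the wave, an attacker cannot pick the one leader whose denial of service would block progress.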

    Failure detectors as type boosters

    The power of an object type T can be measured as the maximum number n of processes that can solve consensus using only objects of T and registers. This number, denoted cons(T), is called the consensus power of T. This paper addresses the question of the weakest failure detector to solve consensus among a number k > n of processes that communicate using shared objects of a type T with consensus power n. In other words, we seek a failure detector that is sufficient and necessary to "boost" the consensus power of a type T from n to k. It was shown in Neiger (Proceedings of the 14th annual ACM symposium on principles of distributed computing (PODC), pp. 100-109, 1995) that a certain failure detector, denoted Ωn, is sufficient to boost the power of a type T from n to k, and it was conjectured that Ωn was also necessary. In this paper, we prove this conjecture for one-shot deterministic types. We first show that, for any one-shot deterministic type T with cons(T) ≤ n, Ωn is necessary to boost the power of T from n to n+1. Then we go a step further and show that Ωn is also the weakest to boost the power of (n+1)-ported one-shot deterministic types from n to any k > n. Our result generalizes, in a precise sense, the result of the weakest failure detector to solve consensus in asynchronous message-passing systems (Chandra et al., J. ACM 43(4):685-722, 1996). As a corollary, we show that Ωt is the weakest failure detector to boost the resilience level of a distributed shared memory system, i.e., to solve consensus among n > t processes using (t − 1)-resilient objects of consensus power
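As a concrete instance of consensus power, a test-and-set object together with registers solves consensus for exactly two processes (its consensus power is 2). A hedged sketch of the standard two-process construction, with names of our own choosing: the winner of the test-and-set decides its own proposal, and the loser adopts the winner's announced value.

```python
import threading


class TestAndSet:
    """Toy linearizable test-and-set object (lock only models atomicity)."""
    def __init__(self):
        self._lock = threading.Lock()
        self._flag = False

    def test_and_set(self):
        with self._lock:
            old = self._flag
            self._flag = True
            return old


def two_process_consensus():
    tas = TestAndSet()
    register = [None, None]   # one single-writer register per process
    decisions = [None, None]

    def process(i, proposal):
        register[i] = proposal        # announce the proposal first
        if not tas.test_and_set():    # winner: first to flip the flag
            decisions[i] = proposal
        else:                         # loser: adopt the winner's value
            decisions[i] = register[1 - i]

    threads = [threading.Thread(target=process, args=(i, v))
               for i, v in enumerate(["a", "b"])]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions


d = two_process_consensus()
assert d[0] == d[1] and d[0] in ("a", "b")  # agreement and validity
```

The loser's read is safe because the winner wrote its register before winning the test-and-set; for three or more processes no such protocol exists, which is exactly what cons(T) captures.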

    Efficient Counting with Optimal Resilience

    In the synchronous c-counting problem, we are given a synchronous system of n nodes, where up to f of the nodes may be Byzantine, that is, have arbitrary faulty behaviour. The task is to have all of the correct nodes count modulo c in unison in a self-stabilising manner: regardless of the initial state of the system and the faulty nodes' behaviour, eventually rounds are consistently labelled by a counter modulo c at all correct nodes. We provide a deterministic solution with resilience f < n/3 that stabilises in O(f) rounds, in which every correct node broadcasts O(log² f) bits per round. We build and improve on a recent result offering stabilisation time O(f) and communication complexity O(log² f / log log f) but with sub-optimal resilience f = n^(1−o(1)) (PODC 2015). Our new algorithm has optimal resilience, asymptotically optimal stabilisation time, and low communication complexity. Finally, we modify the algorithm to guarantee that after stabilisation very little communication occurs. In particular, for optimal resilience and polynomial counter size c = n^O(1), the algorithm broadcasts only O(1) bits per node every Θ(n) rounds without affecting the other properties of the algorithm; communication-wise this is asymptotically optimal.
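The task itself (not the Byzantine-tolerant algorithm of the paper) can be illustrated with a fault-free toy: nodes start with arbitrary counters, adopt a common value in one exchange, and from then on label rounds modulo c in unison. Everything below is our illustration, not the paper's protocol:

```python
def stabilise(counters, c, rounds):
    """Fault-free toy: nodes exchange counters, adopt the minimum, and then
    all increment modulo c together, so every round is labelled in unison."""
    history = []
    for _ in range(rounds):
        agreed = min(counters)                 # one exchange: adopt a common value
        counters = [(agreed + 1) % c] * len(counters)
        history.append(counters[:])
    return history


# arbitrary initial state, counter modulo c = 4:
hist = stabilise([5, 0, 3, 7], c=4, rounds=3)
assert all(len(set(round_vals)) == 1 for round_vals in hist)  # unison each round
```

The whole difficulty of the paper lies in achieving this agreement despite f < n/3 Byzantine nodes and arbitrary initial states, with only O(log² f) broadcast bits per round.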

    Synchronization using failure detectors

    Many important synchronization problems in distributed computing are impossible to solve (in a fault-tolerant manner) in purely asynchronous systems, where message transmission delays and relative processor speeds are unbounded. It is then natural to seek the minimal synchrony assumptions that are sufficient to solve a given synchronization problem. A convenient way to describe synchrony assumptions is the failure detector abstraction. In this thesis, we determine the weakest failure detectors for several fundamental problems in distributed computing: solving fault-tolerant mutual exclusion, solving non-blocking atomic commit, and boosting the synchronization power of atomic objects. We conclude the thesis with a perspective on the very definition of failure detectors.

    Distributed eventual leader election in the crash-recovery and general omission failure models.

    Distributed applications are present in many aspects of everyday life. Banking, healthcare and transportation are examples of such applications. These applications are built on top of distributed systems. Roughly speaking, a distributed system is composed of a set of processes that collaborate to achieve a common goal. When building such systems, designers have to cope with several issues, such as differing synchrony assumptions and the occurrence of failures. Distributed systems must ensure that the delivered service is trustworthy. Agreement problems form a fundamental class of problems in distributed systems. All agreement problems follow the same pattern: all processes must agree on some common decision. Most agreement problems can be considered particular instances of the Consensus problem, and hence can be solved by reduction to consensus. However, a fundamental impossibility result, known as FLP, states that in an asynchronous distributed system it is impossible to achieve consensus deterministically when at least one process may fail. A way to circumvent this obstacle is to use unreliable failure detectors. A failure detector encapsulates the synchrony assumptions of the system, providing (possibly incorrect) information about process failures. A particular failure detector, called Omega, has been shown to be the weakest failure detector for solving consensus with a majority of correct processes. Informally, Omega provides an eventual leader election mechanism.
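The eventual-leader behaviour of Omega can be sketched with a toy rule: each process trusts the smallest-id process it currently believes to be alive. Early views may disagree or be wrong, but once crashes stop and the views of correct processes stabilise, all of them output the same leader. Names and the round model below are our illustration, not a real implementation:

```python
def elect(alive_ids):
    """Classic Omega rule of thumb: trust the smallest id believed alive."""
    return min(alive_ids)


# Rounds of (possibly inaccurate) alive-sets as seen by one process;
# process 1 crashes after round 0 and views eventually become accurate:
views = [{1, 2, 3}, {2, 3}, {2, 3}, {2, 3}]
leaders = [elect(v) for v in views]
assert leaders[-1] == leaders[-2] == 2  # leader is eventually stable
```

Omega only guarantees this *eventual* stability; before stabilisation, different processes may transiently trust different leaders, which is exactly why it is weak enough to be implementable under mild synchrony assumptions yet strong enough for consensus.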

    Education for innovation and entrepreneurship in the food system: the Erasmus+ BoostEdu approach and results

    Innovation and entrepreneurship are key factors in providing added value for food systems. Based on the findings of the Erasmus+ Strategic Partnership BoostEdu, the objective of this paper is to address three knowledge gaps: 1) identify the needs for innovation and entrepreneurship (I&E) in the food sector; 2) understand the best way to organize learning; 3) provide flexibility in turbulent times. BoostEdu aimed to provide a platform for continuing education within I&E for food professionals and was carried out through co-creation workshops and the development of an e-learning course. The results of the project, in particular during the Covid-19 pandemic, highlighted the need for flexible access to modules that are complementary to other sources and based on a mix of theoretical concepts and practical experiences. The main lessons learned concern the need for co-creation and co-learning processes to identify suitable practices for the use of innovative digital technologies.

    Automatic Machine Learning for Insurance: H2O Experiment

    Final thesis of the Master's in Actuarial and Financial Sciences, Faculty of Economics and Business, Universitat de Barcelona, academic year 2020-2021. Supervisor: Dr. Salvador Torra Porras. This thesis provides an introduction to machine learning (ML), shows the implications ML has for the insurance sector, and pays special attention to the H2O ensemble modelling approach for the binary classification task of insurance claim fraud detection. The aim of this thesis is to study the potential of H2O Automatic ML and to compare its results with traditional algorithms such as the linear perceptron, logistic regression, the multilayer perceptron, support vector machines and decision trees. Using the H2O web interface or R programming, the most efficient ML algorithms are obtained with little effort and provide better modelling metrics than traditional methods.