7 research outputs found
Timestamp-Based Approach for the Detection and Resolution of Mutual Conflicts in Distributed Systems
We present a timestamp-based algorithm for detecting both write-write and read-write conflicts on a single file in distributed systems during network partitions. The algorithm allows operations to proceed in different network partitions simultaneously. When sites from different partitions merge, it detects and resolves both read-write and write-write conflicts without taking the semantics of the transactions into account. Reconciliation steps for resolving the detected conflicts are also proposed. The algorithm will be useful in real-time systems where timeliness of operations is more important than response time (delayed commit
Permission-based fault tolerant mutual exclusion algorithm for mobile Ad Hoc networks
This study addresses the problem of mutual exclusion in mobile ad hoc networks. A Mobile Ad Hoc Network (MANET) is a wireless network without fixed infrastructure. Nodes are mobile, and the topology of a MANET changes frequently and unpredictably. Because of these characteristics, conventional mutual exclusion algorithms designed for distributed systems (DS) are not applicable to MANETs unless they are combined with a mechanism that handles dynamic changes in topology.
Mutual exclusion algorithms for DS fall into two main classes: token-based and permission-based. Token-based algorithms depend on the circulation of a specific message known as the token. The owner of the token has priority for entering the critical section. The token may be lost during communication because of a link failure or the failure of the token host, and the processes for token-loss detection and token regeneration are complicated and time-consuming. Token-based algorithms are generally not fault-tolerant (although some mechanisms are used to increase their level of fault tolerance) because the single token is a single point of failure. In contrast, permission-based algorithms rely on the permission of multiple nodes to guarantee mutual exclusion. This leads to high traffic when the number of nodes is large. Moreover, in a MANET the number of message transmissions and the energy consumption grow with the number of mobile nodes involved in every decision-making cycle.
The purpose of this study is to introduce a method for managing the critical section, named Ancestral, that offers higher fault tolerance than token-based algorithms and fewer message transmissions and less traffic than permission-based algorithms. The method is a trade-off between the two classes. Like permission-based algorithms, it does not use a token; like token-based algorithms, the node that most recently held the critical section influences the entry of the next node. The algorithm based on Ancestral is named DAD. It increases the availability of a fully connected network by between 2.86% and 59.83% and decreases the number of message transmissions from 4j-2 to 3j (where j is the number of nodes in the partition).
This method is then used as the basis of a dynamic ancestral mutual exclusion algorithm for MANETs, named MDA. The algorithm is presented and evaluated for different scenarios of node mobility, failure, load, and number of nodes. The results show that MDA guarantees mutual exclusion, deadlock freedom, and starvation freedom. It improves the availability of the critical section (CS) by at least 154.94% and 113.36% for low and high loads of CS requests, respectively, compared to other permission-based algorithms. Furthermore, it improves response time by up to 90.69% for high load and 75.21% for low load of CS requests. It reduces the number of messages from n to 2 in the best case and from 3n/2 to n in the worst case. MDA is resilient to the transient network partitioning that normally occurs due to the failure of nodes or links.
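The message counts quoted in this abstract can be tabulated for a few sizes. The following is only an arithmetic sketch of the formulas stated above (4j-2 versus 3j, n versus 2, and 3n/2 versus n); the function names are hypothetical, and reading n as the number of nodes is an assumption, since the abstract does not define it.

```python
def dad_messages(j):
    """Message counts quoted for DAD: (before, after) = (4j - 2, 3j).

    j is the number of nodes in the partition, as stated in the abstract.
    """
    return 4 * j - 2, 3 * j


def mda_messages(n):
    """Message counts quoted for MDA.

    Returns ((best_before, best_after), (worst_before, worst_after)).
    Assumption: n is the number of nodes; the abstract does not say so.
    """
    best = (n, 2)
    worst = (3 * n / 2, n)
    return best, worst


if __name__ == "__main__":
    for size in (4, 10, 50):
        before, after = dad_messages(size)
        best, worst = mda_messages(size)
        print(f"j=n={size}: DAD {before} -> {after}, "
              f"MDA best {best[0]} -> {best[1]}, worst {worst[0]} -> {worst[1]}")
```

For ten nodes, for instance, the quoted DAD formulas give 38 messages before and 30 after.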
How Hard is Asynchronous Weight Reassignment? (Extended Version)
The performance of distributed storage systems deployed on wide-area networks
can be improved using weighted (majority) quorum systems instead of their
regular variants due to the heterogeneous performance of the nodes. A
significant limitation of weighted majority quorum systems lies in their
dependence on static weights, which are inappropriate for systems subject to
the dynamic nature of networked environments. To overcome this limitation, such
quorum systems require mechanisms for reassigning weights over time according
to the performance variations. We study the problem of node weight reassignment
in asynchronous systems with a static set of servers and static fault
threshold. We prove that solving such a problem is as hard as solving
consensus, i.e., it cannot be implemented in asynchronous failure-prone
distributed systems. This result is somewhat counter-intuitive, given the
recent results showing that two related problems -- replica set reconfiguration
and asset transfer -- can be solved in asynchronous systems. Inspired by these
problems, we present two versions of the problem that contain restrictions on
the weights of servers and the way they are reassigned. We propose a protocol
to implement one of the restricted problems in asynchronous systems. As a case
study, we construct a dynamic-weighted atomic storage based on such a protocol.
We also discuss the relationship between weight reassignment and asset transfer
problems and compare our dynamic-weighted atomic storage with reconfigurable
atomic storage.
Comment: This is the extended version of a paper to appear at the 43rd IEEE International Conference on Distributed Computing Systems (ICDCS 2023)
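As background for the abstract above, a weighted majority quorum can be sketched in a few lines. This is a generic illustration of the textbook definition (a set of servers is a quorum when it holds more than half of the total weight), not the protocol proposed in the paper; the server names, the weights, and both function names are hypothetical, and fault thresholds and weight reassignment are not modelled.

```python
from itertools import combinations

# Hypothetical static weights: one fast server and four slower ones.
EXAMPLE_WEIGHTS = {"s1": 3, "s2": 1, "s3": 1, "s4": 1, "s5": 1}


def is_weighted_quorum(weights, members):
    """True if `members` hold strictly more than half of the total weight."""
    total = sum(weights.values())
    held = sum(weights[server] for server in members)
    return 2 * held > total


def all_quorums_intersect(weights):
    """Brute-force check that every pair of weighted quorums shares a server."""
    servers = sorted(weights)
    quorums = [
        set(subset)
        for size in range(1, len(servers) + 1)
        for subset in combinations(servers, size)
        if is_weighted_quorum(weights, subset)
    ]
    return all(a & b for a, b in combinations(quorums, 2))


if __name__ == "__main__":
    print(is_weighted_quorum(EXAMPLE_WEIGHTS, {"s1", "s2"}))
    print(is_weighted_quorum(EXAMPLE_WEIGHTS, {"s2", "s3", "s4"}))
    print(all_quorums_intersect(EXAMPLE_WEIGHTS))
```

With these example weights, two servers that include the heavy one already form a quorum, while three light servers do not. This is why weighting can help when node performance is heterogeneous, and why fixed weights become a limitation once performance shifts.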
Prinzipien der Replikationskontrolle in verteilten Datenbanksystemen?
In principle, data replication can provide faster access times and an arbitrarily high degree of fault tolerance in distributed database systems. On the other hand, replication increases the risk of inconsistencies and the cost of update operations. To resolve this trade-off, many different replication schemes have been proposed in the literature. This survey article describes the principles of replication control underlying the individual schemes. To this end, it explains the problems that result from the use of copies, derives from them criteria for a subsequent classification, and then presents selected replication schemes in more detail.
Object replication in a distributed system
PhD Thesis
A number of techniques have been proposed for the construction of fault-tolerant
applications. One of these techniques is to replicate vital system resources so that if one
copy fails sufficient copies may still remain operational to allow the application to
continue to function. Interactions with replicated resources are inherently more complex
than non-replicated interactions, and hence some form of replication transparency is
necessary. This may be achieved by employing replica consistency protocols to mask replica
failures and maintain consistency of state between functioning replicas.
To achieve consistency between replicas it is necessary to ensure that all replicas
receive the same set of messages in the same order, despite failures at the senders and
receivers. This can be accomplished by making use of order preserving reliable
communication protocols. However, we shall show how it can be more efficient to use
unordered reliable communication and to impose ordering at the application level, by
making use of syntactic knowledge of the application.
This thesis develops techniques for replicating objects: in general this is harder than
replicating data, as objects (which can contain data) can contain calls on other objects.
Handling replicated objects is essentially the same as handling replicated computations,
and presents more problems than simply replicating data. We shall use the concept of the
object to provide transparent replication to users: a user will interact with only a single
object interface which hides the fact that the object is actually replicated.
The main aspects of the replication scheme presented in this thesis have been fully
implemented and tested. This includes the design and implementation of a replicated
object invocation protocol and the algorithms which ensure that (replicated) atomic
actions can manipulate replicated objects.
Research Studentship, Science and Engineering Research Council.
Esprit Project 2267 (Integrated Systems Architecture)
Resilient Threat-Adaptive Consensus
Malicious and coordinated attacks are happening increasingly often and have targeted critical systems such as nuclear plants, public transportation systems, hospitals and governments. Because critical infrastructures must be resilient against
advanced and persistent threats, a common architecture of choice to mitigate those
hazards is the distributed system, more specifically the Byzantine fault-tolerant state-machine replicated (BFT-SMR) system. In this PhD thesis, we propose solutions
to critical challenges in the field of distributed systems, focusing on creating adaptive algorithms and protocols to strengthen the resilience of state-of-the-art systems.
The first challenge is how to ensure the security and reliability of critical infrastructures against advanced and persistent attacks at various threat levels. To address
this, we present ThreatAdaptive, a novel BFT-SMR protocol that automatically
adapts to changes in the anticipated and observed threats in an unattended manner. ThreatAdaptive proactively reconfigures the system to cope with the faults
that one needs to expect given the imminent threats. It thereby avoids the limitations of traditional BFT-SMR protocols, which by design require either a high
fault threshold or a trusted external reconfiguration entity. Our results show that
ThreatAdaptive meets the latency and throughput of BFT baselines while adapting
30% faster than previous methods, providing a more efficient and secure solution
for critical infrastructures. The second challenge is how to optimize the performance of a distributed system in the presence of unreliable nodes. To address this,
we propose a method for automatic reconfiguration based on a 3D virtual coordinate system (VCS) that allows correct nodes to detect and eliminate inconsistent
latencies and protect system performance against Byzantine attacks. We evaluate
our reconfiguration baseline, Geometric, on three real-world networking datasets
and show that it protects performance up to 78% better than previous solutions
and provides the closest representation of real-world connections. Our proposed
solutions provide a more reliable and secure approach to automatic reconfiguration
in distributed systems. Overall, this thesis makes a significant contribution to the
field of distributed systems by proposing novel solutions to two critical challenges:
ensuring the security and reliability of critical infrastructures and optimizing the
performance of distributed systems in the presence of unreliable nodes
Rule based replication strategy for heterogeneous, autonomous information systems
In the rule-based replication strategy RegRess, write and read accesses to the replicas are coordinated by means of replication rules. These rules are formulated in the purpose-built rule language RRML, which can take both business and technical requirements into account. Before every access to the replicas, an inference over these rules is carried out to determine the replicas concerned. This lets RegRess realize a wide range of consistency behaviours; in particular, temporary inconsistencies are tolerated. A rule set containing the rules specified for a given use case forms the configuration of RegRess. Because system states can be taken into account in the rules, the behaviour can be adapted at run time. RegRess is therefore a configurable, adaptive replication strategy. RegRess is realized by the replication manager KARMA, which contains a rule interpreter for RRML.