59 research outputs found

    Scalability approaches for causal multicast: a survey

    Get PDF
    The final publication is available at Springer via http://dx.doi.org/10.1007/s00607-015-0479-0Many distributed services need to be scalable: internet search, electronic commerce, e-government... In order to achieve scalability, high availability and fault tolerance, such applications rely on replicated components. Because of the dynamics of growth and volatility of customer markets, applications need to be hosted by adaptive, highly scalable systems. In particular, the scalability of the reliable multicast mechanisms used for supporting the consistency of replicas is of crucial importance. Reliable multicast might propagate updates in a pre-determined order (e.g., FIFO, total or causal). Since total order needs more communication rounds than causal order, the latter appears to be the preferable candidate for achieving multicast scalability, although the consistency guarantees based on causal order are weaker than those of total order. This paper provides a historical survey of different scalability approaches for reliable causal multicast protocols.This work was supported by European Regional Development Fund (FEDER) and Ministerio de Economia y Competitividad (MINECO) under research Grant TIN2012-37719-C03-01.Juan MarĂ­n, RD.; Decker, H.; ArmendĂĄriz ĂĂ±igo, JE.; Bernabeu AubĂĄn, JM.; Muñoz EscoĂ­, FD. (2016). Scalability approaches for causal multicast: a survey. Computing. 98(9):923-947. https://doi.org/10.1007/s00607-015-0479-0S923947989Adly N, Nagi M (1995) Maintaining causal order in large scale distributed systems using a logical hierarchy. In: IASTED Intnl Conf on Appl Inform, pp 214–219Aguilera MK, Chen W, Toueg S (1997) Heartbeat: a timeout-free failure detector for quiescent reliable communication. In: 11th Intnl Wshop on Distrib Alg (WDAG), SaarbrĂŒcken, pp 126–140Almeida JB, Almeida PS, Baquero C (2004) Bounded version vectors. In: 18th Intnl Conf Distrib Comput (DISC), Amsterdam, pp 102–116Almeida PS, Baquero C, Fonte V (2008) Interval tree clocks. In: 12th Intnl Conf Distrib Syst (OPODIS), Luxor, pp 259–274Almeida S, LeitĂŁo J, Rodrigues LET (2013) ChainReaction: a causal+ consistent datastore based on chain replication. In: 8th EuroSys Conf, Czech Republic, pp 85–98Álvarez A, ArĂ©valo S, Cholvi V, FernĂĄndez A, JimĂ©nez E (2008) On the interconnection of message passing systems. Inf Process Lett 105(6):249–254Amir Y, Stanton J (1998) The Spread wide area group communication system. Tech. rep., CDNS-98-4, The Center for Networking and Distributed Systems, The Johns Hopkins UnivAmir Y, Dolev D, Kramer S, Malki D (1992) Transis: a communication subsystem for high availability. In: 22nd Intnl Symp Fault-Tolerant Comp (FTCS), Boston, pp 76–84Anastasi G, Bartoli A, Spadoni F (2001) A reliable multicast protocol for distributed mobile systems: design and evaluation. IEEE Trans Parallel Distrib Syst 12(10):1009–1022Bailis P, Ghodsi A, Hellerstein JM, Stoica I (2013) Bolt-on causal consistency. In: Intnl Conf Mgmnt Data (SIGMOD), New York, pp 761–772Baldoni R, Raynal M, Prakash R, Singhal M (1996) Broadcast with time and causality constraints for multimedia applications. In: 22nd Intnl Euromicro Conf, Prague, pp 617–624Baldoni R, Friedman R, van Renesse R (1997) The hierarchical daisy architecture for causal delivery. In: 17th Intnl Conf Distrib Comput Syst (ICDCS), Maryland, pp 570–577Ban B (2002) JGroups—a toolkit for reliable multicast communication. http://www.jgroups.orgBaquero C, Almeida PS, Shoker A (2014) Making operation-based CRDTs operation-based. In: 14th Intnl Conf Distrib Appl Interop Syst (DAIS), Berlin, pp 126–140Benslimane A, Abouaissa A (2002) Dynamical grouping model for distributed real time causal ordering. Comput Commun 25:288–302Birman KP, Joseph TA (1987) Reliable communication in the presence of failures. ACM Trans Comput Syst 5(1):47–76Birman KP, Schiper A, Stephenson P (1991) Lightweigt causal and atomic group multicast. ACM Trans Comput Syst 9(3):272–314Cachin C, Guerraoui R, Rodrigues LET (2011) Introduction to reliable and secure distributed programming, 2nd edn. Springer, BerlinChandra P, Gambhire P, Kshemkalyani AD (2004) Performance of the optimal causal multicast algorithm: a statistical analysis. IEEE Trans Parall Distr 15(1):40–52Chandra TD, Toueg S (1996) Unreliable failure detectors for reliable distributed systems. J ACM 43(2):225–267de Juan-MarĂ­n R, Cholvi V, JimĂ©nez E, Muñoz-EscoĂ­ FD (2009) Parallel interconnection of broadcast systems with multiple FIFO channels. In: 11th Intnl Symp on Distrib Obj, Middleware and Appl (DOA), Vilamoura, LNCS, vol 5870, pp 449–466DĂ©fago X, Schiper A, UrbĂĄn P (2004) Total order broadcast and multicast algorithms: taxonomy and survey. ACM Comput Surv 36(4):372–421Demers AJ, Greene DH, Hauser C, Irish W, Larson J, Shenker S, Sturgis HE, Swinehart DC, Terry DB (1987) Epidemic algorithms for replicated database maintenance. In: 6th ACM Symp on Princ of Distrib Comput (PODC), Canada, pp 1–12Du J, Elnikety S, Roy A, Zwaenepoel W (2013) Orbe: scalable causal consistency using dependency matrices and physical clocks. In: ACM Symp on Cloud Comput (SoCC), Santa Clara, pp 11:1–11:14FernĂĄndez A, JimĂ©nez E, Cholvi V (2000) On the interconnection of causal memory systems. In: 19th Annual ACM Symp on Princ of Distrib Comput (PODC), Portland, pp 163–170Fidge CJ (1988) Timestamps in message-passing systems that preserve the partial ordering. In: 11th Australian Comput Conf, pp 56–66Friedman R, Vitenberg R, Chockler G (2003) On the composability of consistency conditions. Inf Process Lett 86(4):169–176Gilbert S, Lynch N (2002) Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News 33(2):51–59Gray J, Helland P, O’Neil PE, Shasha D (1996) The dangers of replication and a solution. In: SIGMOD Conf, pp 173–182Hadzilacos V, Toueg S (1993) Fault-tolerant broadcasts and related problems. In: Mullender S (ed) Distributed systems, chap 5, 2nd edn. ACM Press, pp 97–145Johnson S, Jahanian F, Shah J (1999) The inter-group router approach to scalable group composition. In: 19th Intnl Conf on Distrib Comput Syst (ICDCS), Austin, pp 4–14Kalantar MH, Birman KP (1999) Causally ordered multicast: the conservative approach. In: 19th Intnl Conf on Distrib Comput Syst (ICDCS), Austin, pp 36–44Kawanami S, Enokido T, Takizawa M (2004) A group communication protocol for scalable causal ordering. In: 18th Intnl Conf on Adv Inform Netw Appl (AINA), Fukuoka, pp 296–302Kawanami S, Nishimura T, Enokido T, Takizawa M (2005) A scalable group communication protocol with global clock. In: 19th Intnl Conf on Adv Inform Netw Appl (AINA), Taipei, pp 625–630Kshemkalyani AD, Singhal M (1998) Necessary and sufficient conditions on information for causal message ordering and their optimal implementation. Distrib Comput 11(2):91–111Kshemkalyani AD, Singhal M (2011) Distributed computing: principles, algorithms, and systems, 2nd edn. Cambridge University Press, New YorkLadin R, Liskov B, Shrira L, Ghemawat S (1992) Providing high availability using lazy replication. ACM Trans Comput Syst 10(4):360–391Lamport L (1978) Time, clocks, and the ordering of events in a distributed system. Commun ACM 21(7):558–565Laumay P, Bruneton E, de Palma N, Krakowiak S (2001) Preserving causality in a scalable message-oriented middleware. In: Intnl Conf on Distrib Syst Platf (Middleware), pp 311–328Liu N, Liu M, Cao J, Chen G, Lou W (2010) When transportation meets communication: V2P over VANETs. In: 30th Intnl Conf Distrib Comput Syst (ICDCS), GenovaLwin CH, Mohanty H, Ghosh RK (2004) Causal ordering in event notification service systems for mobile users. In: Intnl Conf Inform Tech: Coding Comput (ITCC), Las Vegas, pp 735–740Mahajan P, Alvisi L, Dahlin M (2011) Consistency, availability and covergence. Tech. rep., UTCS TR-11-22, The University of Texas at AustinMatos M, Sousa A, Pereira J, Oliveira R, Deliot E, Murray P (2009) CLON: overlay networks and gossip protocols for cloud environments. In: 11th Intnl Symp on Dist Obj, Middleware and Appl (DOA), Vilamoura, LNCS, vol 5870, pp 549–566Mattern F (1989) Virtual time and global states of distributed systems. In: Parallel and distributed algorithms, North-Holland, pp 215–226Mattern F, FĂŒnfrocken S (1994) A non-blocking lightweight implementation of causal order message delivery. Lect Notes Comput Sci 938:197–213Meldal S, Sankar S, Vera J (1991) Exploiting locality in maintaining potential causality. In: 10th ACM Symp on Princ of Distrib Comp (PODC), Montreal, pp 231–239Meling H, Montresor A, Helvik BE, Babaoglu Ö (2008) Jgroup/ARM: a distributed object group platform with autonomous replication management. Softw Pract Exp 38(9):885–923Mosberger D (1993) Memory consistency models. Oper Syst Rev 27(1):18–26MostĂ©faoui A, Raynal M (1993) Causal multicast in overlapping groups: towards a low cost approach. In: 4th Intnl Wshop on Future Trends of Distrib Comp Syst (FTDCS), Lisbon, pp 136–142MostĂ©faoui A, Raynal M, Travers C, Patterson S, Agrawal D, El Abbadi A (2005) From static distributed systems to dynamic systems. In: 24th Symp on Rel Distrib Syst (SRDS), Orlando, pp 109–118Nishimura T, Hayashibara N, Takizawa M, Enokido T (2005) Causally ordered delivery with global clock in hierarchical group. In: ICPADS (2), Fukuoka, pp 560–564Parker DS Jr, Popek GJ, Rudisin G, Stoughton A, Walker BJ, Walton E, Chow JM, Edwards DA, Kiser S, Kline CS (1983) Detection of mutual inconsistency in distributed systems. IEEE Trans Softw Eng 9(3):240–247Pascual-Miret L (2014) Consistency models in modern distributed systems. An approach to eventual consistency. Master’s thesis, Depto. de Sistemas InformĂĄticos y ComputaciĂłn, Univ. PolitĂšcnica de ValĂšnciaPascual-Miret L, GonzĂĄlez de MendĂ­vil JR, BernabĂ©u-AubĂĄn JM, Muñoz-EscoĂ­ FD (2015) Widening CAP consistency. Tech. rep., IUMTI-SIDI-2015/003, Univ. PolitĂšcnica de ValĂšncia, ValenciaPeterson LL, Buchholz NC, Schlichting RD (1989) Preserving and using context information in interprocess communication. ACM Trans Comput Syst 7(3):217–246Pomares HernĂĄndez S, Fanchon J, Drira K, Diaz M (2001) Causal broadcast protocol for very large group communication systems. In: 5th Intnl Conf on Princ of Distrib Syst (OPODIS), Manzanillo, pp 175–188Prakash R, Baldoni R (2004) Causality and the spatial-temporal ordering in mobile systems. Mobile Netw Appl 9(5):507–516Prakash R, Raynal M, Singhal M (1997) An adaptive causal ordering algorithm suited to mobile computing environments. J Parallel Distrib Comput 41(2):190–204Raynal M, Schiper A, Toueg S (1991) The causal ordering abstraction and a simple way to implement it. Inf Process Lett 39(6):343–350Rodrigues L, VerĂ­ssimo P (1995a) Causal separators and topological timestamping: An approach to support causal multicast in large-scale systems. Tech. Rep. AR-05/95, Instituto de Engenharia de Sistemas e Computadores (INESC), LisbonRodrigues L, VerĂ­ssimo P (1995b) Causal separators for large-scale multicast communication. In: 15th Intnl Conf on Distrib Comput Syst (ICDCS), Vancouver, pp 83–91Schiper A, Eggli J, Sandoz A (1989) A new algorithm to implement causal ordering. In: 3rd Intnl Wshop on Distrib Alg (WDAG), Nice, pp 219–232Schiper N, Pedone F (2010) Fast, flexible and highly resilient genuine FIFO and causal multicast algorithms. In: 25th ACM Symp on Applied Comp (SAC), Sierre, pp 418–422Shapiro M, Preguiça NM, Baquero C, Zawirski M (2011) Convergent and commutative replicated data types. Bull EATCS 104:67–88Shen M, Kshemkalyani AD, Hsu TY (2015) Causal consistency for geo-replicated cloud storage under partial replication. In: Intnl Paral Distrib Proces Symp (IPDPS) Wshop, Hyderabad, pp 509–518Singhal M, Kshemkalyani AD (1992) An efficient implementation of vector clocks. Inf Process Lett 43(1):47–52Sotomayor B, Montero RS, Llorente IM, Foster IT (2009) Virtual infrastructure management in private and hybrid clouds. IEEE Internet Comput 13(5):14–22Stephenson P (1991) Fast ordered multicasts. PhD thesis, Dept. of Comp. Sc., Cornell Univ., IthacaStonebraker M (1986) The case for shared nothing. IEEE Database Eng Bull 9(1):4–9Vogels W (2009) Eventually consistent. Commun ACM 52(1):40–44Wischhof L, Ebner A, Rohling H (2005) Information dissemination in self-organizing intervehicle networks. IEEE Trans Intell Transp 6(1):90–101Yavatkar R (1992) MCP: a protocol for coordination and temporal synchronization in multimedia collaborative applications. In: 12th Intnl Conf on Distrib Comput Syst (ICDCS), Yokohama, pp 606–613Yen LH, Huang TL, Hwang SY (1997) A protocol for causally ordered message delivery in mobile computing systems. Mobile Netw Appl 2(4):365–372Zawirski M, Preguiça N, Duarte S, Bieniusa A, Balegas V, Shapiro M (2015) Write fast, read in the past: causal consistency for client-side applications. In: 16th Intnl Middleware Conf, VancouverZhou S, Cai W, Turner SJ, Lee BS, Wei J (2007) Critical causal order of events in distributed virtual environments. ACM Trans Mult Comp Commun Appl 3(3):1

    Relaxed Queues and Stacks from Read/Write Operations

    Get PDF
    Considering asynchronous shared memory systems in which any number of processes may crash, this work identifies and formally defines relaxations of queues and stacks that can be non-blocking or wait-free while being implemented using only read/write operations. Set-linearizability and Interval-linearizability are used to specify the relaxations formally, and precisely identify the subset of executions which preserve the original sequential behavior. The relaxations allow for an item to be returned more than once by different operations, but only in case of concurrency; we call such a property multiplicity. The stack implementation is wait-free, while the queue implementation is non-blocking. Interval-linearizability is used to describe a queue with multiplicity, with the additional relaxation that a dequeue operation can return weak-empty, which means that the queue might be empty. We present a read/write wait-free interval-linearizable algorithm of a concurrent queue. As far as we know, this work is the first that provides formalizations of the notions of multiplicity and weak-emptiness, which can be implemented on top of read/write registers only

    Distributed eventual leader election in the crash-recovery and general omission failure models.

    Get PDF
    102 p.Distributed applications are present in many aspects of everyday life. Banking, healthcare or transportation are examples of such applications. These applications are built on top of distributed systems. Roughly speaking, a distributed system is composed of a set of processes that collaborate among them to achieve a common goal. When building such systems, designers have to cope with several issues, such as different synchrony assumptions and failure occurrence. Distributed systems must ensure that the delivered service is trustworthy.Agreement problems compose a fundamental class of problems in distributed systems. All agreement problems follow the same pattern: all processes must agree on some common decision. Most of the agreement problems can be considered as a particular instance of the Consensus problem. Hence, they can be solved by reduction to consensus. However, a fundamental impossibility result, namely (FLP), states that in an asynchronous distributed system it is impossible to achieve consensus deterministically when at least one process may fail. A way to circumvent this obstacle is by using unreliable failure detectors. A failure detector allows to encapsulate synchrony assumptions of the system, providing (possibly incorrect) information about process failures. A particular failure detector, called Omega, has been shown to be the weakest failure detector for solving consensus with a majority of correct processes. Informally, Omega lies on providing an eventual leader election mechanism

    Totally Ordered Broadcast and Multicast Algorithms: A Comprehensive Survey

    Get PDF
    Total order multicast algorithms constitute an important class of problems in distributed systems, especially in the context of fault-tolerance. In short, the problem of total order multicast consists in sending messages to a set of processes, in such a way that all messages are delivered by all correct destinations in the same order. However, the huge amount of literature on the subject and the plethora of solutions proposed so far make it difficult for practitioners to select a solution adapted to their specific problem. As a result, naive solutions are often used while better solutions are ignored. This paper proposes a classification of total order multicast algorithms based on the ordering mechanism of the algorithms, and describes a set of common characteristics (e.g., assumptions, properties) with which to evaluate them. In this classification, more than fifty total order broadcast and multicast algorithms are surveyed. The presentation includes asynchronous algorithms as well as algorithms based on the more restrictive synchronous model. Fault-tolerance issues are also considered as the paper studies the properties and behavior of the different algorithms with respect to failures

    Agreement-related problems:from semi-passive replication to totally ordered broadcast

    Get PDF
    Agreement problems constitute a fundamental class of problems in the context of distributed systems. All agreement problems follow a common pattern: all processes must agree on some common decision, the nature of which depends on the specific problem. This dissertation mainly focuses on three important agreements problems: Replication, Total Order Broadcast, and Consensus. Replication is a common means to introduce redundancy in a system, in order to improve its availability. A replicated server is a server that is composed of multiple copies so that, if one copy fails, the other copies can still provide the service. Each copy of the server is called a replica. The replicas must all evolve in manner that is consistent with the other replicas. Hence, updating the replicated server requires that every replica agrees on the set of modifications to carry over. There are two principal replication schemes to ensure this consistency: active replication and passive replication. In Total Order Broadcast, processes broadcast messages to all processes. However, all messages must be delivered in the same order. Also, if one process delivers a message m, then all correct processes must eventually deliver m. The problem of Consensus gives an abstraction to most other agreement problems. All processes initiate a Consensus by proposing a value. Then, all processes must eventually decide the same value v that must be one of the proposed values. These agreement problems are closely related to each other. For instance, Chandra and Toueg [CT96] show that Total Order Broadcast and Consensus are equivalent problems. In addition, Lamport [Lam78] and Schneider [Sch90] show that active replication needs Total Order Broadcast. As a result, active replication is also closely related to the Consensus problem. The first contribution of this dissertation is the definition of the semi-passive replication technique. Semi-passive replication is a passive replication scheme based on a variant of Consensus (called Lazy Consensus and also defined here). From a conceptual point of view, the result is important as it helps to clarify the relation between passive replication and the Consensus problem. In practice, this makes it possible to design systems that react more quickly to failures. The problem of Total Order Broadcast is well-known in the field of distributed systems and algorithms. In fact, there have been already more than fifty algorithms published on the problem so far. Although quite similar, it is difficult to compare these algorithms as they often differ with respect to their actual properties, assumptions, and objectives. The second main contribution of this dissertation is to define five classes of total order broadcast algorithms, and to relate existing algorithms to those classes. The third contribution of this dissertation is to compare the expected performance of the various classes of total order broadcast algorithms. To achieve this goal, we define a set of metrics to predict the performance of distributed algorithms

    Self-Stabilization in the Distributed Systems of Finite State Machines

    Get PDF
    The notion of self-stabilization was first proposed by Dijkstra in 1974 in his classic paper. The paper defines a system as self-stabilizing if, starting at any, possibly illegitimate, state the system can automatically adjust itself to eventually converge to a legitimate state in finite amount of time and once in a legitimate state it will remain so unless it incurs a subsequent transient fault. Dijkstra limited his attention to a ring of finite-state machines and provided its solution for self-stabilization. In the years following his introduction, very few papers were published in this area. Once his proposal was recognized as a milestone in work on fault tolerance, the notion propagated among the researchers rapidly and many researchers in the distributed systems diverted their attention to it. The investigation and use of self-stabilization as an approach to fault-tolerant behavior under a model of transient failures for distributed systems is now undergoing a renaissance. A good number of works pertaining to self-stabilization in the distributed systems were proposed in the yesteryears most of which are very recent. This report surveys all previous works available in the literature of self-stabilizing systems

    Exact bounds for distributed graph colouring

    Get PDF
    A distributed system is a collection of networked autonomous processing units which must work in a cooperative manner. Currently, large-scale distributed systems, such as various telecommunication and computer networks, are abundant and used in a multitude of tasks. The field of distributed computing studies what can be computed efficiently in such systems. Distributed systems are usually modelled as graphs where nodes represent the processors and edges denote communication links between processors. This thesis concentrates on the computational complexity of the distributed graph colouring problem. The objective of the graph colouring problem is to assign a colour to each node in such a way that no two nodes connected by an edge share the same colour. In particular, it is often desirable to use only a small number of colours. This task is a fundamental symmetry-breaking primitive in various distributed algorithms. A graph that has been coloured in this manner using at most k different colours is said to be k-coloured. This work examines the synchronous message-passing model of distributed computation: every node runs the same algorithm, and the system operates in discrete synchronous communication rounds. During each round, a node can communicate with its neighbours and perform local computation. In this model, the time complexity of a problem is the number of synchronous communication rounds required to solve the problem. It is known that 3-colouring any k-coloured directed cycle requires at least Âœ(log* k - 3) communication rounds and is possible in Âœ(log* k + 7) communication rounds for all k ≄ 3. This work shows that for any k ≄ 3, colouring a k-coloured directed cycle with at most three colours is possible in Âœ(log* k + 3) rounds. In contrast, it is also shown that for some values of k, colouring a directed cycle with at most three colours requires at least Âœ(log* k + 1) communication rounds. Furthermore, in the case of directed rooted trees, reducing a k-colouring into a 3-colouring requires at least log* k + 1 rounds for some k and possible in log* k + 3 rounds for all k ≄ 3. The new positive and negative results are derived using computational methods, as the existence of distributed colouring algorithms corresponds to the colourability of so-called neighbourhood graphs. The colourability of these graphs is analysed using Boolean satisfiability (SAT) solvers. Finally, this thesis shows that similar methods are applicable in capturing the existence of distributed algorithms for other graph problems, such as the maximal matching problem
    • 

    corecore