4,429 research outputs found

    Scalability approaches for causal multicast: a survey

    The final publication is available at Springer via http://dx.doi.org/10.1007/s00607-015-0479-0.
    Many distributed services need to be scalable: internet search, electronic commerce, e-government... In order to achieve scalability, high availability and fault tolerance, such applications rely on replicated components. Because of the dynamics of growth and volatility of customer markets, applications need to be hosted by adaptive, highly scalable systems. In particular, the scalability of the reliable multicast mechanisms used for supporting the consistency of replicas is of crucial importance. Reliable multicast might propagate updates in a pre-determined order (e.g., FIFO, total or causal). Since total order needs more communication rounds than causal order, the latter appears to be the preferable candidate for achieving multicast scalability, although the consistency guarantees based on causal order are weaker than those of total order. This paper provides a historical survey of different scalability approaches for reliable causal multicast protocols.
    This work was supported by European Regional Development Fund (FEDER) and Ministerio de Economia y Competitividad (MINECO) under research Grant TIN2012-37719-C03-01.
    Juan Marín, RD.; Decker, H.; Armendáriz Íñigo, JE.; Bernabeu Aubán, JM.; Muñoz Escoí, FD. (2016). Scalability approaches for causal multicast: a survey. Computing. 98(9):923-947. https://doi.org/10.1007/s00607-015-0479-0
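
The causal delivery order that the survey abstract refers to can be illustrated with the classic vector-clock delivery rule. The following is a minimal sketch, not code from the survey: the class name `CausalReceiver`, the fixed group size, and the message names are assumptions made for illustration, and reliability (retransmission, failures) is left out.

```python
from typing import List, Tuple

Clock = List[int]  # one counter per group member


class CausalReceiver:
    """Buffers multicast messages and delivers them in causal order."""

    def __init__(self, group_size: int) -> None:
        # delivered[k] = number of messages from member k delivered so far
        self.delivered: Clock = [0] * group_size
        self.pending: List[Tuple[int, Clock, str]] = []
        self.log: List[str] = []  # payloads in delivery order

    def _deliverable(self, sender: int, ts: Clock) -> bool:
        # The message must be the next one from its sender, and every
        # message it causally depends on must already be delivered.
        for k, stamp in enumerate(ts):
            if k == sender:
                if stamp != self.delivered[k] + 1:
                    return False
            elif stamp > self.delivered[k]:
                return False
        return True

    def receive(self, sender: int, ts: Clock, payload: str) -> None:
        self.pending.append((sender, ts, payload))
        progress = True
        while progress:  # a delivery may unblock buffered messages
            progress = False
            for msg in list(self.pending):
                s, t, p = msg
                if self._deliverable(s, t):
                    self.pending.remove(msg)
                    self.delivered[s] += 1
                    self.log.append(p)
                    progress = True


# Member 0 sends m1; member 1 delivers m1 and then sends m2.
# Member 2 receives them in the wrong order.
receiver = CausalReceiver(group_size=3)
receiver.receive(1, [1, 1, 0], "m2")   # buffered: depends on m1
receiver.receive(0, [1, 0, 0], "m1")   # delivered, then m2 follows
print(receiver.log)
```

Each message carries one counter per group member, so timestamps grow linearly with the group size. That per-message cost is one reason scalability is a concern for causal multicast.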

    CRDTs: Consistency without concurrency control

    A CRDT is a data type whose operations commute when they are concurrent. Replicas of a CRDT eventually converge without any complex concurrency control. As an existence proof, we exhibit a non-trivial CRDT: a shared edit buffer called Treedoc. We outline the design, implementation and performance of Treedoc. We discuss how the CRDT concept can be generalised, and its limitations.
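
To make the commutativity property concrete, here is a toy edit buffer in which every operation can be applied in any order. It is a deliberately simplified sketch and not Treedoc itself: the class `ToyEditBuffer`, the fraction-based position identifiers, and the tombstone set are all assumptions chosen for brevity.

```python
from fractions import Fraction
from itertools import permutations
from typing import Dict, Iterable, Set, Tuple

# A position identifier: a dense position plus the replica name as tie-breaker.
PosId = Tuple[Fraction, str]
Op = Tuple[str, PosId, str]  # (kind, position identifier, character)


class ToyEditBuffer:
    """Edit buffer whose insert and delete operations commute."""

    def __init__(self) -> None:
        self.atoms: Dict[PosId, str] = {}   # every character ever inserted
        self.removed: Set[PosId] = set()    # tombstones for deleted positions

    def apply(self, op: Op) -> None:
        kind, pos, char = op
        if kind == "insert":
            self.atoms[pos] = char
        elif kind == "delete":
            # A tombstone works even if the delete arrives before the insert.
            self.removed.add(pos)
        else:
            raise ValueError(f"unknown operation: {kind}")

    def text(self) -> str:
        live = (p for p in sorted(self.atoms) if p not in self.removed)
        return "".join(self.atoms[p] for p in live)


def replay(ops: Iterable[Op]) -> str:
    """Apply the operations to a fresh replica and return its text."""
    buffer = ToyEditBuffer()
    for op in ops:
        buffer.apply(op)
    return buffer.text()


OPS = [
    ("insert", (Fraction(1, 2), "A"), "H"),
    ("insert", (Fraction(3, 4), "A"), "!"),
    ("insert", (Fraction(5, 8), "B"), "i"),  # concurrent insert between H and !
    ("delete", (Fraction(3, 4), "A"), ""),
]

# Every delivery order yields the same text, so replicas converge.
results = {replay(order) for order in permutations(OPS)}
print(results)
```

Because the state is a set of uniquely identified characters plus a set of tombstones, applying an operation never depends on what was applied before it. In this sketch, that is what lets replicas converge without locking or consensus.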

    VCube-PS: A Causal Broadcast Topic-based Publish/Subscribe System

    In this work we present VCube-PS, a topic-based Publish/Subscribe system built on top of a virtual hypercube-like topology. Membership information and published messages are broadcast to subscribers (members) of a topic group over dynamically built spanning trees rooted at the publisher. For a given topic, the delivery of published messages respects the causal order. VCube-PS was implemented on the PeerSim simulator, and experiments are reported including a comparison with the traditional Publish/Subscribe approach that employs a single rooted static spanning tree for message distribution. Results confirm the efficiency of VCube-PS in terms of scalability, latency, number and size of messages. Comment: Improved text and performance evaluation. Added proof for the algorithms (Section 3.4).
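
The idea of a spanning tree rooted at the publisher can be sketched on a plain hypercube with the textbook binomial-tree broadcast. This is not the VCube-PS algorithm, which the abstract does not specify; the function names and the absence of failures or membership changes are assumptions of the sketch.

```python
from typing import Dict, List


def children(node: int, root: int, dim: int) -> List[int]:
    """Nodes that `node` forwards to in the tree rooted at `root`."""
    rel = node ^ root           # identifier relative to the root
    first = rel.bit_length()    # only flip bits above the highest set bit
    return [(rel | (1 << k)) ^ root for k in range(first, dim)]


def spanning_tree(root: int, dim: int) -> Dict[int, List[int]]:
    """Build the whole tree for a hypercube with 2**dim nodes."""
    tree: Dict[int, List[int]] = {}
    frontier = [root]
    while frontier:
        node = frontier.pop()
        tree[node] = children(node, root, dim)
        frontier.extend(tree[node])
    return tree


# A different publisher yields a different tree over the same 8 nodes.
for publisher in (0, 5):
    print(publisher, spanning_tree(publisher, dim=3))
```

Rooting the tree at each publisher spreads the forwarding load across nodes, whereas a single static tree sends every message through the same root.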

    Fisheye Consistency: Keeping Data in Synch in a Georeplicated World

    Over the last thirty years, numerous consistency conditions for replicated data have been proposed and implemented. Popular examples of such conditions include linearizability (or atomicity), sequential consistency, causal consistency, and eventual consistency. These consistency conditions are usually defined independently from the computing entities (nodes) that manipulate the replicated data; i.e., they do not take into account how computing entities might be linked to one another, or geographically distributed. To address this gap, as a first contribution, this paper introduces the notion of proximity graph between computing nodes. If two nodes are connected in this graph, their operations must satisfy a strong consistency condition, while the operations invoked by other nodes are allowed to satisfy a weaker condition. The second contribution is the use of such a graph to provide a generic approach to the hybridization of data consistency conditions into the same system. We illustrate this approach on sequential consistency and causal consistency, and present a model in which all data operations are causally consistent, while operations by neighboring processes in the proximity graph are sequentially consistent. The third contribution of the paper is the design and the proof of a distributed algorithm based on this proximity graph, which combines sequential consistency and causal consistency (the resulting condition is called fisheye consistency). In doing so the paper not only extends the domain of consistency conditions, but provides a generic provably correct solution of direct relevance to modern georeplicated systems.
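
As a small illustration of the proximity-graph idea, the sketch below records which pairs of nodes are neighbors and reports which condition their operations would have to satisfy. It only models the graph, not the distributed algorithm from the paper; the node names, the undirected-edge representation, and the function `required_condition` are hypothetical.

```python
from typing import FrozenSet, Set

Edge = FrozenSet[str]


def make_graph(*pairs: tuple) -> Set[Edge]:
    """Build an undirected proximity graph from (node, node) pairs."""
    return {frozenset(pair) for pair in pairs}


def required_condition(graph: Set[Edge], p: str, q: str) -> str:
    """Neighbors need sequential consistency; all other pairs need causal."""
    return "sequential" if frozenset((p, q)) in graph else "causal"


# Hypothetical deployment: two nearby European sites and one remote site.
proximity = make_graph(("paris", "lyon"))

print(required_condition(proximity, "paris", "lyon"))   # neighbors
print(required_condition(proximity, "lyon", "paris"))   # edges are undirected
print(required_condition(proximity, "paris", "tokyo"))  # not neighbors
```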

    PaRiS: Causally Consistent Transactions with Non-blocking Reads and Partial Replication

    Geo-replicated data platforms are at the backbone of several large-scale online services. Transactional Causal Consistency (TCC) is an attractive consistency level for building such platforms. TCC avoids many anomalies of eventual consistency, eschews the synchronization costs of strong consistency, and supports interactive read-write transactions. Partial replication is another attractive design choice for building geo-replicated platforms, as it increases the storage capacity and reduces update propagation costs. This paper presents PaRiS, the first TCC system that supports partial replication and implements non-blocking parallel read operations, whose latency is paramount for the performance of read-intensive applications. PaRiS relies on a novel protocol to track dependencies, called Universal Stable Time (UST). By means of a lightweight background gossip process, UST identifies a snapshot of the data that has been installed by every DC in the system. Hence, transactions can consistently read from such a snapshot on any server in any replication site without having to block. Moreover, PaRiS requires only one timestamp to track dependencies and define transactional snapshots, thereby achieving resource efficiency and scalability. We evaluate PaRiS on a large-scale AWS deployment composed of up to 10 replication sites. We show that PaRiS scales well with the number of DCs and partitions, while being able to handle larger data-sets than existing solutions that assume full replication. We also demonstrate a performance gain of non-blocking reads vs. a blocking alternative (up to 1.47x higher throughput with 5.91x lower latency for read-dominated workloads and up to 1.46x higher throughput with 20.56x lower latency for write-heavy workloads).
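
One way to read the UST description is as a minimum over what every partition in every DC has installed so far. The sketch below follows that reading under strong simplifying assumptions: the gossip process is replaced by a single function call, timestamps are plain integers, and the names `universal_stable_time` and `read_at` are hypothetical rather than taken from PaRiS.

```python
from typing import Dict, List, Optional, Tuple


def universal_stable_time(installed: Dict[str, List[int]]) -> int:
    """Largest timestamp that every partition of every DC has installed.

    `installed[dc]` lists, per partition, the timestamp up to which that
    partition has applied all updates.
    """
    return min(min(partitions) for partitions in installed.values())


def read_at(versions: List[Tuple[int, str]], snapshot: int) -> Optional[str]:
    """Return the newest value whose timestamp is within the snapshot."""
    visible = [(ts, value) for ts, value in versions if ts <= snapshot]
    if not visible:
        return None
    return max(visible, key=lambda pair: pair[0])[1]


# Hypothetical state: three DCs with two partitions each.
installed = {"dc1": [12, 15], "dc2": [9, 14], "dc3": [11, 10]}
ust = universal_stable_time(installed)

# Versions of one key as (timestamp, value); the version at 13 is not yet stable.
versions_of_x = [(5, "a"), (9, "b"), (13, "c")]
print(ust, read_at(versions_of_x, ust))
```

Reading at or below this minimum never has to wait, because every server already holds all versions inside the snapshot. The trade-off is that reads may return slightly stale data.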

    Eventual Consistency: Origin and Support

    Eventual consistency is demanded nowadays in geo-replicated services that need to be highly scalable and available. According to the CAP constraints, when network partitions may arise, a distributed service should choose between being strongly consistent or being highly available. Since scalable services should be available, a relaxed consistency (while the network is partitioned) is the preferred choice. Eventual consistency is not a common data-centric consistency model, but only a state convergence condition to be added to a relaxed consistency model. There are still several aspects of eventual consistency that have not been analysed in depth in previous works: 1. which are the oldest replication proposals providing eventual consistency, 2. which replica consistency models provide the best basis for building eventually consistent services, 3. which mechanisms should be considered for implementing an eventually consistent service, and 4. which are the best combinations of those mechanisms for achieving different concrete goals. This paper provides some notes on these important topics.