4,429 research outputs found
Scalability approaches for causal multicast: a survey
The final publication is available at Springer via http://dx.doi.org/10.1007/s00607-015-0479-0Many distributed services need to be scalable: internet search,
electronic commerce, e-government... In order to
achieve scalability, high availability and fault tolerance, such
applications rely on replicated components. Because of the dynamics
of growth and volatility of customer markets, applications need to be
hosted by adaptive, highly scalable systems. In particular, the
scalability of the reliable multicast mechanisms used for supporting
the consistency of replicas is of crucial importance. Reliable
multicast might propagate updates in a pre-determined order (e.g.,
FIFO, total or causal). Since total order needs more communication
rounds than causal order, the latter appears to be the preferable
candidate for achieving multicast scalability, although the
consistency guarantees based on causal order are weaker than those of
total order. This paper provides a historical survey of different
scalability approaches for reliable causal multicast protocols.This work was supported by European Regional Development Fund (FEDER) and Ministerio de Economia y Competitividad (MINECO) under research Grant TIN2012-37719-C03-01.Juan MarĂn, RD.; Decker, H.; ArmendĂĄriz Ăñigo, JE.; Bernabeu AubĂĄn, JM.; Muñoz EscoĂ, FD. (2016). Scalability approaches for causal multicast: a survey. Computing. 98(9):923-947. https://doi.org/10.1007/s00607-015-0479-0S923947989Adly N, Nagi M (1995) Maintaining causal order in large scale distributed systems using a logical hierarchy. In: IASTED Intnl Conf on Appl Inform, pp 214â219Aguilera MK, Chen W, Toueg S (1997) Heartbeat: a timeout-free failure detector for quiescent reliable communication. In: 11th Intnl Wshop on Distrib Alg (WDAG), SaarbrĂŒcken, pp 126â140Almeida JB, Almeida PS, Baquero C (2004) Bounded version vectors. In: 18th Intnl Conf Distrib Comput (DISC), Amsterdam, pp 102â116Almeida PS, Baquero C, Fonte V (2008) Interval tree clocks. In: 12th Intnl Conf Distrib Syst (OPODIS), Luxor, pp 259â274Almeida S, LeitĂŁo J, Rodrigues LET (2013) ChainReaction: a causal+ consistent datastore based on chain replication. In: 8th EuroSys Conf, Czech Republic, pp 85â98Ălvarez A, ArĂ©valo S, Cholvi V, FernĂĄndez A, JimĂ©nez E (2008) On the interconnection of message passing systems. Inf Process Lett 105(6):249â254Amir Y, Stanton J (1998) The Spread wide area group communication system. Tech. rep., CDNS-98-4, The Center for Networking and Distributed Systems, The Johns Hopkins UnivAmir Y, Dolev D, Kramer S, Malki D (1992) Transis: a communication subsystem for high availability. In: 22nd Intnl Symp Fault-Tolerant Comp (FTCS), Boston, pp 76â84Anastasi G, Bartoli A, Spadoni F (2001) A reliable multicast protocol for distributed mobile systems: design and evaluation. IEEE Trans Parallel Distrib Syst 12(10):1009â1022Bailis P, Ghodsi A, Hellerstein JM, Stoica I (2013) Bolt-on causal consistency. In: Intnl Conf Mgmnt Data (SIGMOD), New York, pp 761â772Baldoni R, Raynal M, Prakash R, Singhal M (1996) Broadcast with time and causality constraints for multimedia applications. In: 22nd Intnl Euromicro Conf, Prague, pp 617â624Baldoni R, Friedman R, van Renesse R (1997) The hierarchical daisy architecture for causal delivery. In: 17th Intnl Conf Distrib Comput Syst (ICDCS), Maryland, pp 570â577Ban B (2002) JGroupsâa toolkit for reliable multicast communication. http://www.jgroups.orgBaquero C, Almeida PS, Shoker A (2014) Making operation-based CRDTs operation-based. In: 14th Intnl Conf Distrib Appl Interop Syst (DAIS), Berlin, pp 126â140Benslimane A, Abouaissa A (2002) Dynamical grouping model for distributed real time causal ordering. Comput Commun 25:288â302Birman KP, Joseph TA (1987) Reliable communication in the presence of failures. ACM Trans Comput Syst 5(1):47â76Birman KP, Schiper A, Stephenson P (1991) Lightweigt causal and atomic group multicast. ACM Trans Comput Syst 9(3):272â314Cachin C, Guerraoui R, Rodrigues LET (2011) Introduction to reliable and secure distributed programming, 2nd edn. Springer, BerlinChandra P, Gambhire P, Kshemkalyani AD (2004) Performance of the optimal causal multicast algorithm: a statistical analysis. IEEE Trans Parall Distr 15(1):40â52Chandra TD, Toueg S (1996) Unreliable failure detectors for reliable distributed systems. J ACM 43(2):225â267de Juan-MarĂn R, Cholvi V, JimĂ©nez E, Muñoz-EscoĂ FD (2009) Parallel interconnection of broadcast systems with multiple FIFO channels. In: 11th Intnl Symp on Distrib Obj, Middleware and Appl (DOA), Vilamoura, LNCS, vol 5870, pp 449â466DĂ©fago X, Schiper A, UrbĂĄn P (2004) Total order broadcast and multicast algorithms: taxonomy and survey. ACM Comput Surv 36(4):372â421Demers AJ, Greene DH, Hauser C, Irish W, Larson J, Shenker S, Sturgis HE, Swinehart DC, Terry DB (1987) Epidemic algorithms for replicated database maintenance. In: 6th ACM Symp on Princ of Distrib Comput (PODC), Canada, pp 1â12Du J, Elnikety S, Roy A, Zwaenepoel W (2013) Orbe: scalable causal consistency using dependency matrices and physical clocks. In: ACM Symp on Cloud Comput (SoCC), Santa Clara, pp 11:1â11:14FernĂĄndez A, JimĂ©nez E, Cholvi V (2000) On the interconnection of causal memory systems. In: 19th Annual ACM Symp on Princ of Distrib Comput (PODC), Portland, pp 163â170Fidge CJ (1988) Timestamps in message-passing systems that preserve the partial ordering. In: 11th Australian Comput Conf, pp 56â66Friedman R, Vitenberg R, Chockler G (2003) On the composability of consistency conditions. Inf Process Lett 86(4):169â176Gilbert S, Lynch N (2002) Brewerâs conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News 33(2):51â59Gray J, Helland P, OâNeil PE, Shasha D (1996) The dangers of replication and a solution. In: SIGMOD Conf, pp 173â182Hadzilacos V, Toueg S (1993) Fault-tolerant broadcasts and related problems. In: Mullender S (ed) Distributed systems, chap 5, 2nd edn. ACM Press, pp 97â145Johnson S, Jahanian F, Shah J (1999) The inter-group router approach to scalable group composition. In: 19th Intnl Conf on Distrib Comput Syst (ICDCS), Austin, pp 4â14Kalantar MH, Birman KP (1999) Causally ordered multicast: the conservative approach. In: 19th Intnl Conf on Distrib Comput Syst (ICDCS), Austin, pp 36â44Kawanami S, Enokido T, Takizawa M (2004) A group communication protocol for scalable causal ordering. In: 18th Intnl Conf on Adv Inform Netw Appl (AINA), Fukuoka, pp 296â302Kawanami S, Nishimura T, Enokido T, Takizawa M (2005) A scalable group communication protocol with global clock. In: 19th Intnl Conf on Adv Inform Netw Appl (AINA), Taipei, pp 625â630Kshemkalyani AD, Singhal M (1998) Necessary and sufficient conditions on information for causal message ordering and their optimal implementation. Distrib Comput 11(2):91â111Kshemkalyani AD, Singhal M (2011) Distributed computing: principles, algorithms, and systems, 2nd edn. Cambridge University Press, New YorkLadin R, Liskov B, Shrira L, Ghemawat S (1992) Providing high availability using lazy replication. ACM Trans Comput Syst 10(4):360â391Lamport L (1978) Time, clocks, and the ordering of events in a distributed system. Commun ACM 21(7):558â565Laumay P, Bruneton E, de Palma N, Krakowiak S (2001) Preserving causality in a scalable message-oriented middleware. In: Intnl Conf on Distrib Syst Platf (Middleware), pp 311â328Liu N, Liu M, Cao J, Chen G, Lou W (2010) When transportation meets communication: V2P over VANETs. In: 30th Intnl Conf Distrib Comput Syst (ICDCS), GenovaLwin CH, Mohanty H, Ghosh RK (2004) Causal ordering in event notification service systems for mobile users. In: Intnl Conf Inform Tech: Coding Comput (ITCC), Las Vegas, pp 735â740Mahajan P, Alvisi L, Dahlin M (2011) Consistency, availability and covergence. Tech. rep., UTCS TR-11-22, The University of Texas at AustinMatos M, Sousa A, Pereira J, Oliveira R, Deliot E, Murray P (2009) CLON: overlay networks and gossip protocols for cloud environments. In: 11th Intnl Symp on Dist Obj, Middleware and Appl (DOA), Vilamoura, LNCS, vol 5870, pp 549â566Mattern F (1989) Virtual time and global states of distributed systems. In: Parallel and distributed algorithms, North-Holland, pp 215â226Mattern F, FĂŒnfrocken S (1994) A non-blocking lightweight implementation of causal order message delivery. Lect Notes Comput Sci 938:197â213Meldal S, Sankar S, Vera J (1991) Exploiting locality in maintaining potential causality. In: 10th ACM Symp on Princ of Distrib Comp (PODC), Montreal, pp 231â239Meling H, Montresor A, Helvik BE, Babaoglu Ă (2008) Jgroup/ARM: a distributed object group platform with autonomous replication management. Softw Pract Exp 38(9):885â923Mosberger D (1993) Memory consistency models. Oper Syst Rev 27(1):18â26MostĂ©faoui A, Raynal M (1993) Causal multicast in overlapping groups: towards a low cost approach. In: 4th Intnl Wshop on Future Trends of Distrib Comp Syst (FTDCS), Lisbon, pp 136â142MostĂ©faoui A, Raynal M, Travers C, Patterson S, Agrawal D, El Abbadi A (2005) From static distributed systems to dynamic systems. In: 24th Symp on Rel Distrib Syst (SRDS), Orlando, pp 109â118Nishimura T, Hayashibara N, Takizawa M, Enokido T (2005) Causally ordered delivery with global clock in hierarchical group. In: ICPADS (2), Fukuoka, pp 560â564Parker DS Jr, Popek GJ, Rudisin G, Stoughton A, Walker BJ, Walton E, Chow JM, Edwards DA, Kiser S, Kline CS (1983) Detection of mutual inconsistency in distributed systems. IEEE Trans Softw Eng 9(3):240â247Pascual-Miret L (2014) Consistency models in modern distributed systems. An approach to eventual consistency. Masterâs thesis, Depto. de Sistemas InformĂĄticos y ComputaciĂłn, Univ. PolitĂšcnica de ValĂšnciaPascual-Miret L, GonzĂĄlez de MendĂvil JR, BernabĂ©u-AubĂĄn JM, Muñoz-EscoĂ FD (2015) Widening CAP consistency. Tech. rep., IUMTI-SIDI-2015/003, Univ. PolitĂšcnica de ValĂšncia, ValenciaPeterson LL, Buchholz NC, Schlichting RD (1989) Preserving and using context information in interprocess communication. ACM Trans Comput Syst 7(3):217â246Pomares HernĂĄndez S, Fanchon J, Drira K, Diaz M (2001) Causal broadcast protocol for very large group communication systems. In: 5th Intnl Conf on Princ of Distrib Syst (OPODIS), Manzanillo, pp 175â188Prakash R, Baldoni R (2004) Causality and the spatial-temporal ordering in mobile systems. Mobile Netw Appl 9(5):507â516Prakash R, Raynal M, Singhal M (1997) An adaptive causal ordering algorithm suited to mobile computing environments. J Parallel Distrib Comput 41(2):190â204Raynal M, Schiper A, Toueg S (1991) The causal ordering abstraction and a simple way to implement it. Inf Process Lett 39(6):343â350Rodrigues L, VerĂssimo P (1995a) Causal separators and topological timestamping: An approach to support causal multicast in large-scale systems. Tech. Rep. AR-05/95, Instituto de Engenharia de Sistemas e Computadores (INESC), LisbonRodrigues L, VerĂssimo P (1995b) Causal separators for large-scale multicast communication. In: 15th Intnl Conf on Distrib Comput Syst (ICDCS), Vancouver, pp 83â91Schiper A, Eggli J, Sandoz A (1989) A new algorithm to implement causal ordering. In: 3rd Intnl Wshop on Distrib Alg (WDAG), Nice, pp 219â232Schiper N, Pedone F (2010) Fast, flexible and highly resilient genuine FIFO and causal multicast algorithms. In: 25th ACM Symp on Applied Comp (SAC), Sierre, pp 418â422Shapiro M, Preguiça NM, Baquero C, Zawirski M (2011) Convergent and commutative replicated data types. Bull EATCS 104:67â88Shen M, Kshemkalyani AD, Hsu TY (2015) Causal consistency for geo-replicated cloud storage under partial replication. In: Intnl Paral Distrib Proces Symp (IPDPS) Wshop, Hyderabad, pp 509â518Singhal M, Kshemkalyani AD (1992) An efficient implementation of vector clocks. Inf Process Lett 43(1):47â52Sotomayor B, Montero RS, Llorente IM, Foster IT (2009) Virtual infrastructure management in private and hybrid clouds. IEEE Internet Comput 13(5):14â22Stephenson P (1991) Fast ordered multicasts. PhD thesis, Dept. of Comp. Sc., Cornell Univ., IthacaStonebraker M (1986) The case for shared nothing. IEEE Database Eng Bull 9(1):4â9Vogels W (2009) Eventually consistent. Commun ACM 52(1):40â44Wischhof L, Ebner A, Rohling H (2005) Information dissemination in self-organizing intervehicle networks. IEEE Trans Intell Transp 6(1):90â101Yavatkar R (1992) MCP: a protocol for coordination and temporal synchronization in multimedia collaborative applications. In: 12th Intnl Conf on Distrib Comput Syst (ICDCS), Yokohama, pp 606â613Yen LH, Huang TL, Hwang SY (1997) A protocol for causally ordered message delivery in mobile computing systems. Mobile Netw Appl 2(4):365â372Zawirski M, Preguiça N, Duarte S, Bieniusa A, Balegas V, Shapiro M (2015) Write fast, read in the past: causal consistency for client-side applications. In: 16th Intnl Middleware Conf, VancouverZhou S, Cai W, Turner SJ, Lee BS, Wei J (2007) Critical causal order of events in distributed virtual environments. ACM Trans Mult Comp Commun Appl 3(3):1
CRDTs: Consistency without concurrency control
A CRDT is a data type whose operations commute when they are concurrent.
Replicas of a CRDT eventually converge without any complex concurrency control.
As an existence proof, we exhibit a non-trivial CRDT: a shared edit buffer
called Treedoc. We outline the design, implementation and performance of
Treedoc. We discuss how the CRDT concept can be generalised, and its
limitations
VCube-PS: A Causal Broadcast Topic-based Publish/Subscribe System
In this work we present VCube-PS, a topic-based Publish/Subscribe system
built on the top of a virtual hypercube-like topology. Membership information
and published messages are broadcast to subscribers (members) of a topic group
over dynamically built spanning trees rooted at the publisher. For a given
topic, the delivery of published messages respects the causal order. VCube-PS
was implemented on the PeerSim simulator, and experiments are reported
including a comparison with the traditional Publish/Subscribe approach that
employs a single rooted static spanning-tree for message distribution. Results
confirm the efficiency of VCube-PS in terms of scalability, latency, number and
size of messages.Comment: Improved text and performance evaluation. Added proof for the
algorithms (Section 3.4
Fisheye Consistency: Keeping Data in Synch in a Georeplicated World
Over the last thirty years, numerous consistency conditions for replicated
data have been proposed and implemented. Popular examples of such conditions
include linearizability (or atomicity), sequential consistency, causal
consistency, and eventual consistency. These consistency conditions are usually
defined independently from the computing entities (nodes) that manipulate the
replicated data; i.e., they do not take into account how computing entities
might be linked to one another, or geographically distributed. To address this
lack, as a first contribution, this paper introduces the notion of proximity
graph between computing nodes. If two nodes are connected in this graph, their
operations must satisfy a strong consistency condition, while the operations
invoked by other nodes are allowed to satisfy a weaker condition. The second
contribution is the use of such a graph to provide a generic approach to the
hybridization of data consistency conditions into the same system. We
illustrate this approach on sequential consistency and causal consistency, and
present a model in which all data operations are causally consistent, while
operations by neighboring processes in the proximity graph are sequentially
consistent. The third contribution of the paper is the design and the proof of
a distributed algorithm based on this proximity graph, which combines
sequential consistency and causal consistency (the resulting condition is
called fisheye consistency). In doing so the paper not only extends the domain
of consistency conditions, but provides a generic provably correct solution of
direct relevance to modern georeplicated systems
PaRiS: Causally Consistent Transactions with Non-blocking Reads and Partial Replication
Geo-replicated data platforms are at the backbone of several large-scale
online services. Transactional Causal Consistency (TCC) is an attractive
consistency level for building such platforms. TCC avoids many anomalies of
eventual consistency, eschews the synchronization costs of strong consistency,
and supports interactive read-write transactions. Partial replication is
another attractive design choice for building geo-replicated platforms, as it
increases the storage capacity and reduces update propagation costs. This paper
presents PaRiS, the first TCC system that supports partial replication and
implements non-blocking parallel read operations, whose latency is paramount
for the performance of read-intensive applications. PaRiS relies on a novel
protocol to track dependencies, called Universal Stable Time (UST). By means of
a lightweight background gossip process, UST identifies a snapshot of the data
that has been installed by every DC in the system. Hence, transactions can
consistently read from such a snapshot on any server in any replication site
without having to block. Moreover, PaRiS requires only one timestamp to track
dependencies and define transactional snapshots, thereby achieving resource
efficiency and scalability. We evaluate PaRiS on a large-scale AWS deployment
composed of up to 10 replication sites. We show that PaRiS scales well with the
number of DCs and partitions, while being able to handle larger data-sets than
existing solutions that assume full replication. We also demonstrate a
performance gain of non-blocking reads vs. a blocking alternative (up to 1.47x
higher throughput with 5.91x lower latency for read-dominated workloads and up
to 1.46x higher throughput with 20.56x lower latency for write-heavy
workloads)
Eventual Consistency: Origin and Support
Eventual consistency is demanded nowadays in geo-replicated services that need to be highly scalable and available. According to the CAP constraints, when network partitions may arise, a distributed service should choose between being strongly consistent or being highly available. Since scalable services should be available, a relaxed consistency (while the network is partitioned) is the preferred choice. Eventual consistency is not a common data-centric consistency model, but only a state convergence condition to be added to a relaxed consistency model. There are still several aspects of eventual consistency that have not been analysed in depth in previous works: 1. which are the oldest replication proposals providing eventual consistency, 2. which replica consistency models provide the best basis for building eventually consistent services, 3. which mechanisms should be considered for implementing an eventually consistent service, and 4. which are the best combinations of those mechanisms for achieving different concrete goals. This paper provides some notes on these important topics
- âŠ