Flexible Symmetrical Global-Snapshot Algorithms for Large-Scale Distributed Systems

By Jichiang Tsai


Most existing global-snapshot algorithms in distributed systems use control messages to coordinate the construction of a global snapshot among all processes. Since these algorithms typically assume that the underlying logical overlay topology is fully connected, the number of control messages exchanged among all processes is proportional to the square of the number of processes, which raises the likelihood of network congestion. Hence, such algorithms are neither efficient nor scalable for a large-scale distributed system composed of a huge number of processes. Recently, some efforts have significantly reduced the number of control messages, but at the cost of a higher response time. In this paper, we propose an efficient global-snapshot algorithm that lets every process finish its local snapshot in a given number of rounds. In particular, the algorithm allows a tradeoff between the response time and the message complexity. Moreover, our global-snapshot algorithm is symmetrical in the sense that identical steps are executed by every process. This means that our algorithm is able to achieve better workload balance and less network congestion. Most importantly, based on our framework, we demonstrate that the minimum number of control messages required by a symmetrical global-snapshot algorithm is Ω(N log N), where N is the number of processes. Finally, we also assume non-FIFO channels.
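The tradeoff between rounds and control messages can be illustrated with a small counting sketch. This is not the paper's algorithm: it assumes a hypothetical radix-k dissemination pattern in which every process sends k - 1 control messages per round, with k chosen as the smallest integer satisfying k^rounds >= N. The function names are illustrative only. Under that assumption, one round reproduces the quadratic all-to-all cost, and about log2(N) rounds gives a count on the order of N log N.

```python
def all_to_all_messages(n: int) -> int:
    """Control messages when every process notifies every other process directly."""
    return n * (n - 1)


def smallest_radix(n: int, rounds: int) -> int:
    """Smallest integer k with k**rounds >= n (integer search avoids float roots)."""
    k = 1
    while k ** rounds < n:
        k += 1
    return k


def round_based_messages(n: int, rounds: int) -> int:
    """Total control messages under the assumed radix-k pattern.

    Assumption (not from the paper): each of the n processes sends
    k - 1 messages in each round, identically for every process.
    """
    k = smallest_radix(n, rounds)
    return n * rounds * (k - 1)


if __name__ == "__main__":
    n = 1024
    print("all-to-all:", all_to_all_messages(n))
    for rounds in (1, 2, 5, 10):
        print(rounds, "round(s):", round_based_messages(n, rounds))
```

In this model, fewer rounds mean a shorter response time but more messages per process, while more rounds lower the message count at the cost of a longer response time.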

Topics: Distributed Systems, Global Snapshots, Process Symmetry, Message Passing, Checkpointing
Year: 2014