Automatic Software Upgrades for Distributed Systems (PhD thesis)

By Sameer Ajmani


Upgrading the software of long-lived, highly-available distributed systems is difficult. It is not possible to upgrade all the nodes in a system at once, since some nodes may be unavailable and halting the system for an upgrade is unacceptable. Instead, upgrades may happen gradually, and there may be long periods of time when different nodes are running different software versions and need to communicate using incompatible protocols. We present a methodology and infrastructure that address these challenges and make it possible to upgrade distributed systems automatically while limiting service disruption.

Our methodology defines how to enable nodes to interoperate across versions, how to preserve the state of a system across upgrades, and how to schedule an upgrade so as to limit service disruption. The approach is modular: defining an upgrade requires understanding only the new software and the version it replaces.

The upgrade infrastructure is a generic platform for distributing and installing software while enabling nodes to interoperate across versions. The infrastructure requires no access to the system source code and is transparent: node software is unaware that different versions even exist. We have implemented a prototype of the infrastructure called Upstart that intercepts socket communication using a dynamically-linked C++ library. Experiments show that Upstart has low overhead and works well for both local-area and Internet systems.

Year: 2005
OAI identifier: oai:dspace.mit.edu:1721.1/30418
Provided by: DSpace@MIT
Full text: http://hdl.handle.net/1721.1/3... (external link)
