6,527 research outputs found
Checkpointing as a Service in Heterogeneous Cloud Environments
A non-invasive, cloud-agnostic approach is demonstrated for extending
existing cloud platforms to include checkpoint-restart capability. Most cloud
platforms currently rely on each application to provide its own fault
tolerance. A uniform mechanism within the cloud itself serves two purposes: (a)
direct support for long-running jobs, which would otherwise require a custom
fault-tolerant mechanism for each application; and (b) the administrative
capability to manage an over-subscribed cloud by temporarily swapping out jobs
when higher priority jobs arrive. An advantage of this uniform approach is that
it also supports parallel and distributed computations, over both TCP and
InfiniBand, thus allowing traditional HPC applications to take advantage of an
existing cloud infrastructure. Additionally, an integrated health-monitoring
mechanism detects when long-running jobs either fail or incur exceptionally low
performance, perhaps due to resource starvation, and proactively suspends the
job. The cloud-agnostic feature is demonstrated by applying the implementation
to two very different cloud platforms: Snooze and OpenStack. The use of a
cloud-agnostic architecture also enables, for the first time, migration of
applications from one cloud platform to another.Comment: 20 pages, 11 figures, appears in CCGrid, 201
A Literature Survey of Cooperative Caching in Content Distribution Networks
Content distribution networks (CDNs) which serve to deliver web objects
(e.g., documents, applications, music and video, etc.) have seen tremendous
growth since its emergence. To minimize the retrieving delay experienced by a
user with a request for a web object, caching strategies are often applied -
contents are replicated at edges of the network which is closer to the user
such that the network distance between the user and the object is reduced. In
this literature survey, evolution of caching is studied. A recent research
paper [15] in the field of large-scale caching for CDN was chosen to be the
anchor paper which serves as a guide to the topic. Research studies after and
relevant to the anchor paper are also analyzed to better evaluate the
statements and results of the anchor paper and more importantly, to obtain an
unbiased view of the large scale collaborate caching systems as a whole.Comment: 5 pages, 5 figure
- âŠ