2,913 research outputs found
Toward Smart Moving Target Defense for Linux Container Resiliency
This paper presents ESCAPE, an informed moving target defense mechanism for
cloud containers. ESCAPE models the interaction between attackers and their
target containers as a "predator searching for a prey" search game. Live
migration of Linux-containers (prey) is used to avoid attacks (predator) and
failures. The entire process is guided by a novel host-based
behavior-monitoring system that seamlessly monitors containers for indications
of intrusions and attacks. To evaluate ESCAPE effectiveness, we simulated the
attack avoidance process based on a mathematical model mimicking the
prey-vs-predator search game. Simulation results show high container survival
probabilities with minimal added overhead.Comment: Published version is available on IEEE Xplore at
http://ieeexplore.ieee.org/document/779685
A Generic Checkpoint-Restart Mechanism for Virtual Machines
It is common today to deploy complex software inside a virtual machine (VM).
Snapshots provide rapid deployment, migration between hosts, dependability
(fault tolerance), and security (insulating a guest VM from the host). Yet, for
each virtual machine, the code for snapshots is laboriously developed on a
per-VM basis. This work demonstrates a generic checkpoint-restart mechanism for
virtual machines. The mechanism is based on a plugin on top of an unmodified
user-space checkpoint-restart package, DMTCP. Checkpoint-restart is
demonstrated for three virtual machines: Lguest, user-space QEMU, and KVM/QEMU.
The plugins for Lguest and KVM/QEMU require just 200 lines of code. The Lguest
kernel driver API is augmented by 40 lines of code. DMTCP checkpoints
user-space QEMU without any new code. KVM/QEMU, user-space QEMU, and DMTCP need
no modification. The design benefits from other DMTCP features and plugins.
Experiments demonstrate checkpoint and restart in 0.2 seconds using forked
checkpointing, mmap-based fast-restart, and incremental Btrfs-based snapshots
Checkpointing as a Service in Heterogeneous Cloud Environments
A non-invasive, cloud-agnostic approach is demonstrated for extending
existing cloud platforms to include checkpoint-restart capability. Most cloud
platforms currently rely on each application to provide its own fault
tolerance. A uniform mechanism within the cloud itself serves two purposes: (a)
direct support for long-running jobs, which would otherwise require a custom
fault-tolerant mechanism for each application; and (b) the administrative
capability to manage an over-subscribed cloud by temporarily swapping out jobs
when higher priority jobs arrive. An advantage of this uniform approach is that
it also supports parallel and distributed computations, over both TCP and
InfiniBand, thus allowing traditional HPC applications to take advantage of an
existing cloud infrastructure. Additionally, an integrated health-monitoring
mechanism detects when long-running jobs either fail or incur exceptionally low
performance, perhaps due to resource starvation, and proactively suspends the
job. The cloud-agnostic feature is demonstrated by applying the implementation
to two very different cloud platforms: Snooze and OpenStack. The use of a
cloud-agnostic architecture also enables, for the first time, migration of
applications from one cloud platform to another.Comment: 20 pages, 11 figures, appears in CCGrid, 201
Reliable Provisioning of Spot Instances for Compute-intensive Applications
Cloud computing providers are now offering their unused resources for leasing
in the spot market, which has been considered the first step towards a
full-fledged market economy for computational resources. Spot instances are
virtual machines (VMs) available at lower prices than their standard on-demand
counterparts. These VMs will run for as long as the current price is lower than
the maximum bid price users are willing to pay per hour. Spot instances have
been increasingly used for executing compute-intensive applications. In spite
of an apparent economical advantage, due to an intermittent nature of biddable
resources, application execution times may be prolonged or they may not finish
at all. This paper proposes a resource allocation strategy that addresses the
problem of running compute-intensive jobs on a pool of intermittent virtual
machines, while also aiming to run applications in a fast and economical way.
To mitigate potential unavailability periods, a multifaceted fault-aware
resource provisioning policy is proposed. Our solution employs price and
runtime estimation mechanisms, as well as three fault tolerance techniques,
namely checkpointing, task duplication and migration. We evaluate our
strategies using trace-driven simulations, which take as input real price
variation traces, as well as an application trace from the Parallel Workload
Archive. Our results demonstrate the effectiveness of executing applications on
spot instances, respecting QoS constraints, despite occasional failures.Comment: 8 pages, 4 figure
Transparent live migration of container deployments in userspace
En aquesta tèsis de Màster, presentem una eina per realitzar migracions de contenidors tipus runC emprant CRIU. La nostre solució és eficient en termes d utilització de recursos, memòria i disc, i minimitza el temps de migració quan comparada amb una migració basada en capturar-transferir-reiniciar i amb la migració nativa de màquines virtuals oferida pels seus proveı̈dors. En afegit, la nostra eina permet migrar aplicacions que fan ús intensiu tant de memòria com de xarxa, amb connexions TCP establertes, i namespaces externs. La implementació està acompanyada d una recerca bibliogràfica en profunditat, aixı́ com d una sèrie d experiments que motiven els nostres criteris de disseny. El codi és de lliure accés i es pot trobar a la pàgina web del projecte
- …