2,913 research outputs found

    Toward Smart Moving Target Defense for Linux Container Resiliency

    Full text link
    This paper presents ESCAPE, an informed moving target defense mechanism for cloud containers. ESCAPE models the interaction between attackers and their target containers as a "predator searching for a prey" search game. Live migration of Linux-containers (prey) is used to avoid attacks (predator) and failures. The entire process is guided by a novel host-based behavior-monitoring system that seamlessly monitors containers for indications of intrusions and attacks. To evaluate ESCAPE effectiveness, we simulated the attack avoidance process based on a mathematical model mimicking the prey-vs-predator search game. Simulation results show high container survival probabilities with minimal added overhead.Comment: Published version is available on IEEE Xplore at http://ieeexplore.ieee.org/document/779685

    A Generic Checkpoint-Restart Mechanism for Virtual Machines

    Full text link
    It is common today to deploy complex software inside a virtual machine (VM). Snapshots provide rapid deployment, migration between hosts, dependability (fault tolerance), and security (insulating a guest VM from the host). Yet, for each virtual machine, the code for snapshots is laboriously developed on a per-VM basis. This work demonstrates a generic checkpoint-restart mechanism for virtual machines. The mechanism is based on a plugin on top of an unmodified user-space checkpoint-restart package, DMTCP. Checkpoint-restart is demonstrated for three virtual machines: Lguest, user-space QEMU, and KVM/QEMU. The plugins for Lguest and KVM/QEMU require just 200 lines of code. The Lguest kernel driver API is augmented by 40 lines of code. DMTCP checkpoints user-space QEMU without any new code. KVM/QEMU, user-space QEMU, and DMTCP need no modification. The design benefits from other DMTCP features and plugins. Experiments demonstrate checkpoint and restart in 0.2 seconds using forked checkpointing, mmap-based fast-restart, and incremental Btrfs-based snapshots

    Checkpointing as a Service in Heterogeneous Cloud Environments

    Get PDF
    A non-invasive, cloud-agnostic approach is demonstrated for extending existing cloud platforms to include checkpoint-restart capability. Most cloud platforms currently rely on each application to provide its own fault tolerance. A uniform mechanism within the cloud itself serves two purposes: (a) direct support for long-running jobs, which would otherwise require a custom fault-tolerant mechanism for each application; and (b) the administrative capability to manage an over-subscribed cloud by temporarily swapping out jobs when higher priority jobs arrive. An advantage of this uniform approach is that it also supports parallel and distributed computations, over both TCP and InfiniBand, thus allowing traditional HPC applications to take advantage of an existing cloud infrastructure. Additionally, an integrated health-monitoring mechanism detects when long-running jobs either fail or incur exceptionally low performance, perhaps due to resource starvation, and proactively suspends the job. The cloud-agnostic feature is demonstrated by applying the implementation to two very different cloud platforms: Snooze and OpenStack. The use of a cloud-agnostic architecture also enables, for the first time, migration of applications from one cloud platform to another.Comment: 20 pages, 11 figures, appears in CCGrid, 201

    Reliable Provisioning of Spot Instances for Compute-intensive Applications

    Full text link
    Cloud computing providers are now offering their unused resources for leasing in the spot market, which has been considered the first step towards a full-fledged market economy for computational resources. Spot instances are virtual machines (VMs) available at lower prices than their standard on-demand counterparts. These VMs will run for as long as the current price is lower than the maximum bid price users are willing to pay per hour. Spot instances have been increasingly used for executing compute-intensive applications. In spite of an apparent economical advantage, due to an intermittent nature of biddable resources, application execution times may be prolonged or they may not finish at all. This paper proposes a resource allocation strategy that addresses the problem of running compute-intensive jobs on a pool of intermittent virtual machines, while also aiming to run applications in a fast and economical way. To mitigate potential unavailability periods, a multifaceted fault-aware resource provisioning policy is proposed. Our solution employs price and runtime estimation mechanisms, as well as three fault tolerance techniques, namely checkpointing, task duplication and migration. We evaluate our strategies using trace-driven simulations, which take as input real price variation traces, as well as an application trace from the Parallel Workload Archive. Our results demonstrate the effectiveness of executing applications on spot instances, respecting QoS constraints, despite occasional failures.Comment: 8 pages, 4 figure

    Transparent live migration of container deployments in userspace

    Get PDF
    En aquesta tèsis de Màster, presentem una eina per realitzar migracions de contenidors tipus runC emprant CRIU. La nostre solució és eficient en termes d utilització de recursos, memòria i disc, i minimitza el temps de migració quan comparada amb una migració basada en capturar-transferir-reiniciar i amb la migració nativa de màquines virtuals oferida pels seus proveı̈dors. En afegit, la nostra eina permet migrar aplicacions que fan ús intensiu tant de memòria com de xarxa, amb connexions TCP establertes, i namespaces externs. La implementació està acompanyada d una recerca bibliogràfica en profunditat, aixı́ com d una sèrie d experiments que motiven els nostres criteris de disseny. El codi és de lliure accés i es pot trobar a la pàgina web del projecte
    corecore