247,362 research outputs found

    Checkpointing as a Service in Heterogeneous Cloud Environments

    Get PDF
    A non-invasive, cloud-agnostic approach is demonstrated for extending existing cloud platforms to include checkpoint-restart capability. Most cloud platforms currently rely on each application to provide its own fault tolerance. A uniform mechanism within the cloud itself serves two purposes: (a) direct support for long-running jobs, which would otherwise require a custom fault-tolerant mechanism for each application; and (b) the administrative capability to manage an over-subscribed cloud by temporarily swapping out jobs when higher priority jobs arrive. An advantage of this uniform approach is that it also supports parallel and distributed computations, over both TCP and InfiniBand, thus allowing traditional HPC applications to take advantage of an existing cloud infrastructure. Additionally, an integrated health-monitoring mechanism detects when long-running jobs either fail or incur exceptionally low performance, perhaps due to resource starvation, and proactively suspends the job. The cloud-agnostic feature is demonstrated by applying the implementation to two very different cloud platforms: Snooze and OpenStack. The use of a cloud-agnostic architecture also enables, for the first time, migration of applications from one cloud platform to another.Comment: 20 pages, 11 figures, appears in CCGrid, 201

    Wireless industrial monitoring and control networks: the journey so far and the road ahead

    Get PDF
    While traditional wired communication technologies have played a crucial role in industrial monitoring and control networks over the past few decades, they are increasingly proving to be inadequate to meet the highly dynamic and stringent demands of today’s industrial applications, primarily due to the very rigid nature of wired infrastructures. Wireless technology, however, through its increased pervasiveness, has the potential to revolutionize the industry, not only by mitigating the problems faced by wired solutions, but also by introducing a completely new class of applications. While present day wireless technologies made some preliminary inroads in the monitoring domain, they still have severe limitations especially when real-time, reliable distributed control operations are concerned. This article provides the reader with an overview of existing wireless technologies commonly used in the monitoring and control industry. It highlights the pros and cons of each technology and assesses the degree to which each technology is able to meet the stringent demands of industrial monitoring and control networks. Additionally, it summarizes mechanisms proposed by academia, especially serving critical applications by addressing the real-time and reliability requirements of industrial process automation. The article also describes certain key research problems from the physical layer communication for sensor networks and the wireless networking perspective that have yet to be addressed to allow the successful use of wireless technologies in industrial monitoring and control networks
    • …
    corecore