551 research outputs found

    Mobile Computing in Physics Analysis - An Indicator for eScience

    Full text link
    This paper presents the design and implementation of a Grid-enabled physics analysis environment for handheld and other resource-limited computing devices as one example of the use of mobile devices in eScience. Handheld devices offer great potential because they provide ubiquitous access to data and round-the-clock connectivity over wireless links. Our solution aims to provide users of handheld devices the capability to launch heavy computational tasks on computational and data Grids, monitor the jobs status during execution, and retrieve results after job completion. Users carry their jobs on their handheld devices in the form of executables (and associated libraries). Users can transparently view the status of their jobs and get back their outputs without having to know where they are being executed. In this way, our system is able to act as a high-throughput computing environment where devices ranging from powerful desktop machines to small handhelds can employ the power of the Grid. The results shown in this paper are readily applicable to the wider eScience community.Comment: 8 pages, 7 figures. Presented at the 3rd Int Conf on Mobile Computing & Ubiquitous Networking (ICMU06. London October 200

    Landscape of IoT security

    Full text link
    The last two decades have experienced a steady rise in the production and deployment of sensing-and-connectivity-enabled electronic devices, replacing “regular” physical objects. The resulting Internet-of-Things (IoT) will soon become indispensable for many application domains. Smart objects are continuously being integrated within factories, cities, buildings, health institutions, and private homes. Approximately 30 years after the birth of IoT, society is confronted with significant challenges regarding IoT security. Due to the interconnectivity and ubiquitous use of IoT devices, cyberattacks have widespread impacts on multiple stakeholders. Past events show that the IoT domain holds various vulnerabilities, exploited to generate physical, economic, and health damage. Despite many of these threats, manufacturers struggle to secure IoT devices properly. Thus, this work overviews the IoT security landscape with the intention to emphasize the demand for secured IoT-related products and applications. Therefore, (a) a list of key challenges of securing IoT devices is determined by examining their particular characteristics, (b) major security objectives for secured IoT systems are defined, (c) a threat taxonomy is introduced, which outlines potential security gaps prevalent in current IoT systems, and (d) key countermeasures against the aforementioned threats are summarized for selected IoT security-related technologies available on the market

    Toward a Live BBU Container Migration in Wireless Networks

    Get PDF
    Cloud Radio Access Networks (Cloud-RANs) have recently emerged as a promising architecture to meet the increasing demands and expectations of future wireless networks. Such an architecture can enable dynamic and flexible network operations to address significant challenges, such as higher mobile traffic volumes and increasing network operation costs. However, the implementation of compute-intensive signal processing Network Functions (NFs) on the General Purpose Processors (General Purpose Processors) that are typically found in data centers could lead to performance complications, such as in the case of overloaded servers. There is therefore a need for methods that ensure the availability and continuity of critical wireless network functionality in such circumstances. Motivated by the goal of providing highly available and fault-tolerant functionality in Cloud-RAN-based networks, this paper proposes the design, specification, and implementation of live migration of containerized Baseband Units (BBUs) in two wireless network settings, namely Long Range Wide Area Network (LoRaWAN) and Long Term Evolution (LTE) networks. Driven by the requirements and critical challenges of live migration, the approach shows that in the case of LoRaWAN networks, the migration of BBUs is currently possible with relatively low downtimes to support network continuity. The analysis and comparison of the performance of functional splits and cell configurations in both networks were performed in terms of fronthaul throughput requirements. The results obtained from such an analysis can be used by both service providers and network operators in the deployment and optimization of Cloud-RANs services, in order to ensure network reliability and continuity in cloud environments

    Automatic Code Placement Alternatives for Ad-Hoc And Sensor Networks

    Full text link
    Developing applications for ad-hoc and sensor networks poses significant challenges. Many interesting applications in these domains entail collaboration between components distributed throughout an ad-hoc network. Defining these components, optimally placing them on nodes in the ad-hoc network and relocating them in response to changes is a fundamental problem faced by such applications. Manual approaches to code and data migration are not only platform-dependent and error-prone, but also needlessly complicate application development. Further, locally optimal decisions made by applications that share the same network can lead to globally unstable and energy inefficient behavior. In this paper we describe the design and implementation of a distributed operating system for ad-hoc and sensor networks whose goal is to enable power-aware, adaptive, and easy-to-develop ad-hoc networking applications. Our system achieves this goal by providing a single system image of a unified Java virtual machine to applications over an ad-hoc collection of heterogeneous nodes. It automatically and transparently partitions applications into components and dynamically finds a placement of these components on nodes within the ad-hoc network to reduce energy consumption and increase system longevity. This paper outlines the design of our system and evaluates two practical, power-aware, online algorithms for object placement that form the core of our system. We demonstrate that our algorithms can increase system longevity by a factor of four to five by effectively distributing energy consumption, and are suitable for use in an energy efficient operating system in which applications are distributed automatically and transparently

    A Recovery Scheme for Cluster Federations Using Sender-based Message Logging

    Get PDF
    A cluster federation is a union of clusters and is heterogeneous. Each cluster contains a certain number of processes. An application running in such a computing environment is divided into communicating modules so that these modules can run on different clusters. To achieve fault-tolerance different clusters may employ different check pointing schemes. For example, some may use coordinated schemes, while some other may use communication-induced schemes. It may complicate the recovery process. In this paper, we have addressed the complex problem of recovery for cluster computing environment. The proposed approach handles both inter cluster orphan and lost messages unlike the existing works in this area. We first propose an algorithm to determine a recovery line so that there does not exist any inter cluster orphan message between any pair of the cluster level check points belonging to the recovery line. The main feature of the proposed algorithm is that it can be executed simultaneously by all clusters in the cluster federation. Next we apply the sender-based message logging idea to effectively handle all inter cluster lost messages to ensure correctness of computation

    Checkpoint-based Fault-tolerance for LEACH Protocol

    No full text
    International audienceMost routing protocols designed for wireless sensor networks provide good results in ideal environments. However, their performance degrades dramatically when nodes stop working for various causes such as loss of energy, crushed by animal or climatic conditions. In this paper, we highlight the weaknesses of LEACH (Low Energy Adaptive Clustering Hierarchy) protocol by evaluating its performance. Then we propose an improved version of this protocol based on checkpoint approach that allows it to become a fault-tolerant protocol. Finally, several simulations were conducted to illustrate the benefits of our contribution

    Reliability for exascale computing : system modelling and error mitigation for task-parallel HPC applications

    Get PDF
    As high performance computing (HPC) systems continue to grow, their fault rate increases. Applications running on these systems have to deal with rates on the order of hours or days. Furthermore, some studies for future Exascale systems predict the rates to be on the order of minutes. As a result, efficient fault tolerance solutions are needed to be able to tolerate frequent failures. A fault tolerance solution for future HPC and Exascale systems must be low-cost, efficient and highly scalable. It should have low overhead in fault-free execution and provide fast restart because long-running applications are expected to experience many faults during the execution. Meanwhile task-based dataflow parallel programming models (PM) are becoming a popular paradigm in HPC applications at large scale. For instance, we see the adaptation of task-based dataflow parallelism in OpenMP 4.0, OmpSs PM, Argobots and Intel Threading Building Blocks. In this thesis we propose fault-tolerance solutions for task-parallel dataflow HPC applications. Specifically, first we design and implement a checkpoint/restart and message-logging framework to recover from errors. We then develop performance models to investigate the benefits of our task-level frameworks when integrated with system-wide checkpointing. Moreover, we design and implement selective task replication mechanisms to detect and recover from silent data corruptions in task-parallel dataflow HPC applications. Finally, we introduce a runtime-based coding scheme to detect and recover from memory errors in these applications. Considering the span of all of our schemes, we see that they provide a fairly high failure coverage where both computation and memory is protected against errors.A medida que los Sistemas de Cómputo de Alto rendimiento (HPC por sus siglas en inglés) siguen creciendo, también las tasas de fallos aumentan. Las aplicaciones que se ejecutan en estos sistemas tienen una tasa de fallos que pueden estar en el orden de horas o días. Además, algunos estudios predicen que los fallos estarán en el orden de minutos en los Sistemas Exascale. Por lo tanto, son necesarias soluciones eficientes para la tolerancia a fallos que puedan tolerar fallos frecuentes. Las soluciones para tolerancia a fallos en los Sistemas futuros de HPC y Exascale tienen que ser de bajo costo, eficientes y altamente escalable. El sobrecosto en la ejecución sin fallos debe ser bajo y también se debe proporcionar reinicio rápido, ya que se espera que las aplicaciones de larga duración experimenten muchos fallos durante la ejecución. Por otra parte, los modelos de programación paralelas basados en tareas ordenadas de acuerdo a sus dependencias de datos, se están convirtiendo en un paradigma popular en aplicaciones HPC a gran escala. Por ejemplo, los siguientes modelos de programación paralela incluyen este tipo de modelo de programación OpenMP 4.0, OmpSs, Argobots e Intel Threading Building Blocks. En esta tesis proponemos soluciones de tolerancia a fallos para aplicaciones de HPC programadas en un modelo de programación paralelo basado tareas. Específicamente, en primer lugar, diseñamos e implementamos mecanismos “checkpoint/restart” y “message-logging” para recuperarse de los errores. Para investigar los beneficios de nuestras herramientas a nivel de tarea cuando se integra con los “system-wide checkpointing” se han desarrollado modelos de rendimiento. Por otra parte, diseñamos e implementamos mecanismos de replicación selectiva de tareas que permiten detectar y recuperarse de daños de datos silenciosos en aplicaciones programadas siguiendo el modelo de programación paralela basadas en tareas. Por último, se introduce un esquema de codificación que funciona en tiempo de ejecución para detectar y recuperarse de los errores de la memoria en estas aplicaciones. Todos los esquemas propuestos, en conjunto, proporcionan una cobertura bastante alta a los fallos tanto si estos se producen el cálculo o en la memoria.Postprint (published version
    corecore