
    Evaluation of Docker Containers for Scientific Workloads in the Cloud

    The HPC community is actively researching and evaluating tools to support the execution of scientific applications in cloud-based environments. Among these technologies, containers have recently gained importance because they offer significantly better performance than full-scale virtualization, support microservices and DevOps practices, and work seamlessly with workflow and orchestration tools. Docker is currently the leading containerization technology because it offers low overhead, flexibility, application portability, and reproducibility. Singularity is another container solution of interest, as it is designed specifically for scientific applications. Performance and feature analyses of these container technologies are needed to understand their applicability to each application and target execution environment. This paper presents (1) a performance evaluation of Docker and Singularity on bare-metal nodes in the Chameleon cloud, (2) a mechanism by which Docker containers can be mapped to InfiniBand hardware with RDMA communication, and (3) an analysis of how elements of parallel workloads can be mapped to containers for optimal resource management with container-ready orchestration tools. Our experiments are targeted at application developers so that they can make informed decisions when choosing container technologies and approaches suitable for their HPC workloads on cloud infrastructure. Our performance analysis shows that scientific workloads in both Docker- and Singularity-based containers can achieve near-native performance. Singularity is designed specifically for HPC workloads; however, Docker still has advantages for use in clouds, as it provides overlay networking and an intuitive way to run MPI applications with one container per rank for fine-grained resource allocation.
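
    As a rough illustration of the one-container-per-rank idea mentioned above, the following Python sketch builds hypothetical docker run commands that pin each rank's container to its own CPU cores and expose the host InfiniBand device. The image name, device path, core counts, and application binary are assumptions for illustration only, not the paper's actual configuration.

    # Sketch: map each MPI rank to its own container with a dedicated CPU set.
    # Image name, device path, and core layout are illustrative assumptions.
    import shlex

    IMAGE = "example/hpc-app:latest"      # hypothetical image name
    CORES_PER_NODE = 16                   # assumed cores per node
    RANKS_PER_NODE = 4                    # one container per rank

    def docker_cmd_for_rank(rank: int) -> str:
        cores_per_rank = CORES_PER_NODE // RANKS_PER_NODE
        first = rank * cores_per_rank
        cpuset = f"{first}-{first + cores_per_rank - 1}"
        args = [
            "docker", "run", "--rm",
            "--net=host",                        # share host network for MPI wire-up
            "--device=/dev/infiniband",          # expose RDMA-capable hardware
            f"--cpuset-cpus={cpuset}",           # fine-grained per-rank CPU allocation
            "-e", f"RANK={rank}",
            IMAGE, "./mpi_app",                  # hypothetical application binary
        ]
        return " ".join(shlex.quote(a) for a in args)

    if __name__ == "__main__":
        # Print one launch command per rank on this node.
        for r in range(RANKS_PER_NODE):
            print(docker_cmd_for_rank(r))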

    Developing a Community of Practice for Applied Uses of Future PACE Data to Address Marine Food Security Challenges

    The Plankton, Aerosol, Cloud, ocean Ecosystem (PACE) mission will include a hyperspectral imaging radiometer to advance ecosystem monitoring beyond heritage retrievals of surface chlorophyll concentration and other traditional ocean color variables, offering potential for novel science and applications. PACE is the first NASA ocean color mission to occur under the agency's new and evolving effort to directly engage practical end users prior to satellite launch, in order to increase adoption of these freely available data for societal challenges. Here we describe early efforts to engage a community of practice around marine food-related resource management, business decisions, and policy analysis. One satellite obviously cannot meet diverse end-user needs at all scales and locations, but understanding downstream needs helps in assessing information gaps and in planning how to optimize the unique strengths of PACE data in combination with the strengths of other satellite retrievals, in situ measurements, and models. Higher-spectral-resolution data from PACE can be fused with information from satellites with higher spatial or temporal resolution, plus other information, to enable identification and tracking of new marine biological indicators to guide sustainable management. Accounting for the needs of applied researchers as well as non-traditional users of satellite data early in the PACE mission process will ultimately serve to broaden the base of informed users and facilitate faster adoption of the most advanced science and technology toward the challenge of mitigating food insecurity.

    End-to-end informed VM selection in compute clouds

    The selection of resources, particularly VMs, in current public IaaS clouds is usually done in a blind fashion, as cloud users do not have much information about resource consumption by co-tenant third-party tasks. In particular, communication patterns can play a significant part in cloud application performance and responsiveness, especially in the case of novel latency-sensitive applications, which are increasingly common in today's clouds. We therefore propose an end-to-end approach to the VM allocation problem using policies based solely on round-trip time measurements between VMs. These measurements feed a user-level 'Recommender Service' that receives VM allocation requests with certain network-related demands and matches them to a suitable subset of the VMs available to the user within the cloud. We propose and implement end-to-end algorithms for VM selection that cover desirable profiles of communication between VMs in distributed applications in a cloud setting, such as profiles with prevailing pair-wise, hub-and-spoke, or clustered communication patterns between constituent VMs. We quantify the expected benefits of deploying our Recommender Service by comparing our informed VM allocation approaches to conventional, random allocation methods, based on real measurements of latencies between Amazon EC2 instances. We also show that our approach is completely independent of cloud architecture details, is adaptable to different types of applications and workloads, and is lightweight and transparent to cloud providers. This work is supported in part by the National Science Foundation under grant CNS-0963974.
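
    To make the pair-wise profile concrete, here is a minimal sketch, not the paper's actual algorithm, that takes a matrix of measured round-trip times between the VMs available to a user and greedily picks a subset whose average pairwise latency is low. The RTT values and the greedy heuristic are illustrative assumptions; a real recommender would work from live measurements.

    # Sketch: greedy selection of k VMs with low average pairwise RTT.
    # The RTT matrix below is made-up data for illustration.
    from itertools import combinations

    def avg_pairwise_rtt(subset, rtt):
        pairs = list(combinations(subset, 2))
        return sum(rtt[a][b] for a, b in pairs) / len(pairs)

    def select_vms(rtt, k):
        """Greedily grow a set of k VM indices, keeping average pairwise RTT low."""
        n = len(rtt)
        # Seed with the closest pair, then add the VM that keeps the average lowest.
        best_pair = min(combinations(range(n), 2), key=lambda p: rtt[p[0]][p[1]])
        chosen = list(best_pair)
        while len(chosen) < k:
            candidates = [v for v in range(n) if v not in chosen]
            chosen.append(min(candidates,
                              key=lambda v: avg_pairwise_rtt(chosen + [v], rtt)))
        return chosen

    if __name__ == "__main__":
        # Hypothetical RTTs in milliseconds between 5 VMs.
        rtt = [
            [0.0, 0.3, 1.2, 0.9, 2.0],
            [0.3, 0.0, 1.0, 0.8, 1.9],
            [1.2, 1.0, 0.0, 0.4, 0.7],
            [0.9, 0.8, 0.4, 0.0, 0.6],
            [2.0, 1.9, 0.7, 0.6, 0.0],
        ]
        print(select_vms(rtt, 3))   # -> [0, 1, 3] for this made-up matrix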

    Named data networking for efficient IoT-based disaster management in a smart campus

    Disasters are uncertain occasions that can impose a drastic impact on human life and building infrastructures. Information and Communication Technology (ICT) plays a vital role in coping with such situations by enabling and integrating multiple technological resources to develop Disaster Management Systems (DMSs). In this context, the majority of existing DMSs use networking architectures based on the Internet Protocol (IP), focusing on location-dependent communications. However, IP-based communications face limitations including inefficient bandwidth utilization, high processing overhead, data security concerns, and excessive memory consumption. To address these issues, Named Data Networking (NDN) has emerged as a promising communication paradigm based on the Information-Centric Networking (ICN) architecture. NDN is a self-organizing communication approach that reduces the complexity of networking systems while also providing content security. Accordingly, many NDN-based DMSs have been proposed. The problem with existing NDN-based DMSs is that they use a PULL-based mechanism, which ultimately results in higher delay and greater energy consumption. To cater for time-critical scenarios, emergence-driven communication and computation models for network engineering are required. In this paper, a novel DMS is proposed, Named Data Networking Disaster Management (NDN-DM), in which a producer forwards a fire alert message to neighbouring consumers. This makes the nodes converge according to the disaster situation in a more efficient and secure way. Furthermore, we consider a fire scenario on a university campus in which mobile nodes collaborate with each other to manage the fire situation. The proposed framework has been mathematically modeled using timed automata-based transition systems and formally verified using a real-time model checker. Additionally, the proposed NDN-DM has been evaluated using the NS2 simulator. The results show that the proposed scheme reduces end-to-end delay by 2% to 10% and energy consumption by 3% to 20% compared with a state-of-the-art NDN-based DMS.
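
    As a loose illustration of the push-style alert dissemination described above (in contrast to NDN's usual Interest/Data PULL exchange), the following Python sketch lets a producer forward a named alert to its immediate neighbours, each of which re-forwards it at most once. The node names, topology, flooding rule, and alert name are illustrative assumptions, not the NDN-DM protocol itself.

    # Sketch: producer pushes a named alert to neighbours, which forward it once.
    # Topology, names, and forwarding rule are illustrative assumptions.
    from collections import deque

    def push_alert(topology, producer, alert_name):
        """Flood an alert from `producer`; each node forwards it at most once."""
        delivered = {producer}
        queue = deque([producer])
        while queue:
            node = queue.popleft()
            for neighbour in topology.get(node, []):
                if neighbour not in delivered:
                    print(f"{node} -> {neighbour}: {alert_name}")
                    delivered.add(neighbour)
                    queue.append(neighbour)
        return delivered

    if __name__ == "__main__":
        # Hypothetical campus topology: node -> reachable neighbours.
        campus = {
            "sensor-lab": ["gateway-A"],
            "gateway-A": ["phone-1", "phone-2", "gateway-B"],
            "gateway-B": ["phone-3"],
        }
        push_alert(campus, "sensor-lab", "/campus/alert/fire/lab-block")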

    Enhancing reuse of data and biological material in medical research : from FAIR to FAIR-Health

    The known challenge of underutilization of data and biological material from biorepositories as potential resources for medical research has been the focus of discussion for over a decade. Recently developed guidelines for improved data availability and reusability, entitled the FAIR Principles (Findability, Accessibility, Interoperability, and Reusability), are likely to address only parts of the problem. In this article, we argue that biological material and data should be viewed as a unified resource. This approach would facilitate access to complete provenance information, which is a prerequisite for reproducibility and meaningful integration of the data. A unified view also allows for optimization of long-term storage strategies, as demonstrated in the case of biobanks. We propose an extension of the FAIR Principles to include the following additional components: (1) quality aspects related to research reproducibility and meaningful reuse of the data, (2) incentives to stimulate effective enrichment of data sets and biological material collections and their reuse at all levels, and (3) privacy-respecting approaches for working with human material and data. These FAIR-Health principles should then be applied to both the biological material and the data. We also propose the development of common guidelines for cloud architectures, given the unprecedented growth in the volume and breadth of medical data generation and the associated need to process the data efficiently.

    Applications and Challenges of Real-time Mobile DNA Analysis

    DNA sequencing is the process of identifying the exact order of nucleotides within a given DNA molecule. New portable and relatively inexpensive DNA sequencers, such as the Oxford Nanopore MinION, have the potential to move DNA sequencing outside of the laboratory, leading to faster and more accessible DNA-based diagnostics. However, portable DNA sequencing and analysis are challenging for mobile systems, owing to high data throughputs and computationally intensive processing performed in environments with unreliable connectivity and power. In this paper, we analyze the challenges that mobile systems and mobile computing must address to maximize the potential of portable DNA sequencing and in situ DNA analysis. We explain the DNA sequencing process and highlight the main differences between traditional and portable DNA sequencing in the context of current and envisioned applications. We examine the identified challenges from the perspective of both algorithm and system design, showing the need for careful co-design.