Evaluation of Docker Containers for Scientific Workloads in the Cloud
The HPC community is actively researching and evaluating tools to support
execution of scientific applications in cloud-based environments. Among the
various technologies, containers have recently gained importance as they have
significantly better performance compared to full-scale virtualization, support
for microservices and DevOps, and work seamlessly with workflow and
orchestration tools. Docker is currently the leader in containerization
technology because it offers low overhead, flexibility, portability of
applications, and reproducibility. Singularity is another container solution
that is of interest as it is designed specifically for scientific applications.
It is important to conduct performance and feature analysis of the container
technologies to understand their applicability for each application and target
execution environment. This paper presents a (1) performance evaluation of
Docker and Singularity on bare metal nodes in the Chameleon cloud (2) mechanism
by which Docker containers can be mapped with InfiniBand hardware with RDMA
communication and (3) analysis of mapping elements of parallel workloads to the
containers for optimal resource management with container-ready orchestration
tools. Our experiments are targeted toward application developers so that they
can make informed decisions on choosing the container technologies and
approaches that are suitable for their HPC workloads on cloud infrastructure.
Our performance analysis shows that scientific workloads for both Docker and
Singularity based containers can achieve near-native performance. Singularity
is designed specifically for HPC workloads. However, Docker still has
advantages over Singularity for use in clouds as it provides overlay networking
and an intuitive way to run MPI applications with one container per rank for
fine-grained resource allocation.
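The one-container-per-rank pattern described above can be sketched as follows. This is an illustrative assumption, not the paper's actual deployment scripts: the image name, overlay network name, and application binary are hypothetical, while `--rm`, `--network`, and `--cpuset-cpus` are standard `docker run` flags.

```python
# Hypothetical sketch: build one `docker run` command per MPI rank, pinning
# each container to its own CPU cores for fine-grained resource allocation.
# Image name, network name, and binary path below are illustrative only.

def rank_commands(image, ranks, cores_per_rank):
    """Return one docker invocation per rank, each pinned to distinct cores."""
    cmds = []
    for rank in range(ranks):
        first = rank * cores_per_rank
        cpus = ",".join(str(c) for c in range(first, first + cores_per_rank))
        cmds.append(
            f"docker run --rm --network hpc_overlay "
            f"--cpuset-cpus={cpus} {image} ./mpi_app --rank {rank}"
        )
    return cmds

for cmd in rank_commands("hpc/app:latest", 4, 2):
    print(cmd)
```

Pinning each rank's container to a disjoint core set is one way an orchestrator could map parallel workload elements onto container-level resource limits.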
Developing a Community of Practice for Applied Uses of Future PACE Data to Address Marine Food Security Challenges
The Plankton, Aerosol, Cloud, ocean Ecosystem (PACE) mission will include a hyperspectral imaging radiometer to advance ecosystem monitoring beyond heritage retrievals of the concentration of surface chlorophyll and other traditional ocean color variables, offering potential for novel science and applications. PACE is the first NASA ocean color mission to occur under the agency's new and evolving effort to directly engage practical end users prior to satellite launch to increase adoption of this freely available data toward societal challenges. Here we describe early efforts to engage a community of practice around marine food-related resource management, business decisions, and policy analysis. Obviously one satellite cannot meet diverse end user needs at all scales and locations, but understanding downstream needs helps in the assessment of information gaps and in planning how to optimize the unique strengths of PACE data in combination with the strengths of other satellite retrievals, in situ measurements, and models. Higher spectral resolution data from PACE can be fused with information from satellites with higher spatial or temporal resolution, plus other information, to enable identification and tracking of new marine biological indicators to guide sustainable management. Accounting for the needs of applied researchers as well as non-traditional users of satellite data early in the PACE mission process will ultimately serve to broaden the base of informed users and facilitate faster adoption of the most advanced science and technology toward the challenge of mitigating food insecurity.
End-to-end informed VM selection in compute clouds
The selection of resources, particularly VMs, in current public IaaS clouds is usually done in a blind fashion, as cloud users do not have much information about resource consumption by co-tenant third-party tasks. In particular, communication patterns can play a significant part in cloud application performance and responsiveness, especially in the case of novel latency-sensitive applications, increasingly common in today's clouds. Thus, herein we propose an end-to-end approach to the VM allocation problem using policies based uniquely on round-trip time measurements between VMs. These measurements become part of a user-level 'Recommender Service' that receives VM allocation requests with certain network-related demands and matches them to a suitable subset of VMs available to the user within the cloud. We propose and implement end-to-end algorithms for VM selection that cover desirable profiles of communication between VMs in distributed applications in a cloud setting, such as profiles with prevailing pair-wise, hub-and-spokes, or clustered communication patterns between constituent VMs. We quantify the expected benefits of deploying our Recommender Service by comparing our informed VM allocation approaches to conventional, random allocation methods, based on real measurements of latencies between Amazon EC2 instances. We also show that our approach is completely independent from cloud architecture details, is adaptable to different types of applications and workloads, and is lightweight and transparent to cloud providers. This work is supported in part by the National Science Foundation under grant CNS-0963974.
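The pair-wise selection profile mentioned above can be illustrated with a simple greedy heuristic. This is a sketch under assumptions, not the paper's actual algorithm: given a matrix of measured round-trip times, it seeds with the closest VM pair and repeatedly adds the VM whose worst RTT to the already-chosen set is smallest.

```python
# Illustrative greedy heuristic (an assumption, not the paper's algorithm):
# choose k VMs so that the worst pairwise round-trip time stays small.

def select_pairwise(rtt, k):
    """rtt is a symmetric n x n matrix of measured RTTs; returns k VM indices."""
    n = len(rtt)
    # Seed with the globally closest pair of VMs.
    best_pair = min(((i, j) for i in range(n) for j in range(i + 1, n)),
                    key=lambda p: rtt[p[0]][p[1]])
    chosen = list(best_pair)
    # Greedily grow the set, minimizing the worst RTT to the chosen VMs.
    while len(chosen) < k:
        cand = min((v for v in range(n) if v not in chosen),
                   key=lambda v: max(rtt[v][c] for c in chosen))
        chosen.append(cand)
    return sorted(chosen)
```

On a matrix where VMs 0-2 sit in one low-latency cluster, the heuristic recovers that cluster for k = 3, which is the behavior a clustered-communication profile would want.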
Named data networking for efficient IoT-based disaster management in a smart campus
Disasters are uncertain occasions that can impose a drastic impact on human life and building infrastructures. Information and Communication Technology (ICT) plays a vital role in coping with such situations by enabling and integrating multiple technological resources to develop Disaster Management Systems (DMSs). In this context, a majority of the existing DMSs use networking architectures based upon the Internet Protocol (IP), focusing on location-dependent communications. However, IP-based communications face the limitations of inefficient bandwidth utilization, high processing, data security, and excessive memory intake. To address these issues, Named Data Networking (NDN) has emerged as a promising communication paradigm, which is based on the Information-Centric Networking (ICN) architecture. NDN is among the self-organizing communication networks that reduce the complexity of networking systems while also providing content security. Given this, many NDN-based DMSs have been proposed. The problem with existing NDN-based DMSs is that they use a PULL-based mechanism that ultimately results in higher delay and more energy consumption. In order to cater for time-critical scenarios, emergency-driven network engineering communication and computation models are required. In this paper, a novel DMS is proposed, i.e., Named Data Networking Disaster Management (NDN-DM), where a producer forwards a fire alert message to neighbouring consumers. This makes the nodes converge according to the disaster situation in a more efficient and secure way. Furthermore, we consider a fire scenario in a university campus, where mobile nodes collaborate with each other to manage the fire situation. The proposed framework has been mathematically modeled and formally proved using timed automata-based transition systems and a real-time model checker, respectively. Additionally, the evaluation of the proposed NDN-DM has been performed using NS2.
The results show that, compared with a state-of-the-art NDN-based DMS, the proposed scheme reduces end-to-end delay by 2% to 10% and energy consumption by 3% to 20%.
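The delay advantage of push-style alert forwarding over a PULL-based mechanism can be illustrated with a toy timing model. This is an assumption for illustration only, not the paper's timed-automata model or its NS2 evaluation: a polling consumer cannot see the alert before its next scheduled Interest, while a pushing producer pays only one-way propagation.

```python
# Toy timing model (an illustrative assumption, not the paper's evaluation):
# compare when a fire alert generated at time t_alert reaches a consumer
# under PULL-based polling versus PUSH-style producer forwarding.
import math

def pull_delivery(t_alert, poll_interval, hop_delay, hops):
    # The consumer sends Interests at multiples of poll_interval; the first
    # poll at or after t_alert retrieves the alert, plus a round trip
    # (Interest out, Data back) over the path.
    next_poll = math.ceil(t_alert / poll_interval) * poll_interval
    return next_poll + 2 * hops * hop_delay

def push_delivery(t_alert, hop_delay, hops):
    # The producer forwards the alert immediately: one-way propagation only.
    return t_alert + hops * hop_delay
```

Under this toy model, the push path's delivery time is independent of any polling period, which is the intuition behind NDN-DM's lower delay in time-critical scenarios.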
Enhancing reuse of data and biological material in medical research : from FAIR to FAIR-Health
The known challenge of underutilization of data and biological material from biorepositories as potential resources for medical research has been the focus of discussion for over a decade. Recently developed guidelines for improved data availability and reusability, entitled the FAIR Principles (Findability, Accessibility, Interoperability, and Reusability), are likely to address only parts of the problem. In this article, we argue that biological material and data should be viewed as a unified resource. This approach would facilitate access to complete provenance information, which is a prerequisite for reproducibility and meaningful integration of the data. A unified view also allows for optimization of long-term storage strategies, as demonstrated in the case of biobanks. We propose an extension of the FAIR Principles to include the following additional components: (1) quality aspects related to research reproducibility and meaningful reuse of the data, (2) incentives to stimulate effective enrichment of data sets and biological material collections and their reuse on all levels, and (3) privacy-respecting approaches for working with human material and data. These FAIR-Health principles should then be applied to both the biological material and data. We also propose the development of common guidelines for cloud architectures, due to the unprecedented growth of the volume and breadth of medical data generation, as well as the associated need to process the data efficiently.
Applications and Challenges of Real-time Mobile DNA Analysis
DNA sequencing is the process of identifying the exact order of
nucleotides within a given DNA molecule. The new portable and relatively
inexpensive DNA sequencers, such as Oxford Nanopore MinION, have the potential
to move DNA sequencing outside of the laboratory, leading to faster and more
accessible DNA-based diagnostics. However, portable DNA sequencing and analysis
are challenging for mobile systems, owing to high data throughputs and
computationally intensive processing performed in environments with unreliable
connectivity and power.
In this paper, we provide an analysis of the challenges that mobile systems
and mobile computing must address to maximize the potential of portable DNA
sequencing, and in situ DNA analysis. We explain the DNA sequencing process and
highlight the main differences between traditional and portable DNA sequencing
in the context of the actual and envisioned applications. We look at the
identified challenges from the perspective of both algorithms and systems
design, showing the need for careful co-design.