Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency
Persistent memory provides high-performance data persistence at main memory.
Memory writes need to be performed in strict order to satisfy storage
consistency requirements and enable correct recovery from system crashes.
Unfortunately, adhering to such a strict order significantly degrades system
performance and persistent memory endurance. This paper introduces a new
mechanism, Loose-Ordering Consistency (LOC), that satisfies the ordering
requirements at significantly lower performance and endurance loss. LOC
consists of two key techniques. First, Eager Commit eliminates the need to
perform a persistent commit record write within a transaction. We do so by
ensuring that we can determine the status of all committed transactions during
recovery by storing necessary metadata information statically with blocks of
data written to memory. Second, Speculative Persistence relaxes the write
ordering between transactions by allowing writes to be speculatively written to
persistent memory. A speculative write is made visible to software only after
its associated transaction commits. To enable this, our mechanism supports the
tracking of committed transaction ID and multi-versioning in the CPU cache. Our
evaluations show that LOC reduces the average performance overhead of memory
persistence from 66.9% to 34.9% and the memory write traffic overhead from
17.1% to 3.4% on a variety of workloads.

Comment: This paper has been accepted by IEEE Transactions on Parallel and Distributed Systems.
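The Eager Commit idea above can be illustrated with a small simulation: each block written to persistent memory carries the ID of the transaction that wrote it, so recovery can decide which writes belong to committed transactions from per-block metadata rather than from a separate commit record. This is a heavily simplified sketch, not the paper's hardware mechanism; the class and method names are hypothetical.

```python
# Illustrative sketch of the Eager Commit idea from LOC: metadata (the
# transaction ID) is stored statically with each data block, so recovery
# needs no per-transaction commit record write. All names are
# hypothetical; the real mechanism operates in the memory controller.

class PersistentLog:
    def __init__(self):
        self.blocks = []          # (txn_id, address, data) tuples
        self.committed_upto = 0   # highest committed transaction ID

    def write_block(self, txn_id, address, data):
        # The transaction ID travels with the block itself.
        self.blocks.append((txn_id, address, data))

    def commit(self, txn_id):
        # Commit advances a counter; no dedicated commit record block.
        self.committed_upto = max(self.committed_upto, txn_id)

    def recover(self):
        # Keep only blocks whose transaction is known to have committed.
        return {addr: data
                for txn, addr, data in self.blocks
                if txn <= self.committed_upto}

log = PersistentLog()
log.write_block(1, 0x10, "A")
log.write_block(2, 0x20, "B")   # transaction 2 never commits
log.commit(1)
state = log.recover()           # only transaction 1's write survives
```

A crash between `write_block` and `commit` simply leaves the uncommitted blocks invisible after recovery, which is the consistency property the strict-ordering approaches pay for with extra writes.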
Towards Exascale Scientific Metadata Management
Advances in technology and computing hardware are enabling scientists from
all areas of science to produce massive amounts of data using large-scale
simulations or observational facilities. In this era of data deluge, effective
coordination between the data production and the analysis phases hinges on the
availability of metadata that describe the scientific datasets. Existing
workflow engines have been capturing a limited form of metadata to provide
provenance information about the identity and lineage of the data. However,
much of the data produced by simulations, experiments, and analyses still need
to be annotated manually in an ad hoc manner by domain scientists. Systematic
and transparent acquisition of rich metadata becomes a crucial prerequisite to
sustain and accelerate the pace of scientific innovation. Yet, ubiquitous and
domain-agnostic metadata management infrastructure that can meet the demands of
extreme-scale science is notable by its absence.
To address this gap in scientific data management research and practice, we
present our vision for an integrated approach that (1) automatically captures
and manipulates information-rich metadata while the data is being produced or
analyzed and (2) stores metadata within each dataset to permeate
metadata-oblivious processes and to query metadata through established and
standardized data access interfaces. We motivate the need for the proposed
integrated approach using applications from plasma physics, climate modeling
and neuroscience, and then discuss research challenges and possible solutions.
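The second pillar of the proposed approach, storing metadata within each dataset so that metadata-oblivious tools still work while metadata-aware tools can query it through standard interfaces, can be illustrated with a minimal self-describing container. The container format and field names below are our own illustration, not a format proposed in the paper.

```python
import json

# Minimal sketch of a self-describing dataset: rich metadata is embedded
# alongside the payload. A metadata-oblivious consumer reads only the
# payload; a metadata-aware one queries the same blob through an ordinary
# key-based interface. Field names ("data", "meta") are illustrative.

def write_dataset(payload, metadata):
    return json.dumps({"data": payload, "meta": metadata})

def read_payload(blob):
    # Metadata-oblivious access: the "meta" section is simply ignored.
    return json.loads(blob)["data"]

def query_metadata(blob, key):
    # Metadata-aware access through the same standard interface.
    return json.loads(blob)["meta"].get(key)

blob = write_dataset([1.0, 2.5, 3.7],
                     {"experiment": "plasma-run-42",   # hypothetical name
                      "provenance": "simulation v1.3"})
payload = read_payload(blob)          # -> [1.0, 2.5, 3.7]
origin = query_metadata(blob, "experiment")
```

In practice the same pattern appears in self-describing scientific formats (e.g. attributes attached to HDF5 or NetCDF datasets), which is one way the "established and standardized data access interfaces" mentioned above could carry metadata.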
The evolution of bits and bottlenecks in a scientific workflow trying to keep up with technology: Accelerating 4D image segmentation applied to NASA data
In 2016, a team of earth scientists directly engaged a team of computer scientists to identify cyberinfrastructure (CI) approaches that would speed up an earth science workflow. This paper describes the evolution of that workflow as the two teams bridged CI and an image segmentation algorithm to do large-scale earth science research. The Pacific Research Platform (PRP) and The Cognitive Hardware and Software Ecosystem Community Infrastructure (CHASE-CI) resources were used to significantly decrease the earth science workflow's wall-clock time from 19.5 days to 53 minutes. The improvement in wall-clock time comes from the use of network appliances, improved image segmentation, deployment of a containerized workflow, and increased CI experience and training for the earth scientists. This paper presents a description of the evolving innovations used to improve the workflow, the bottlenecks identified within each workflow version, and the improvements made within each version of the workflow over a three-year period.
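The reported improvement corresponds to a speedup of roughly 530x. The abstract gives only the before and after wall-clock times; the factor itself is our arithmetic:

```python
# Quick check of the reported wall-clock improvement: 19.5 days down to
# 53 minutes. The speedup factor is derived from the abstract's numbers,
# not stated in the paper.
before_minutes = 19.5 * 24 * 60   # 19.5 days = 28,080 minutes
after_minutes = 53
speedup = before_minutes / after_minutes
print(round(speedup))             # prints 530
```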
The Design and Implementation of a High-Performance Log-Structured RAID System for ZNS SSDs
Zoned Namespace (ZNS) defines a new abstraction for host software to flexibly
manage storage in flash-based SSDs as append-only zones. It also provides a
Zone Append primitive to further boost the write performance of ZNS SSDs by
exploiting intra-zone parallelism. However, making Zone Append effective for
reliable and scalable storage, in the form of a RAID array of multiple ZNS
SSDs, is non-trivial since Zone Append offloads address management to ZNS SSDs
and requires hosts to dedicatedly manage RAID stripes across multiple drives.
We propose ZapRAID, a high-performance log-structured RAID system for ZNS SSDs
by carefully exploiting Zone Append to achieve high write parallelism and
lightweight stripe management. ZapRAID adopts a group-based data layout with a
coarse-grained ordering across multiple groups of stripes, such that it can use
small-size metadata for stripe management on a per-group basis under Zone
Append. It further adopts hybrid data management to simultaneously achieve
intra-zone and inter-zone parallelism through a careful combination of both
Zone Append and Zone Write primitives. We evaluate ZapRAID using
microbenchmarks, trace-driven experiments, and real-application experiments.
Our evaluation results show that ZapRAID achieves high write throughput and
maintains high performance in normal reads, degraded reads, crash recovery, and
full-drive recovery.

Comment: 29 pages.
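The group-based layout described above can be sketched as a simple address-mapping function: stripes are partitioned into fixed-size groups, ordering is enforced only at group boundaries, and within a group each chunk's in-zone offset is left to the drive, as Zone Append allows. The drive count, group size, and rotation scheme below are illustrative assumptions, not parameters from the paper.

```python
# Sketch of a group-based stripe layout in the spirit of ZapRAID.
# The host tracks only coarse, per-group metadata; the exact in-zone
# offset of each chunk is assigned by the drive on Zone Append. The
# constants and the parity-rotation scheme here are hypothetical.

NUM_DRIVES = 4         # drives in the RAID array (assumed)
STRIPES_PER_GROUP = 8  # coarse-grained ordering unit (assumed)

def locate(stripe_id, chunk_index):
    """Map a (stripe, chunk) pair to (drive, group).

    Rotating the starting drive per stripe spreads parity chunks
    across the array; the group index is the only placement state
    the host must persist per stripe."""
    drive = (stripe_id + chunk_index) % NUM_DRIVES
    group = stripe_id // STRIPES_PER_GROUP
    return drive, group

# Chunks of stripe 0 land on distinct drives within group 0:
placements = [locate(0, c) for c in range(NUM_DRIVES)]
# Stripe 9 falls into the second group (index 1):
_, group_of_9 = locate(9, 0)
```

Because ordering is only maintained between groups, writes to stripes inside one group can be issued concurrently with Zone Append, which is where the write parallelism comes from in this scheme.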
Multivalency Beats Complexity: A Study on the Cell Uptake of Carbohydrate Functionalized Nanocarriers to Dendritic Cells
Herein, we report the synthesis of carbohydrate and glycodendron structures for dendritic cell targeting, which were subsequently bound to hydroxyethyl starch (HES) nanocapsules prepared by the inverse miniemulsion technique. The uptake of the carbohydrate-functionalized HES nanocapsules into immature human dendritic cells (hDCs) revealed a strong dependence on the carbohydrate used. A multivalent mannose-terminated dendron was found to be far superior in uptake to the structurally more complex oligosaccharides used.
Comprehensive analysis of normal adjacent to tumor transcriptomes.
Histologically normal tissue adjacent to the tumor (NAT) is commonly used as a control in cancer studies. However, little is known about the transcriptomic profile of NAT, how it is influenced by the tumor, and how the profile compares with non-tumor-bearing tissues. Here, we integrate data from the Genotype-Tissue Expression project and The Cancer Genome Atlas to comprehensively analyze the transcriptomes of healthy, NAT, and tumor tissues in 6506 samples across eight tissues and corresponding tumor types. Our analysis shows that NAT presents a unique intermediate state between healthy and tumor. Differential gene expression and protein-protein interaction analyses reveal altered pathways shared among NATs across tissue types. We characterize a set of 18 genes that are specifically activated in NATs. By applying pathway and tissue composition analyses, we suggest a pan-cancer mechanism whereby pro-inflammatory signals from the tumor stimulate an inflammatory response in the adjacent endothelium.