Simurgh: a fully decentralized and secure NVMM user space file system
The availability of non-volatile main memory (NVMM) has started a new era for storage systems, and NVMM-specific file systems can support the extremely high data and metadata rates required by many HPC and data-intensive applications. Scaling metadata performance within NVMM file systems is nevertheless often restricted by the Linux kernel storage stack, while simply moving metadata management to user space can compromise security or flexibility. This paper introduces Simurgh, a hardware-assisted user space file system with decentralized metadata management that allows secure metadata updates from within user space. Simurgh guarantees consistency, durability, and ordering of updates without sacrificing scalability. Security is enforced by only allowing NVMM access from protected user space functions, which can be implemented through two proposed instructions. Comparisons with other NVMM file systems show that Simurgh improves metadata performance up to 18x and application performance up to 89% compared to the second-fastest file system.
This work has been supported by the European Commission's BigStorage project H2020-MSCA-ITN2014-642963. It is also supported by the Big Data in Atmospheric Physics (BINARY) project, funded by the Carl Zeiss Foundation under Grant No. P2018-02-003.
Peer Reviewed. Postprint (author's final draft).
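The access-control idea the abstract describes can be sketched as follows. In Simurgh the enforcement comes from two proposed hardware instructions that bracket protected user space functions; the sketch below merely mimics that effect with a software flag, and all names (`NVMM`, `update_metadata`, the `protected` decorator) are hypothetical, not Simurgh's actual interface.

```python
# Sketch: NVMM writes succeed only from designated "protected" functions.
# Simurgh enforces this in hardware via two proposed instructions; here a
# module-level flag stands in for entering/leaving protected mode.
_protected = False

class NVMM:
    """Stand-in for a non-volatile memory region."""
    def __init__(self):
        self.data = {}

    def write(self, key, value):
        if not _protected:
            raise PermissionError("NVMM writes allowed only from protected functions")
        self.data[key] = value

def protected(fn):
    # Enter/leave "protected mode" around the call, analogous to what the
    # proposed enter/exit instructions would do in hardware.
    def wrapper(*args, **kwargs):
        global _protected
        _protected = True
        try:
            return fn(*args, **kwargs)
        finally:
            _protected = False
    return wrapper

nvmm = NVMM()

@protected
def update_metadata(key, value):
    # A protected metadata update path: the only way writes may reach NVMM.
    nvmm.write(key, value)

update_metadata("inode:7", {"size": 4096})   # allowed: inside protected function
try:
    nvmm.write("inode:7", {"size": 0})       # direct write from user code: rejected
except PermissionError as e:
    print(e)
```

The design point is that arbitrary user space code cannot corrupt file system metadata even though it runs in the same address space; only the vetted update paths can touch NVMM.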
CoRD: Converged RDMA Dataplane for High-Performance Clouds
High-performance networking is often characterized by kernel bypass which is
considered mandatory in high-performance parallel and distributed applications.
But kernel bypass comes at a price because it breaks the traditional OS
architecture, requiring applications to use special APIs and limiting the OS
control over existing network connections. We make the case that kernel bypass
is not mandatory. Rather, high-performance networking relies on multiple
performance-improving techniques, with kernel bypass being the least effective.
CoRD removes kernel bypass from RDMA networks, enabling efficient OS-level
control over the RDMA dataplane.
Comment: 11 pages
Assise: Performance and Availability via NVM Colocation in a Distributed File System
The adoption of very low latency persistent memory modules (PMMs) upends the
long-established model of disaggregated file system access. Instead, by
colocating computation and PMM storage, we can provide applications much higher
I/O performance, sub-second application failover, and strong consistency. To
demonstrate this, we built the Assise distributed file system, based on a
persistent, replicated coherence protocol for managing a set of
server-colocated PMMs as a fast, crash-recoverable cache between applications
and slower disaggregated storage, such as SSDs. Unlike disaggregated file
systems, Assise maximizes locality for all file IO by carrying out IO on
colocated PMM whenever possible and minimizes coherence overhead by maintaining
consistency at IO operation granularity, rather than at fixed block sizes.
We compare Assise to Ceph/Bluestore, NFS, and Octopus on a cluster with Intel
Optane DC PMMs and SSDs for common cloud applications and benchmarks, such as
LevelDB, Postfix, and FileBench. We find that Assise improves write latency up
to 22x, throughput up to 56x, fail-over time up to 103x, and scales up to 6x
better than its counterparts, while providing stronger consistency semantics.
Assise promises to beat the MinuteSort world record by 1.5x.
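The abstract's contrast between coherence at IO-operation granularity and at fixed block sizes can be made concrete with a back-of-the-envelope sketch. This is an illustrative model, not Assise's protocol: it only counts how many bytes a replication protocol must ship for a set of small writes under the two policies, with hypothetical function names.

```python
# Sketch: coherence traffic for small writes under block granularity vs
# IO-operation granularity. Purely illustrative, not Assise's actual protocol.
BLOCK = 4096  # fixed block size, in bytes

def block_granularity_bytes(writes):
    """Bytes a block-based protocol ships: every whole block a write touches."""
    blocks = set()
    for offset, length in writes:
        first = offset // BLOCK
        last = (offset + length - 1) // BLOCK
        blocks.update(range(first, last + 1))
    return len(blocks) * BLOCK

def op_granularity_bytes(writes):
    """Bytes an operation-granular protocol ships: exactly the bytes written."""
    return sum(length for _, length in writes)

# Ten 64-byte writes, each landing in a different block.
writes = [(i * BLOCK + 100, 64) for i in range(10)]
print(block_granularity_bytes(writes))  # 40960 bytes shipped
print(op_granularity_bytes(writes))     # 640 bytes shipped
```

For small, scattered writes the block-granular policy amplifies coherence traffic by the ratio of block size to write size, which is why tracking updates at operation granularity pays off for workloads dominated by small IO.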
Active-Learning-as-a-Service: An Efficient MLOps System for Data-Centric AI
The success of today's AI applications requires not only model training
(Model-centric) but also data engineering (Data-centric). In data-centric AI,
active learning (AL) plays a vital role, but current AL tools cannot perform
AL tasks efficiently. To this end, this paper presents an efficient MLOps
system for AL, named ALaaS (Active-Learning-as-a-Service). Specifically, ALaaS
adopts a server-client architecture to support an AL pipeline and implements
stage-level parallelism for high efficiency. Meanwhile, caching and batching
techniques are employed to further accelerate the AL process. In addition to
efficiency, ALaaS ensures accessibility with the help of the design philosophy
of configuration-as-a-service. It also abstracts an AL process to several
components and provides rich APIs for advanced users to extend the system to
new scenarios. Extensive experiments show that ALaaS outperforms all other
baselines in terms of latency and throughput. Further ablation studies
demonstrate the effectiveness of our design as well as ALaaS's ease to use. Our
code is available at \url{https://github.com/MLSysOps/alaas}.
Comment: 8 pages, 7 figures
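The caching and batching ideas the abstract mentions can be sketched in a few lines. This is a minimal model of an AL scoring loop, not ALaaS's API: the model call, cache, and selection function below are all hypothetical stand-ins.

```python
# Sketch: batching + caching in an active-learning scoring loop.
# Illustrative only; names do not correspond to ALaaS's actual interface.
cache = {}

def fake_uncertainty(batch):
    # Stand-in for a batched call to a model server; returns one
    # uncertainty score per sample (here: shorter string = more uncertain).
    return [1.0 / (1 + len(s)) for s in batch]

def score(samples, batch_size=4):
    # Only uncached samples are sent to the model, in fixed-size batches,
    # amortizing per-request overhead across the batch.
    todo = [s for s in samples if s not in cache]
    for i in range(0, len(todo), batch_size):
        chunk = todo[i:i + batch_size]
        for s, u in zip(chunk, fake_uncertainty(chunk)):
            cache[s] = u  # later AL rounds skip recomputation
    return {s: cache[s] for s in samples}

def select_most_uncertain(samples, k):
    # Pick the k samples the model is least sure about for labeling.
    scores = score(samples)
    return sorted(samples, key=lambda s: -scores[s])[:k]

pool = ["a", "bb", "ccc", "dddd", "eeeee"]
print(select_most_uncertain(pool, 2))  # ['a', 'bb']
```

A second call over an overlapping pool would hit the cache for already-scored samples, which is the effect the abstract credits for accelerating repeated AL rounds.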
DynaComm: Accelerating Distributed CNN Training between Edges and Clouds through Dynamic Communication Scheduling
To reduce uploading bandwidth and address privacy concerns, deep learning at
the network edge has been an emerging topic. Typically, edge devices
collaboratively train a shared model using real-time generated data through the
Parameter Server framework. Although all the edge devices can share the
computing workloads, the distributed training processes over edge networks are
still time-consuming due to the parameters and gradients transmission
procedures between parameter servers and edge devices. Focusing on accelerating
distributed Convolutional Neural Networks (CNNs) training at the network edge,
we present DynaComm, a novel scheduler that dynamically decomposes each
transmission procedure into several segments to achieve optimal layer-wise
communications and computations overlapping during run-time. Through
experiments, we verify that DynaComm achieves optimal layer-wise scheduling in
all cases compared to competing strategies, while model accuracy remains
unaffected.
Comment: 16 pages, 12 figures. IEEE Journal on Selected Areas in Communications
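The benefit of decomposing a transmission into segments, as the abstract describes, can be seen in a simple two-stage pipeline model. This is a sketch of the general overlap idea, not DynaComm's scheduler; the function name and timing model are assumptions.

```python
# Sketch: overlapping a layer's parameter transmission with its computation
# by splitting the transfer into k segments. Illustrative timing model only.
def pipelined_time(T, C, k):
    """Total time when transmission (T) and computation (C) for one layer
    are split into k equal segments and pipelined: computation on a segment
    starts as soon as that segment has arrived."""
    t, c = T / k, C / k
    comm_done = 0.0
    comp_done = 0.0
    for _ in range(k):
        comm_done += t                              # next segment arrives
        comp_done = max(comp_done, comm_done) + c   # compute after arrival
    return comp_done

print(pipelined_time(8.0, 8.0, 1))  # 16.0 — k=1: fully serial, no overlap
print(pipelined_time(8.0, 8.0, 4))  # 10.0 — segments overlap comm with comp
```

With a single segment the phases serialize (T + C); with more segments only the first segment's transfer sits on the critical path, which is the overlap a dynamic scheduler tries to maximize per layer at run time.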