243 research outputs found
FOS: A Modular FPGA Operating System for Dynamic Workloads
With FPGAs now being deployed in the cloud and at the edge, there is a need
for scalable design methods which can incorporate the heterogeneity present in
the hardware and software components of FPGA systems. Moreover, these FPGA
systems need to be maintainable and adaptable to changing workloads while
improving accessibility for the application developers. However, current FPGA
systems fail to achieve modularity and support for multi-tenancy due to
dependencies between system components and lack of standardised abstraction
layers. To solve this, we introduce a modular FPGA operating system -- FOS,
which adopts a modular FPGA development flow to allow each system component to
be changed and be agnostic to the heterogeneity of EDA tool versions, hardware
and software layers. Further, to dynamically maximise the utilisation
transparently from the users, FOS employs resource-elastic scheduling to
arbitrate the FPGA resources in both time and spatial domain for any type of
accelerators. Our evaluation on different FPGA boards shows that FOS can
provide performance improvements in both single-tenant and multi-tenant
environments while substantially reducing the development time and, at the same
time, improving flexibility
No DNN Left Behind: Improving Inference in the Cloud with Multi-Tenancy
With the rise of machine learning, inference on deep neural networks (DNNs) has become a core building block on the critical path for many cloud applications. Applications today rely on isolated ad-hoc deployments that force users to compromise on consistent latency, elasticity, or cost-efficiency, depending on workload characteristics. We propose to elevate DNN inference to be a first class cloud primitive provided by a shared multi-tenant system, akin to cloud storage, and cloud databases. A shared system enables cost-efficient operation with consistent performance across the full spectrum of workloads. We argue that DNN inference is an ideal candidate for a multi-tenant system because of its narrow and well-defined interface and predictable resource requirements
I/O Schedulers for Proportionality and Stability on Flash-Based SSDs in Multi-Tenant Environments
The use of flash based Solid State Drives (SSDs) has expanded rapidly into the cloud computing environment. In cloud computing, ensuring the service level objective (SLO) of each server is the major criterion in designing a system. In particular, eliminating performance interference among virtual machines (VMs) on shared storage is a key challenge. However, studies on SSD performance to guarantee SLO in such environments are limited. In this paper, we present analysis of I/O behavior for a shared SSD as storage in terms of proportionality and stability. We show that performance SLOs of SSD based storage systems being shared by VMs or tasks are not satisfactory. We present and analyze the reasons behind the unexpected behavior through examining the components of SSDs such as channels, DRAM buffer, and Native Command Queuing (NCQ). We introduce two novel SSD-aware host level I/O schedulers on Linux, called A & x002B;CFQ and H & x002B;BFQ, based on our analysis and findings. Through experiments on Linux, we analyze I/O proportionality and stability in multi-tenant environments. In addition, through experiments using real workloads, we analyze the performance interference between workloads on a shared SSD. We then show that the proposed I/O schedulers almost eliminate the interference effect seen in CFQ and BFQ, while still providing I/O proportionality and stability for various I/O weighted scenarios
OSMOSIS: Enabling Multi-Tenancy in Datacenter SmartNICs
Multi-tenancy is essential for unleashing SmartNIC's potential in
datacenters. Our systematic analysis in this work shows that existing on-path
SmartNICs have resource multiplexing limitations. For example, existing
solutions lack multi-tenancy capabilities such as performance isolation and QoS
provisioning for compute and IO resources. Compared to standard NIC data paths
with a well-defined set of offloaded functions, unpredictable execution times
of SmartNIC kernels make conventional approaches for multi-tenancy and QoS
insufficient. We fill this gap with OSMOSIS, a SmartNICs resource manager
co-design. OSMOSIS extends existing OS mechanisms to enable dynamic hardware
resource multiplexing on top of the on-path packet processing data plane. We
implement OSMOSIS within an open-source RISC-V-based 400Gbit/s SmartNIC. Our
performance results demonstrate that OSMOSIS fully supports multi-tenancy and
enables broader adoption of SmartNICs in datacenters with low overhead.Comment: 12 pages, 14 figures, 103 reference
Xar-Trek: Run-Time Execution Migration among FPGAs and Heterogeneous-ISA CPUs
Datacenter servers are increasingly heterogeneous: from x86 host CPUs, to ARM
or RISC-V CPUs in NICs/SSDs, to FPGAs. Previous works have demonstrated that
migrating application execution at run-time across heterogeneous-ISA CPUs can
yield significant performance and energy gains, with relatively little
programmer effort. However, FPGAs have often been overlooked in that context:
hardware acceleration using FPGAs involves statically implementing select
application functions, which prohibits dynamic and transparent migration. We
present Xar-Trek, a new compiler and run-time software framework that overcomes
this limitation. Xar-Trek compiles an application for several CPU ISAs and
select application functions for acceleration on an FPGA, allowing execution
migration between heterogeneous-ISA CPUs and FPGAs at run-time. Xar-Trek's
run-time monitors server workloads and migrates application functions to an
FPGA or to heterogeneous-ISA CPUs based on a scheduling policy. We develop a
heuristic policy that uses application workload profiles to make scheduling
decisions. Our evaluations conducted on a system with x86-64 server CPUs, ARM64
server CPUs, and an Alveo accelerator card reveal 88%-1% performance gains over
no-migration baselines
- …