54 research outputs found
Roaming Edge vNFs using Glasgow Network Functions
While the network edge is becoming more important for the provision of customized services in next generation mobile networks, current NFV architectures are unsuitable to meet the increasing future demand. They rely on commodity servers with resource-hungry Virtual Machines that are unable to provide the high network function density and mobility requirements necessary for upcoming wide-area and 5G networks.
In this demo, we showcase Glasgow Network Functions (GNF), a virtualization framework suitable for next-generation mobile networks that exploits lightweight network functions (NFs) that are deployed at the edge and transparently follow users' devices as they roam between cells.
RackBlox: A Software-Defined Rack-Scale Storage System with Network-Storage Co-Design
Software-defined networking (SDN) and software-defined flash (SDF) have been
serving as the backbone of modern data centers. They are managed separately to
handle I/O requests. At first glance, this is a reasonable design by following
the rack-scale hierarchical design principles. However, it suffers from
suboptimal end-to-end performance, due to the lack of coordination between SDN
and SDF.
In this paper, we co-design the SDN and SDF stack by redefining the functions
of their control plane and data plane, and splitting them up within a new
architecture named RackBlox. RackBlox decouples the storage management
functions of flash-based solid-state drives (SSDs), and allows the SDN to track
and manage the states of SSDs in a rack. This enables state sharing
between SDN and SDF, and facilitates global storage resource management.
RackBlox has three major components: (1) coordinated I/O scheduling, in which
it dynamically adjusts the I/O scheduling in the storage stack with the
measured and predicted network latency, such that it can coordinate the effort
of I/O scheduling across the network and storage stack for achieving
predictable end-to-end performance; (2) coordinated garbage collection (GC), in
which it will coordinate the GC activities across the SSDs in a rack to
minimize their impact on incoming I/O requests; (3) rack-scale wear leveling,
in which it enables global wear leveling among SSDs in a rack by periodically
swapping data, for achieving improved device lifetime for the entire rack. We
implement RackBlox using programmable SSDs and a programmable switch. Our experiments
demonstrate that RackBlox can reduce the tail latency of I/O requests by up to
5.8x over state-of-the-art rack-scale storage systems.
Comment: 14 pages. Published in ACM SIGOPS 29th Symposium on
Operating Systems Principles (SOSP'23)
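The coordinated I/O scheduling idea in component (1) can be illustrated with a minimal sketch. The abstract does not describe RackBlox's actual policy, so everything here is an assumption: the `schedule` function, the slack formula, and the microsecond fields are hypothetical. The sketch orders pending requests by how much of an end-to-end latency target remains after accounting for measured inbound and predicted return network latency.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class PendingRequest:
    # Remaining time budget for the storage stack; smaller means more urgent.
    slack_us: float
    req_id: str = field(compare=False)


def schedule(requests, target_us):
    """Return request ids ordered from least to most slack.

    requests: iterable of (req_id, measured_inbound_us, predicted_return_us).
    target_us: hypothetical end-to-end latency target in microseconds.
    """
    heap = [
        PendingRequest(target_us - inbound_us - return_us, req_id)
        for req_id, inbound_us, return_us in requests
    ]
    heapq.heapify(heap)
    order = []
    while heap:
        order.append(heapq.heappop(heap).req_id)
    return order


if __name__ == "__main__":
    pending = [("a", 10.0, 10.0), ("b", 50.0, 30.0), ("c", 5.0, 5.0)]
    print(schedule(pending, target_us=100.0))
```

A request that already spent most of its budget in the network is served first, which is one plausible way a storage scheduler could use network state to make end-to-end latency more predictable.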
A one-pass clustering based sketch method for network monitoring
Network monitoring solutions need to cope with increasing network traffic volumes; as a result, sketch-based monitoring methods have been extensively studied to trade accuracy for memory scalability and storage reduction. However, sketches are sensitive to skewness in network flow distributions due to hash collisions, and need complicated performance optimization to adapt to line-rate packet streams. We present Jellyfish, an efficient sketch method that performs one-pass clustering over the network stream. One-pass clustering is realized by adapting the monitoring granularity from the whole network flow to fragments called subflows, which not only reduces the ingestion rate but also provides an efficient intermediate representation for the input to the sketch. Jellyfish provides a network-flow level query interface by merging subflow records from the same network flow to reconstruct the network-flow level counters. We provide probabilistic analysis of the expected accuracy of both existing sketch methods and Jellyfish. Real-world trace-driven experiments show that Jellyfish reduces the average estimation errors by up to six orders of magnitude for per-flow queries, by six orders of magnitude for entropy queries, and up to ten times for heavy-hitter queries.
This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 61972409; in part by Hong Kong Research Grants Council (RGC) under Grant TRS T41-603/20-R, Grant GRF-16213621, and Grant ITF ACCESS; in part by the Spanish I+D+i project TRAINER-A, funded by MCIN/AEI/10.13039/501100011033, under Grant PID2020-118011GB-C21; and in part by the Catalan Institution for Research and Advanced Studies (ICREA Academia).
Peer Reviewed. Postprint (author's final draft)
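To make the remark about hash collisions concrete, here is a minimal sketch of a classic Count-Min sketch. This is not Jellyfish and not the authors' code; the class name, the dimensions, and the use of salted BLAKE2b as the per-row hash are illustrative assumptions. Because several keys can share a counter, every estimate is an upper bound, and a small flow that collides with a large one in every row is overestimated.

```python
import hashlib


class CountMinSketch:
    """Classic Count-Min sketch: depth rows of width counters each."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One independent-looking hash per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key):
        # The minimum over rows is the estimate least inflated by collisions.
        return min(
            self.rows[row][self._index(row, key)] for row in range(self.depth)
        )


if __name__ == "__main__":
    sketch = CountMinSketch(width=64, depth=4)
    sketch.add("elephant-flow", 100_000)
    for i in range(1_000):
        sketch.add(f"mouse-flow-{i}")
    print(sketch.estimate("elephant-flow"), sketch.estimate("mouse-flow-0"))
```

With a skewed workload like the one in the usage lines, the small flows tend to be the ones whose estimates are inflated, since the error they inherit from colliding keys is large relative to their true size.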
A Comprehensive Study on Off-path SmartNIC
SmartNIC has recently emerged as an attractive device to accelerate
distributed systems. However, there has been no comprehensive characterization
of SmartNIC especially on the network part. This paper presents the first
comprehensive study of off-path SmartNIC. Our experimental study uncovers the
key performance characteristics of the communication among the client, SmartNIC
SoC, and the host. We find that, without considering the SmartNIC hardware architecture,
communications with it can cause up to 48% bandwidth degradation due to
performance anomalies. We also propose implications to address the anomalies.
Comment: This is the short version. Full version will appear at OSDI2
Towards Scalable Network Traffic Measurement With Sketches
Driven by the ever-increasing data volume through the Internet, the per-port speed of network devices has reached 400 Gbps, and high-end switches are capable of processing 25.6 Tbps of network traffic. To improve the efficiency and security of the network, network traffic measurement has become more important than ever. For fast and accurate traffic measurement, managing an accurate working set of active flows (WSAF) at line rates is a key challenge. WSAF is usually located in high-speed but expensive memories, such as TCAM or SRAM, and thus its capacity is quite limited. To scale up the per-flow measurement, we pursue three thrusts. In the first thrust, we propose to use In-DRAM WSAF and put a compact data structure (i.e., sketch) called FlowRegulator before WSAF to compensate for DRAM's slow access time. Per our results, FlowRegulator can substantially reduce massive influxes to WSAF without compromising measurement accuracy. In the second thrust, we integrate our sketch into a network system and propose an SDN-based WLAN monitoring and management framework called RFlow+, which can overcome the limitations of existing traffic measurement solutions (e.g., OpenFlow and sFlow), such as a limited view, incomplete flow statistics, and poor trade-off between measurement accuracy and CPU/network overheads. In the third thrust, we introduce a novel sampling scheme to deal with the poor trade-off that is provided by the standard simple random sampling (SRS). Even though SRS has been widely used in practice because of its simplicity, it provides non-uniform sampling rates for different flows, because it samples packets over an aggregated data flow. Starting with a simple idea that independent per-flow packet sampling provides the most accurate estimation of each flow, we introduce a new concept of per-flow systematic sampling, aiming to provide the same sampling rate across all flows.
In addition, we provide a concrete sampling method called SketchFlow, which approximates the idea of the per-flow systematic sampling using a sketch saturation event.
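The idea of per-flow systematic sampling can be sketched in its idealized form, with an exact packet counter per flow. This is only an illustration under that assumption: SketchFlow, as described above, approximates the idea with a sketch rather than exact counters, and the function name and packet representation here are hypothetical.

```python
from collections import defaultdict


def per_flow_systematic_sample(packets, interval):
    """Keep every interval-th packet of each flow.

    packets: iterable of (flow_id, payload) tuples in arrival order.
    interval: sampling interval k, giving every flow a 1/k sampling rate.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    seen = defaultdict(int)  # exact per-flow packet counters (idealized)
    sampled = []
    for flow_id, payload in packets:
        seen[flow_id] += 1
        if seen[flow_id] % interval == 0:
            sampled.append((flow_id, payload))
    return sampled


if __name__ == "__main__":
    # Flow "A" sends 10 packets and flow "B" sends 4, interleaved at the start.
    trace = [("A", i) for i in range(10)]
    trace[1:1] = [("B", i) for i in range(4)]
    picked = per_flow_systematic_sample(trace, interval=2)
    print(picked)
```

Multiplying the number of sampled packets of a flow by the interval gives a size estimate whose error is bounded by the interval for every flow, regardless of how the flows are interleaved in the aggregate stream.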
RF-Transformer: A Unified Backscatter Radio Hardware Abstraction
This paper presents RF-Transformer, a unified backscatter radio hardware
abstraction that allows a low-power IoT device to directly communicate with
heterogeneous wireless receivers at the minimum power consumption. Unlike
existing backscatter systems that are tailored to a specific wireless
communication protocol, RF-Transformer provides a programmable interface to the
micro-controller, allowing IoT devices to synthesize different types of
protocol-compliant backscatter signals sharing radically different PHY-layer
designs. To show the efficacy of our design, we implement a PCB prototype of
RF-Transformer on the 2.4 GHz ISM band and showcase its capability of generating
standard ZigBee, Bluetooth, LoRa, and Wi-Fi 802.11b/g/n/ac packets. Our
extensive field studies show that RF-Transformer achieves 23.8 Mbps, 247.1
Kbps, 986.5 Kbps, and 27.3 Kbps throughput when generating standard Wi-Fi,
ZigBee, Bluetooth, and LoRa signals while consuming 7.6x-74.2x less power than
their active counterparts. Our ASIC simulation based on the 65-nm CMOS process
shows that the power gain of RF-Transformer can further grow to 92x-678x. We
further integrate RF-Transformer with pressure sensors and present a case study
on detecting foot traffic density in hallways. Our 7-day case study
demonstrates that RF-Transformer can reliably transmit sensor data to a commodity
gateway by synthesizing LoRa packets on top of Wi-Fi signals. Our experimental
results also verify the compatibility of RF-Transformer with commodity
receivers. Code and hardware schematics can be found at:
https://github.com/LeFsCC/RF-Transformer
FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short
We introduce FatPaths: a simple, generic, and robust routing architecture
that enables state-of-the-art low-diameter topologies such as Slim Fly to
achieve unprecedented performance. FatPaths targets Ethernet stacks in both HPC
supercomputers and cloud data centers and clusters. FatPaths exposes and
exploits the rich ("fat") diversity of both minimal and non-minimal paths for
high-performance multi-pathing. Moreover, FatPaths uses a redesigned "purified"
transport layer that removes virtually all TCP performance issues (e.g., the
slow start), and incorporates flowlet switching, a technique used to prevent
packet reordering in TCP networks, to enable very simple and effective load
balancing. Our design enables recent low-diameter topologies to outperform
powerful Clos designs, achieving 15% higher net throughput at 2x lower latency
for comparable cost. FatPaths will significantly accelerate Ethernet clusters
that form more than 50% of the Top500 list, and it may become a standard routing
scheme for modern topologies.
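Flowlet switching, mentioned above, can be illustrated with a minimal sketch. This is a generic illustration, not FatPaths' implementation: the class name, the gap threshold, and the random path choice are assumptions. A flow keeps its current path while packets arrive close together, and may be moved to another path only after an idle gap long enough that earlier packets are unlikely to be overtaken.

```python
import random


class FlowletSwitch:
    """Assign packets to paths, re-picking a path only after an idle gap."""

    def __init__(self, paths, gap_s, rng=None):
        self.paths = list(paths)
        self.gap_s = gap_s  # idle time that starts a new flowlet (assumed value)
        self.rng = rng if rng is not None else random.Random(0)
        self.state = {}  # flow_id -> (time of last packet, current path)

    def route(self, flow_id, now_s):
        entry = self.state.get(flow_id)
        if entry is None or now_s - entry[0] > self.gap_s:
            # New flow or new flowlet: free to pick any path.
            path = self.rng.choice(self.paths)
        else:
            # Same flowlet: stay on the current path to avoid reordering.
            path = entry[1]
        self.state[flow_id] = (now_s, path)
        return path


if __name__ == "__main__":
    switch = FlowletSwitch(paths=["p0", "p1", "p2"], gap_s=0.005)
    for t in (0.000, 0.001, 0.002, 0.020, 0.021):
        print(t, switch.route("flow-1", t))
```

In the usage lines, the first three packets belong to one flowlet and share a path, while the packet at 0.020 s follows a gap longer than the threshold and may be placed on a different path.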