4,553 research outputs found
P4CEP: Towards In-Network Complex Event Processing
In-network computing using programmable networking hardware is a strong trend
in networking that promises to reduce latency and consumption of server
resources through offloading to network elements (programmable switches and
smart NICs). In particular, the data plane programming language P4 together
with powerful P4 networking hardware has spawned projects offloading services
into the network, e.g., consensus services or caching services. In this paper,
we present a novel case for in-network computing, namely, Complex Event
Processing (CEP). CEP processes streams of basic events, e.g., stemming from
networked sensors, into meaningful complex events. Traditionally, CEP
processing has been performed on servers or overlay networks. However, we argue
in this paper that CEP is a good candidate for in-network computing along the
communication path avoiding detouring streams to distant servers to minimize
communication latency while also exploiting processing capabilities of novel
networking hardware. We show that it is feasible to express CEP operations in
P4 and also present a tool to compile CEP operations, formulated in our P4CEP
rule specification language, to P4 code. Moreover, we identify challenges and
problems that we have encountered to show future research directions for
implementing full-fledged in-network CEP systems.Comment: 6 pages. Author's versio
Design and performance evaluation of switching architectures for high-speed Internet
The motivation for this thesis is the desire to build faster and scalable routers that efficiently handle the exponential traffic growth in the Internet. The Internet forwards information through a mesh of routers and switches, which has to keep up with the increasing demands of traffic. Shared-memory based switches are known to provide the best throughput-delay performance for a given memory size. In this thesis performance of commonly used memory-sharing schemes for the shared memory switches are evaluated under balanced and unbalanced bursty traffic. The scalability of shared-memory switches has been a research issue for quite sometime. One approach is to employ multiple memory modules and use them in parallel to enhance the capacity. The two well-known architectures in this category are (i) shared-multibuffer (SMB) switch architecture invented by Yamanaka et al. of Mitsubishi Electric Corporation, Japan; and (ii) the sliding-window (SW) switch architecture invented by Dr. Kumar of UTPA, Texas, USA. In this thesis, performance of these two architectures are evaluated and compared. Furthermore, in this thesis, the SW switch architecture is extended to enable priority switching to provide differentiated Quality of Service (QoS) for different traffic classes
Design and evaluation of high-performance packet switching schemes
The design of high-performance packet switches is essential to efficiently handle the exponential growth of data traffic in the next generation Internet. Shared-memory-based packet switches are known to provide the best possible delay-throughput performance and the lowest packet-loss rate compared with packet switches using other buffering strategies. However, scalability of shared-memory-based switching systems has been restricted by high memory bandwidth requirements, segregation of memory space and centralized control of switching functions that causes the switch performance to degrade as a shared-memory switch is grown in size. The new class of sliding-window based packet switches are known to overcome these problems associated with shared-memory switches. This thesis presents different schemes proposed earlier by Dr. Kumar for use in the sliding-window switch to allocate self-routing parameters. Comparative performance of these schemes have been evaluated in this thesis. The results show the scalability of the switch that can be achieved with different parameter assignment schemes. It is shown that not all assignment schemes have same performance. With appropriate assignment scheme, it is possible to achieve very high throughput-performance and switch size for sliding-window switches
Recommended from our members
Security analysis of the micro transport protocol with a misbehaving receiver
BitTorrent is the most widely used Peer-to-Peer (P2P) protocol and it comprises the largest share of traffic in Europe. To make BitTorrent more Internet Service Provider (ISP) friendly, BitTorrent Inc. invented the Micro Transport Protocol (uTP). It is based on UDP with a novel congestion control called Low Extra Delay Background Transport (LEDBAT). This protocol assumes that the receiver always gives correct feedback, since otherwise this deteriorates throughput or yields to corrupted data. We show through experimental investigation that a misbehaving uTP receiver, which is not interested in data integrity, can increase the bandwidth of the sender by up to five times. This can cause a congestion collapse and steal large share of a victim’s bandwidth. We present three attacks, which increase the bandwidth usage significantly. We have tested these attacks in a real world environment and show its severity both in terms of number of packets and total traffic generated. We also present a countermeasure for protecting against the attacks and evaluate the performance of that defence strategy
GPU peer-to-peer techniques applied to a cluster interconnect
Modern GPUs support special protocols to exchange data directly across the
PCI Express bus. While these protocols could be used to reduce GPU data
transmission times, basically by avoiding staging to host memory, they require
specific hardware features which are not available on current generation
network adapters. In this paper we describe the architectural modifications
required to implement peer-to-peer access to NVIDIA Fermi- and Kepler-class
GPUs on an FPGA-based cluster interconnect. Besides, the current software
implementation, which integrates this feature by minimally extending the RDMA
programming model, is discussed, as well as some issues raised while employing
it in a higher level API like MPI. Finally, the current limits of the technique
are studied by analyzing the performance improvements on low-level benchmarks
and on two GPU-accelerated applications, showing when and how they seem to
benefit from the GPU peer-to-peer method.Comment: paper accepted to CASS 201
- …