179 research outputs found
High-speed, in-band performance measurement instrumentation for next generation IP networks
Facilitating always-on instrumentation of Internet traffic for the purposes of performance measurement is crucial in order to enable accountability of resource usage and automated network control, management and optimisation. This has proven infeasible to date due to the lack of native measurement mechanisms that can form an integral part of the network's main forwarding operation. However, the Internet Protocol version 6 (IPv6) specification enables the efficient encoding and processing of optional per-packet information as a native part of the network layer, and this constitutes a strong reason for IPv6 to be adopted as the ubiquitous next-generation Internet transport.
In this paper we present a very high-speed hardware implementation of in-line measurement, a truly native traffic instrumentation mechanism for the next-generation Internet, which facilitates performance measurement of the actual data-carrying traffic at small timescales between two points in the network. This system is designed to operate as part of the routers' fast path and to incur minimal impact on network operation, even while instrumenting traffic between the edges of very high-capacity links. Our results show that the implementation can be easily accommodated by current FPGA technology, and real Internet traffic traces verify that the overhead incurred by instrumenting every packet over a 10 Gb/s operational backbone link carrying a typical workload is indeed negligible.
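To make the in-line idea concrete, the sketch below shows how a per-packet departure timestamp could be carried in an IPv6 Destination Options extension header. The option type (0x1E) and the TLV layout are illustrative assumptions, not the paper's actual encoding; only the RFC 8200 extension-header framing is standard.

```python
import struct

NEXT_HEADER_TCP = 6        # next header after this extension header: TCP
OPT_TYPE_TIMESTAMP = 0x1E  # hypothetical experimental option type (assumption)

def build_dest_options(ts_sec: int, ts_usec: int) -> bytes:
    """Build an IPv6 Destination Options header (RFC 8200 framing)
    carrying a seconds/microseconds departure timestamp."""
    # Option TLV: type, data length (8), 32-bit seconds, 32-bit microseconds
    tlv = struct.pack("!BBII", OPT_TYPE_TIMESTAMP, 8, ts_sec, ts_usec)
    # PadN option (type 1) with 2 zero bytes aligns the header to 8 octets
    padn = struct.pack("!BB", 1, 2) + b"\x00\x00"
    options = tlv + padn                       # 14 bytes of options
    hdr_ext_len = (2 + len(options)) // 8 - 1  # length in 8-octet units, minus 1
    return struct.pack("!BB", NEXT_HEADER_TCP, hdr_ext_len) + options
```

Because the option lives in the network-layer header, a hardware fast path can insert or read it without touching the transport payload, which is what makes the approach viable at line rate.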
Optimality of a Network Monitoring Agent and Validation in a Real Probe
The evolution of commodity hardware makes it possible to use this type of equipment to implement traffic monitoring systems. A preliminary empirical evaluation of a Linux-based network traffic probe indicates that system performance suffers significant losses as the network rate increases. To assess this issue, we consider a model with two tandem queues and a moving server. In this system, we formulate a three-dimensional Markov Decision Process in continuous time. The goal of the proposed model is to determine the position of the server in each time slot so as to optimize system performance, measured in terms of throughput. We first formulate an equivalent discrete-time Markov Decision Process and propose a numerical method to characterize the solution of our problem in a general setting. The solution has been tested over a wide range of scenarios and, in all instances, we observe that the optimal policy is close to a threshold-type policy. We also consider a real probe and validate the good performance of threshold policies in real applications. This research was partially supported by the Department of Education of the Basque Government, Spain, through the Consolidated Research Groups NQaS (IT1635-22) and MATHMODE (IT1456-22), by the Marie Sklodowska-Curie, Spain, grant agreement No 777778, by the Spanish Ministry of Science and Innovation, Spain, with reference PID2019-108111RB-I00 (FEDER/AEI), by grant PID2020-117876RB-I00 funded by MCIN/AEI (10.13039/501100011033) and by Grant KK-2021/00026 funded by the Basque Government.
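A threshold policy of the kind the abstract finds near-optimal can be sketched with a toy discrete-time simulation: two tandem queues, one server that must be positioned each slot, and a rule that attends the second queue only when its backlog reaches a threshold T. The arrival rate, threshold and service model here are illustrative assumptions, not the paper's calibrated MDP.

```python
import random

def threshold_policy(q1: int, q2: int, T: int) -> int:
    """Return the queue (1 or 2) the server should attend this slot."""
    return 2 if q2 >= T else 1

def simulate(arrival_p=0.4, T=3, slots=10_000, seed=0):
    """Estimate throughput (completions per slot) under the threshold policy."""
    rng = random.Random(seed)
    q1 = q2 = completed = 0
    for _ in range(slots):
        if rng.random() < arrival_p:   # Bernoulli arrival to the first queue
            q1 += 1
        pos = threshold_policy(q1, q2, T)
        if pos == 1 and q1 > 0:        # first-stage service feeds queue 2
            q1 -= 1; q2 += 1
        elif pos == 2 and q2 > 0:      # second-stage service completes a job
            q2 -= 1; completed += 1
    return completed / slots
```

Since every job needs one slot at each stage, the server can sustain at most 0.5 jobs per slot; with arrival rate 0.4 the system is stable and the estimated throughput approaches the arrival rate.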
Study of traffic capture and storage in multi-gigabit physical and virtual networks
Studying and analyzing a high-speed network (10 Gbps) is a challenge in terms of the amount of data to be processed and the data rate itself. As a result, network capture tools are usually very complex. These tools also have to be continuously adapted to new technologies and higher data rates. To meet those requirements, each capture tool implements its own formats and capture methods, which hinders interoperability. To solve this problem, it is necessary to develop a capture tool that stores and works with network data in a well-known format. Standard formats, like PCAP, allow different applications to work together easily, even in parallel. Likewise, common formats free network analysis tools from the underlying network.
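The PCAP container mentioned above is simple enough to emit directly: a fixed 24-byte global header followed by a 16-byte record header per captured frame. This minimal sketch follows the classic libpcap layout (little-endian, microsecond timestamps).

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4   # magic number for microsecond-resolution timestamps
LINKTYPE_ETHERNET = 1     # link-layer type: Ethernet

def pcap_global_header(snaplen: int = 65535) -> bytes:
    """24-byte libpcap global header: magic, version 2.4, thiszone,
    sigfigs, snaplen, link type."""
    return struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0,
                       snaplen, LINKTYPE_ETHERNET)

def pcap_record(ts_sec: int, ts_usec: int, frame: bytes) -> bytes:
    """16-byte record header (timestamp, captured length, original
    length) followed by the frame bytes."""
    return struct.pack("<IIII", ts_sec, ts_usec, len(frame),
                       len(frame)) + frame
```

Writing this format is what lets other tools, or several tools in parallel, consume the capture regardless of which technology produced it.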
Typically, expensive dedicated servers are used to capture, store and process network data at high speed. However, this is changing due to the proliferation of cloud computing and the greatly improved performance of virtualization technology. This trend makes it difficult to find bare-metal servers, or even network equipment, in some environments. Therefore, it is becoming increasingly important to evaluate the performance and feasibility of capturing and processing network data in virtual environments. To this end, a capture and storage tool has been developed. The tool can work at 10 Gbps thanks to Intel DPDK capture technology, which works in both bare-metal and virtual environments. In this work, different capture methods and tools are compared, as are the different virtualization methods provided by KVM. Although running applications in virtual machines incurs a small overhead compared with the bare-metal version, results show that performance in a virtual environment is very close to that of a bare-metal environment. However, those results can only be achieved with the correct configuration and by exploiting the capabilities of state-of-the-art hardware devices.
Harnessing low-level tuning in modern architectures for high-performance network monitoring in physical and virtual platforms
Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Tecnología Electrónica y de las Comunicaciones. Defense date: 02-07-201
Multi-granular, multi-purpose and multi-Gb/s monitoring on off-the-shelf systems
This is the accepted version of the following article: Moreno, V., Santiago del Río, P. M., Ramos, J., Muelas, D., García-Dorado, J. L., Gomez-Arribas, F. J. and Aracil, J. (2014), Multi-granular, multi-purpose and multi-Gb/s monitoring on off-the-shelf systems. Int. J. Network Mgmt., 24: 221-234. doi: 10.1002/nem.1861, which has been published in final form at http://onlinelibrary.wiley.com/doi/10.1002/nem.1861/abstract
As an attempt to make network managers' life easier, we present M3Omon, a system architecture that helps to develop monitoring applications and perform network diagnosis. M3Omon behaves as an intermediate layer between the traffic and monitoring applications that provides advanced features, high performance and low cost. Such advanced features leverage a multi-granular and multi-purpose approach to the monitoring problem. Multi-granular monitoring addresses tasks that use traffic aggregates to identify an event and require either flow records or packet data, or both, to understand it and, eventually, take the appropriate countermeasures. M3Omon provides a simple API to access traffic simultaneously at several different granularities, i.e., packet-level, flow-level and aggregate statistics. The multi-purpose design of M3Omon allows not only performing tasks in parallel that target different traffic-related purposes (e.g., traffic classification and intrusion detection) but also sharing granularities between applications, e.g., several concurrent applications fed from flow records provided by M3Omon. Finally, the low cost is brought by off-the-shelf systems (the combination of open-source software and commodity hardware), and the high performance is achieved thanks to modifications in the standard NIC driver, low-level hardware interaction, efficient memory management and programming optimization.
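The three granularities the abstract describes can be illustrated with a single pass over packet metadata. This is a hedged sketch of the concept only, not M3Omon's actual API; packets are assumed to arrive as (timestamp, src, dst, sport, dport, proto, length) tuples.

```python
from collections import defaultdict

def summarize(packets):
    """Derive flow-level records and aggregate statistics from a
    packet-level stream in one pass."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    aggregate = {"packets": 0, "bytes": 0}
    for ts, src, dst, sport, dport, proto, length in packets:
        key = (src, dst, sport, dport, proto)   # classic 5-tuple flow key
        flows[key]["packets"] += 1
        flows[key]["bytes"] += length
        aggregate["packets"] += 1
        aggregate["bytes"] += length
    return dict(flows), aggregate
```

An anomaly detector could watch the aggregate counters, then drill down into the flow records, and finally into the raw packets, which is exactly the multi-granular workflow the paper motivates.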
Study on the Performance of TCP over 10Gbps High Speed Networks
Internet traffic is expected to grow phenomenally over the next five to ten years. To cope with such large traffic volumes, high-speed networks are expected to scale to capacities of terabits per second and beyond. Increasing the role of optics for packet forwarding and transmission inside high-speed networks seems to be the most promising way to accomplish this capacity scaling. Unfortunately, unlike electronic memory, it remains a formidable challenge to build integrated all-optical buffers of even a few dozen packets. On the other hand, many high-speed networks depend on the TCP/IP protocol for reliability, which is typically implemented in software and is sensitive to buffer size. For example, TCP requires a buffer of one bandwidth-delay product in switches/routers to maintain nearly 100% link utilization; otherwise, performance degrades severely. But such a large buffer challenges hardware design and power consumption, and generates queuing delay and jitter, which again cause problems. Therefore, improving TCP performance over tiny-buffered high-speed networks is a top priority. This dissertation studies TCP performance in 10 Gbps high-speed networks. First, a 10 Gbps reconfigurable optical networking testbed is developed as a research environment. Second, a 10 Gbps traffic sniffing tool is developed for measuring and analyzing TCP performance. New expressions for evaluating TCP loss synchronization are presented by carefully examining TCP congestion events. Based on these observations, two basic causes of performance problems are studied. We find that minimizing TCP loss synchronization and reducing the impact of flow burstiness are the critical keys to improving TCP performance in tiny-buffered networks. Finally, we present a new TCP protocol called Multi-Channel TCP and a new congestion control algorithm called Desynchronized Multi-Channel TCP (DMCTCP).
Our algorithm implementation takes advantage of the potential parallelism of Multi-Path TCP in Linux. Over an emulated 10 Gbps network ruled by routers with only a few dozen packets of buffers, our experimental results confirm that bottleneck link utilization is improved much more by DMCTCP than by many other TCP variants. Our study is a new step towards the deployment of optical packet switching/routing networks.
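The buffer-sizing rule the abstract refers to is easy to quantify: a single TCP flow needs roughly one bandwidth-delay product (BDP) of router buffering to keep the bottleneck link fully utilized.

```python
def bdp_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes: link rate times round-trip time,
    divided by 8 to convert bits to bytes."""
    return rate_bps * rtt_s / 8
```

For a 10 Gb/s link with a 100 ms round-trip time this works out to 125 MB, on the order of eighty thousand 1500-byte packets, which makes plain why a few dozen packets of all-optical buffering is so far from the classical rule and why desynchronizing flows matters.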
On the Exploration of FPGAs and High-Level Synthesis Capabilities on Multi-Gigabit-per-Second Networks
Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Tecnología Electrónica y de las Comunicaciones. Defense date: 24-01-2020
Traffic on computer networks has grown exponentially in recent years. Both links and communication equipment have had to adapt in order to provide the minimum quality of service required for current needs. However, in recent years, a few factors have prevented commercial off-the-shelf hardware from keeping pace with this growth rate; consequently, some software tools are struggling to fulfill their tasks, especially at speeds higher than 10 Gbit/s. For this reason, Field Programmable Gate Arrays (FPGAs) have arisen as an alternative to address the most demanding tasks without the need to design an application-specific integrated circuit, thanks in part to their flexibility and programmability in the field. Needless to say, developing for FPGAs is well known to be complex. Therefore, in this thesis we tackle the use of FPGAs and High-Level Synthesis (HLS) languages in the context of computer networks. We focus on the use of FPGAs both in computer network monitoring applications and in reliable data transmission at very high speed. At the same time, we intend to shed light on the use of high-level synthesis languages and boost FPGA applicability in the context of computer networks, so as to reduce development time and design complexity.
In the first part of the thesis, devoted to computer network monitoring, we take advantage of FPGA determinism to implement active monitoring probes, which consist of sending a train of packets that is later used to obtain network parameters. In this case, determinism is key to reducing the uncertainty of the measurements. The results of our experiments show that the FPGA implementations are much more accurate and precise than their software counterparts. At the same time, the FPGA implementation is scalable in terms of network speed: 1, 10 and 100 Gbit/s. In the context of passive monitoring, we leverage the FPGA architecture to implement algorithms able to thin ciphered traffic as well as to remove duplicate packets. These two algorithms are straightforward in principle but very useful in helping traditional network analysis tools cope with their task at higher network speeds: on the one hand, processing ciphered traffic brings little benefit; on the other hand, processing duplicate traffic negatively impacts the performance of software tools.
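The deduplication idea above can be sketched in software (the thesis implements it in FPGA logic; this analogue and its window size are assumptions for illustration): hash each packet and drop it if the same digest was seen within a sliding window of recent packets.

```python
import hashlib
from collections import deque

def deduplicate(packets, window=1024):
    """Drop packets whose exact bytes were already seen within the last
    `window` packets; return the thinned stream."""
    recent = deque(maxlen=window)  # digests of the most recent packets, in order
    seen = set()                   # same digests, for O(1) membership tests
    out = []
    for pkt in packets:
        d = hashlib.sha1(pkt).digest()
        if d in seen:
            continue               # duplicate within the window: drop it
        if len(recent) == recent.maxlen:
            seen.discard(recent[0])  # oldest digest is about to be evicted
        recent.append(d)
        seen.add(d)
        out.append(pkt)
    return out
```

The bounded window keeps state constant-sized, which is what makes the scheme amenable to a hardware pipeline.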
In the second part of the thesis, devoted to the TCP/IP stack, we explore the current limitations of reliable data transmission using standard software at very high speed. Nowadays, the network is becoming an important bottleneck for current needs, particularly in data centers. What is more, the deployment of 100 Gbit/s network links has started in recent years. Consequently, there has been increased scrutiny of how networking functionality is deployed, and a wide range of approaches are currently being explored to increase the efficiency of networks and tailor their functionality to the actual needs of the application at hand. FPGAs arise as the perfect alternative to deal with this problem. For this reason, in this thesis we develop Limago, an FPGA-based open-source implementation of a TCP/IP stack operating at 100 Gbit/s for Xilinx's FPGAs. Limago not only provides unprecedented throughput but also a latency at least fifteen times lower than that of software implementations. Limago is a key contribution to some of the hottest topics at the moment, for instance, network-attached FPGAs and in-network data processing.