6 research outputs found
A Companion Study Guide for the Cisco DCICN Data Center Certification Exam (200-150)
The official Cisco DCICN book and practice exams are great resources, but this is not an easy exam. This study guide is a companion to those resources and summarizes the subject areas into additional review questions with an answer description for each item. This book is not a braindump and it is not bootleg screenshots of the actual exam. Instead, this book provides additional context and examples, serves to complement other study guides, and provides additional examples. If you are getting ready to take the exam for the first time, I hope that this guide provides the extra help to pass! If you are up for re-certification, I hope that this guide serves as a refresher and reminder! Keep working hard, keep studying, and never stop learning…https://digitalcommons.odu.edu/distancelearning_books/1000/thumbnail.jp
Fairness in a data center
Existing data centers utilize several networking technologies in order to handle the performance requirements of different workloads. Maintaining diverse networking technologies increases complexity and is not cost effective. This results in the current trend to converge all traffic into a single networking fabric. Ethernet is both cost-effective and ubiquitous, and as such it has been chosen as the technology of choice for the converged fabric. However, traditional Ethernet does not satisfy the needs of all traffic workloads, for the most part, due to its lossy nature and, therefore, has to be enhanced to allow for full convergence. The resulting technology, Data Center Bridging (DCB), is a new set of standards defined by the IEEE to make Ethernet lossless even in the presence of congestion. As with any new networking technology, it is critical to analyze how the different protocols within DCB interact with each other as well as how each protocol interacts with existing technologies in other layers of the protocol stack.
This dissertation presents two novel schemes that address critical issues in DCB networks: fairness with respect to packet lengths and fairness with respect to flow control and bandwidth utilization. The Deficit Round Robin with Adaptive Weight Control (DRR-AWC) algorithm actively monitors the incoming streams and adjusts the scheduling weights of the outbound port. The algorithm was implemented on a real DCB switch and shown to increase fairness for traffic consisting of mixed-length packets. Targeted Priority-based Flow Control (TPFC) provides a hop-by-hop flow control mechanism that restricts the flow of aggressor streams while allowing victim streams to continue unimpeded. Two variants of the targeting mechanism within TPFC are presented and their performance evaluated through simulation
Non-minimal adaptive routing for efficient interconnection networks
RESUMEN: La red de interconexión es un concepto clave de los sistemas de computación paralelos. El primer aspecto que define una red de interconexión es su topología. Habitualmente, las redes escalables y eficientes en términos de coste y consumo energético tienen bajo diámetro y se basan en topologías que encaran el límite de Moore y en las que no hay diversidad de caminos mínimos. Una vez definida la topología, quedando implícitamente definidos los límites de rendimiento de la red, es necesario diseñar un algoritmo de enrutamiento que se acerque lo máximo posible a esos límites y debido a la ausencia de caminos mínimos, este además debe explotar los caminos no mínimos cuando el tráfico es adverso. Estos algoritmos de enrutamiento habitualmente seleccionan entre rutas mínimas y no mínimas en base a las condiciones de la red. Las rutas no mínimas habitualmente se basan en el algoritmo de balanceo de carga propuesto por Valiant, esto implica que doblan la longitud de las rutas mínimas y por lo tanto, la latencia soportada por los paquetes se incrementa. En cuanto a la tecnología, desde su introducción en entornos HPC a principios de los años 2000, Ethernet ha sido usado en un porcentaje representativo de los sistemas.
Esta tesis introduce una implementación realista y competitiva de una red escalable y sin pérdidas basada en dispositivos de red Ethernet commodity, considerando topologías de bajo diámetro y bajo consumo energético y logrando un ahorro energético de hasta un 54%. Además, propone un enrutamiento sobre la citada arquitectura, en adelante QCN-Switch, el cual selecciona entre rutas mínimas y no mínimas basado en notificaciones de congestión explícitas. Una vez implementada la decisión de enrutar siguiendo rutas no mínimas, se introduce un enrutamiento adaptativo en fuente capaz de adaptar el número de saltos en las rutas no mínimas. Este enrutamiento, en adelante ACOR, es agnóstico de la topología y mejora la latencia en hasta un 28%. Finalmente, se introduce un enrutamiento dependiente de la topología, en adelante LIAN, que optimiza el número de saltos de las rutas no mínimas basado en las condiciones de la red. Los resultados de su evaluación muestran que obtiene una latencia cuasi óptima y mejora el rendimiento de algoritmos de enrutamiento actuales reduciendo la latencia en hasta un 30% y obteniendo un rendimiento estable y equitativo.ABSTRACT: Interconnection network is a key concept of any parallel computing system. The first aspect to define an interconnection network is its topology. Typically, power and cost-efficient scalable networks with low diameter rely on topologies that approach the Moore bound in which there is no minimal path diversity. Once the topology is defined, the performance bounds of the network are determined consequently, so a suitable routing algorithm should be designed to accomplish as much as possible of those limits and, due to the lack of minimal path diversity, it must exploit non-minimal paths when the traffic pattern is adversarial. These routing algorithms usually select between minimal and non-minimal paths based on the network conditions, where the non-minimal paths are built according to Valiant load-balancing algorithm. This implies that these paths double the length of minimal ones and then the latency supported by packets increases. Regarding the technology, from its introduction in HPC systems in the early 2000s, Ethernet has been used in a significant fraction of the systems.
This dissertation introduces a realistic and competitive implementation of a scalable lossless Ethernet network for HPC environments considering low-diameter and low-power topologies. This allows for up to 54% power savings. Furthermore, it proposes a routing upon the cited architecture, hereon QCN-Switch, which selects between minimal and non-minimal paths per packet based on explicit congestion notifications instead of credits. Once the miss-routing decision is implemented, it introduces two mechanisms regarding the selection of the intermediate switch to develop a source adaptive routing algorithm capable of adapting the number of hops in the non-minimal paths. This routing, hereon ACOR, is topology-agnostic and improves average latency in all cases up to 28%. Finally, a topology-dependent routing, hereon LIAN, is introduced to optimize the number of hops in the non-minimal paths based on the network live conditions. Evaluations show that LIAN obtains almost-optimal latency and outperforms state-of-the-art adaptive routing algorithms, reducing latency by up to 30.0% and providing stable throughput and fairness.This work has been supported by the Spanish Ministry of Education, Culture and Sports
under grant FPU14/02253, the Spanish Ministry of Economy, Industry and Competitiveness
under contracts TIN2010-21291-C02-02, TIN2013-46957-C2-2-P, and TIN2013-46957-C2-2-P (AEI/FEDER, UE), the Spanish Research Agency under contract PID2019-105660RBC22/AEI/10.13039/501100011033, the European Union under agreements FP7-ICT-2011-
7-288777 (Mont-Blanc 1) and FP7-ICT-2013-10-610402 (Mont-Blanc 2), the University of
Cantabria under project PAR.30.P072.64004, and by the European HiPEAC Network of Excellence through an internship grant supported by the European Union’s Horizon 2020 research
and innovation program under grant agreement No. H2020-ICT-2015-687689
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and maximize performance, it
deems necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters and not to survey all
existing solutions (as it is virtually impossible due to massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks that connect geographically
dispersed datacenters which have been receiving increasing attention recently
and pose interesting and novel research problems.Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial
Management, Optimization and Evolution of the LHCb Online Network
The LHCb experiment is one of the four large particle detectors running at the
Large Hadron Collider (LHC) at CERN. It is a forward single-arm spectrometer dedicated to test the Standard Model through precision measurements of
Charge-Parity (CP) violation and rare decays in the b quark sector. The LHCb
experiment will operate at a luminosity of 2x10^32cm-2s-1, the proton-proton
bunch crossings rate will be approximately 10 MHz. To select the interesting
events, a two-level trigger scheme is applied: the rst level trigger (L0) and the
high level trigger (HLT). The L0 trigger is implemented in custom hardware,
while HLT is implemented in software runs on the CPUs of the Event Filter
Farm (EFF). The L0 trigger rate is dened at about 1 MHz, and the event size
for each event is about 35 kByte. It is a serious challenge to handle the resulting
data rate (35 GByte/s).
The Online system is a key part of the LHCb experiment, providing all the
IT services. It consists of three major components: the Data Acquisition (DAQ)
system, the Timing and Fast Control (TFC) system and the Experiment Control
System (ECS). To provide the services, two large dedicated networks based on
Gigabit Ethernet are deployed: one for DAQ and another one for ECS, which are
referred to Online network in general. A large network needs sophisticated monitoring for its successful operation. Commercial network management systems are
quite expensive and dicult to integrate into the LHCb ECS. A custom network
monitoring system has been implemented based on a Supervisory Control And
Data Acquisition (SCADA) system called PVSS which is used by LHCb ECS. It
is a homogeneous part of the LHCb ECS. In this thesis, it is demonstrated how
a large scale network can be monitored and managed using tools originally made
for industrial supervisory control.
The thesis is organized as the follows:
Chapter 1 gives a brief introduction to LHC and the B physics on LHC,
then describes all sub-detectors and the trigger and DAQ system of LHCb from
structure to performance.
Chapter 2 first introduces the LHCb Online system and the dataflow, then
focuses on the Online network design and its optimization.
In Chapter 3, the SCADA system PVSS is introduced briefly,
then the
architecture and implementation of the network monitoring system are described
in detail, including the front-end processes, the data communication and the
supervisory layer.
Chapter 4 first discusses the packet sampling theory and one of the packet
sampling mechanisms: sFlow, then demonstrates the applications of sFlow for
the network trouble-shooting, the traffic monitoring and the anomaly detection.
In Chapter 5, the upgrade of LHC and LHCb is introduced, the possible
architecture of DAQ is discussed, and two candidate internetworking technologies (high speed Ethernet and InfniBand) are compared in different aspects for
DAQ. Three schemes based on 10 Gigabit Ethernet are presented and studied.
Chapter 6 is a general summary of the thesis