
3 research outputs found

Non-minimal adaptive routing for efficient interconnection networks

Author: Benito Hoz Mariano
Publication date: 16/10/2020

ABSTRACT: The interconnection network is a key component of any parallel computing system. The first aspect that defines an interconnection network is its topology. Typically, scalable networks that are power- and cost-efficient have low diameter and rely on topologies that approach the Moore bound, in which there is no minimal path diversity. Once the topology is defined, the performance bounds of the network are implicitly determined, so a suitable routing algorithm should be designed to come as close as possible to those bounds. Because of the lack of minimal path diversity, the algorithm must also exploit non-minimal paths when the traffic pattern is adversarial. These routing algorithms usually select between minimal and non-minimal paths based on network conditions, and the non-minimal paths are built according to the Valiant load-balancing algorithm. As a result, these paths double the length of minimal ones and the latency experienced by packets increases. Regarding the technology, since its introduction in HPC systems in the early 2000s, Ethernet has been used in a significant fraction of systems. This dissertation introduces a realistic and competitive implementation of a scalable lossless network for HPC environments based on commodity Ethernet devices, considering low-diameter and low-power topologies, which allows for power savings of up to 54%. Furthermore, it proposes a routing mechanism on top of this architecture, hereafter QCN-Switch, which selects between minimal and non-minimal paths per packet based on explicit congestion notifications instead of credits. Once the misrouting decision is implemented, it introduces two mechanisms for selecting the intermediate switch, yielding a source-adaptive routing algorithm capable of adapting the number of hops in the non-minimal paths. This routing, hereafter ACOR, is topology-agnostic and improves average latency in all cases, by up to 28%. Finally, a topology-dependent routing, hereafter LIAN, is introduced to optimize the number of hops in the non-minimal paths based on live network conditions. Evaluations show that LIAN obtains near-optimal latency and outperforms state-of-the-art adaptive routing algorithms, reducing latency by up to 30% and providing stable throughput and fairness.

This work has been supported by the Spanish Ministry of Education, Culture and Sports under grant FPU14/02253, the Spanish Ministry of Economy, Industry and Competitiveness under contracts TIN2010-21291-C02-02, TIN2013-46957-C2-2-P, and TIN2013-46957-C2-2-P (AEI/FEDER, UE), the Spanish Research Agency under contract PID2019-105660RBC22/AEI/10.13039/501100011033, the European Union under agreements FP7-ICT-2011-7-288777 (Mont-Blanc 1) and FP7-ICT-2013-10-610402 (Mont-Blanc 2), the University of Cantabria under project PAR.30.P072.64004, and by the European HiPEAC Network of Excellence through an internship grant supported by the European Union's Horizon 2020 research and innovation program under grant agreement No. H2020-ICT-2015-687689.
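The abstract's remark that Valiant-style non-minimal paths lengthen routes can be illustrated with a minimal sketch. It is not the thesis's implementation: the six-switch ring, the function names and the fixed random seed are all assumptions chosen for illustration. A Valiant route is built as two minimal legs through a randomly chosen intermediate switch, so its length is bounded by twice the network diameter.

```python
import random
from collections import deque


def shortest_path(adj, src, dst):
    """Breadth-first search; returns one minimal path as a list of nodes."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    # Walk back from the destination to the source.
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def valiant_path(adj, src, dst, rng):
    """Route src -> random intermediate -> dst, each leg being minimal."""
    candidates = [n for n in adj if n not in (src, dst)]
    mid = rng.choice(candidates)
    first_leg = shortest_path(adj, src, mid)
    second_leg = shortest_path(adj, mid, dst)
    return first_leg + second_leg[1:]  # do not repeat the intermediate


def hops(path):
    """Number of links traversed by a path."""
    return len(path) - 1


# Hypothetical topology: a ring of 6 switches (diameter 3).
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

minimal = shortest_path(ring, 0, 1)
detour = valiant_path(ring, 0, 1, random.Random(1))
print("minimal hops:", hops(minimal))
print("Valiant hops:", hops(detour))
```

In this toy ring the minimal route between neighbouring switches takes one hop, while the Valiant route takes between three and five hops depending on the intermediate chosen. That gap is the latency cost the adaptive schemes described in the abstract aim to reduce.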

UCrea

Analysis and Development of Simulation Tools for Exascale Supercomputers System Networks

Author: Benito Hoz Mariano
Publication date: 01/01/2014

ABSTRACT: Both the High-Performance Computing (HPC) systems to be developed in the future and the prototype built within the European Mont-Blanc project will need high-performance networks. These networks must be able to connect hundreds of thousands or even millions of compute nodes to achieve the performance expected of future Exascale supercomputers. Contributing to the design of these networks requires a tool for evaluating them that makes it possible to analyse their cost, performance and power consumption. The tool must accommodate new developments easily, but it must also be reliable and, given the sizes that will be handled in the future, scalable enough to run experiments on very large networks. This work analyses the BookSim simulator and tests its capabilities by analysing recently proposed solutions for high-radix networks. The data obtained during the experiments make it possible to assess the viability of this simulator as a future tool for analysing networks for Exascale supercomputers. The main conclusion is that the simulator allowed the proposed use cases to be evaluated and that it can, a priori, be the tool chosen for future developments, since it can simulate networks with a million nodes and allows new networks to be added for future evaluation. Furthermore, the experiments show that trunking is a valid way to balance asymmetric Flattened Butterfly (FB) networks and unbalanced Dragonfly (DF) networks. In the Dragonfly case specifically, although two types of trunking are possible, one of them does not make sense from a technological point of view. Although the simulator is considered valid for use in future projects, the lack of documentation and the inactivity of its user list make it harder to use and to develop on.

Máster en Computació

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UCrea

Deploying a private IAAs cloud computing environment

Author: Benito Hoz Mariano
Publication date: 01/09/2013

ABSTRACT: The Operations Department of CIC Consulting Informático offers its Information Technology (IT) services to the company's customers and to the company's other departments, which are mainly dedicated to programming in different business areas. The needs of these departments are continuously growing and, moreover, the Operations Department is taking on an increasing workload from its customers. As a result, the service that the Operations Department offers to the development areas is gradually becoming less agile and is causing delays for these departments. Together with the company's production manager, the Operations Area coordinator concluded that one way to address this problem is to introduce a system that lets programmers self-provision the infrastructure they need for their projects. This would give the Operations Department's service the flexibility and agility it needs. Furthermore, the system must make it possible to retain control over aspects such as resource usage, virtual machine sizing, platform security and technological standardization. The aim of this Degree Project is therefore to deploy and implement a private Infrastructure As A Service (IAAS) Cloud Computing environment for the CIC development areas. After an initial study of the available tools, the OpenStack project was chosen as the base. The reasons for this choice were that the tool meets the high-level requirements, allows work with open technologies and builds on the implementation of what aims to be the first IAAS standard. The project involved work in several fields, such as virtualization, storage systems and communication networks, using technologies and standards such as paravirtualization, iSCSI, logical volumes and VLANs. In addition, although the physical equipment used for the project was not what the company usually works with, the deployment had to be planned and adapted to the technologies most commonly used in the company.

Ingeniería en Informátic

UCrea