337 research outputs found

    An In-Depth Analysis of the Slingshot Interconnect

    The interconnect is one of the most critical components in large-scale computing systems, and its impact on application performance will increase with system size. In this paper, we describe Slingshot, an interconnection network for large-scale computing systems. Slingshot is based on high-radix switches, which allow building exascale and hyperscale datacenter networks with at most three switch-to-switch hops. Moreover, Slingshot provides efficient adaptive routing and congestion control algorithms, and highly tunable traffic classes. Slingshot uses an optimized Ethernet protocol, which allows it to be interoperable with standard Ethernet devices while providing high performance to HPC applications. We analyze the extent to which Slingshot provides these features, evaluating it on microbenchmarks and on several applications from the datacenter and AI worlds, as well as on HPC applications. We find that applications running on Slingshot are less affected by congestion compared to previous generation networks. Comment: To be published in Proceedings of The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '20) (2020)

    Diluting the Scalability Boundaries: Exploring the Use of Disaggregated Architectures for High-Level Network Data Analysis

    Traditional data centers are designed with a rigid architecture of fit-for-purpose servers that provision resources beyond the average workload in order to deal with occasional peaks of data. Heterogeneous data centers are pushing towards more cost-efficient architectures with better resource provisioning. In this paper we study the feasibility of using disaggregated architectures for data-intensive applications, in contrast to the monolithic approach of server-oriented architectures. In particular, we have tested a proactive network analysis system in which the workload demands are highly variable. In the context of the dReDBox disaggregated architecture, the results show that the overhead caused by using remote memory resources is significant, between 66% and 80%, but we have also observed that memory usage is one order of magnitude higher for the stress case than for average workloads. Therefore, dimensioning memory for the worst case in conventional systems will result in a notable waste of resources. Finally, we found that, for the selected use case, parallelism is limited by memory. Therefore, using a disaggregated architecture will allow for increased parallelism, which, at the same time, will mitigate the overhead caused by remote memory. Comment: 8 pages, 6 figures, 2 tables, 32 references. Pre-print. The paper will be presented during the IEEE International Conference on High Performance Computing and Communications in Bangkok, Thailand, 18-20 December, 2017. To be published in the conference proceedings.

    On random wiring in practicable folded clos networks for modern datacenters

    Large scale, high performance, fault tolerance, low cost, and graceful expandability are sought-after features in current datacenter networks (DCNs). Although there have been many proposals for DCNs, most modern installations are equipped with classical folded Clos networks. Recently, regular random topologies, such as Jellyfish, have been proposed for DCNs. However, their completely unstructured nature entails serious design problems. In this paper we propose Random Folded Clos (RFC) and Hydra networks, in which the interconnection between certain switch levels is made randomly. Both RFCs and Hydras preserve important properties of Clos networks that provide straightforward deadlock-free multi-path routing. The proposed networks leverage randomness to be gracefully expandable, thereby allowing for fine-grained upgrading. RFCs and Hydras are compared in the paper, in topological and cost terms, against fat-trees, orthogonal fat-trees, and random regular networks. Also, experiments are carried out to simulate their performance under synthetic traffic patterns emulating common loads present in warehouse-scale computers. These theoretical and empirical studies reveal the interest of these topologies, concluding that Hydra constitutes a practicable alternative to current datacenter networks since it appropriately balances all the main design requirements. Moreover, Hydras perform better than fat-trees, their natural competitor, being able to connect the same or more computing nodes with significantly lower cost and latency while exhibiting comparable throughput. © 1990-2012 IEEE

    A Survey of Prediction and Classification Techniques in Multicore Processor Systems

    In multicore processor systems, being able to accurately predict the future provides new optimization opportunities that otherwise could not be exploited. For example, an oracle able to predict the behavior of a certain application running on a smartphone could direct the power manager to switch to appropriate dynamic voltage and frequency scaling modes that would guarantee minimum levels of desired performance while saving energy and thereby prolonging battery life. Using predictions enables systems to become proactive rather than continue to operate in a reactive manner. This prediction-based proactive approach has become increasingly popular in the design and optimization of integrated circuits and of multicore processor systems. Prediction has evolved from simple forecasting to sophisticated machine-learning-based prediction and classification that learns from existing data, employs data mining, and predicts future behavior. This can be exploited by novel optimization techniques that span all layers of the computing stack. In this survey paper, we present a discussion of the most popular techniques for prediction and classification in the general context of computing systems, with emphasis on multicore processors. The paper is far from comprehensive, but it will help the reader interested in employing prediction in the optimization of multicore processor systems.

    Enabling Scalable and Sustainable Softwarized 5G Environments

    The fifth generation of telecommunication systems (5G) is foreseen to play a fundamental role in our socio-economic growth by supporting various and radically new vertical applications (such as Industry 4.0, eHealth, and Smart Cities/Electrical Grids, to name a few), as a one-fits-all technology that is enabled by emerging softwarization solutions, specifically the Fog, Multi-access Edge Computing (MEC), Network Functions Virtualization (NFV), and Software-Defined Networking (SDN) paradigms. Notwithstanding the notable potential of these technologies, a number of open issues still need to be addressed to ensure their complete rollout. This thesis addresses the scalability and sustainability issues in softwarized 5G environments through contributions in three research axes: a) Infrastructure Modeling and Analytics, b) Network Slicing and Mobility Management, and c) Network/Services Management and Control. The main contributions include a model-based analytics approach for real-time workload profiling and estimation of network key performance indicators (KPIs) in NFV infrastructures (NFVIs), as well as an SDN-based multi-clustering approach to scale geo-distributed virtual tenant networks (VTNs) and to support seamless user/service mobility; building on these, solutions to the problems of resource consolidation, service migration, and load balancing are also developed in the context of 5G. All in all, this entails the adoption of Stochastic Models, Mathematical Programming, Queueing Theory, Graph Theory, and Team Theory principles, in the context of Green Networking, NFV, and SDN.

    Enhancing data centre networking using energy aware optical interconnects

    In a fast-changing world where information technology drives economic prosperity, the number of data centres has grown significantly in the past few years. These data centres require large amounts of energy in order to meet increasing demands. An overview of energy-efficient optical interconnects as a means of reducing energy consumption without compromising speed and accuracy is presented. New methods by which energy efficiency can be achieved using OCDMA multiplexing techniques for future optical interconnections are discussed. We also present some challenges that might inhibit effective implementation of the OCDMA multiplexing scheme.

    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Datacenters provide cost-effective and flexible access to the scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected by a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and to maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements. This includes user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters, and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters and not to survey all existing solutions (which is virtually impossible due to the massive body of existing research). We hope to provide readers with a wide range of options and factors to weigh when considering a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks that connect geographically dispersed datacenters, which have been receiving increasing attention recently and pose interesting and novel research problems. Comment: Accepted for Publication in IEEE Communications Surveys and Tutorials

    Integrated IT and SDN Orchestration of multi-domain multi-layer transport networks

    The management and control of telecom operators' networks remain partitioned by technology, equipment supplier, and networking layer. In some segments, network operations are highly costly because the network equipment must be configured individually, and even manually, by highly specialized personnel. In multi-vendor networks, expensive and never-ending integration processes between Network Management Systems (NMSs) and the rest of the systems (OSSs, BSSs) are common, due to the lack of adoption of standard interfaces in the management systems of the different equipment suppliers. Moreover, the increasing impact of the new traffic flows introduced by the deployment of massive Data Centers (DCs) is also imposing new challenges that traditional networking is not ready to overcome. The Fifth Generation of Mobile Technology (5G) is also introducing stringent network requirements, such as the need to connect billions of new devices to the network in the IoT paradigm, new ultra-low-latency applications (e.g., remote surgery), and vehicular communications. All these new services, together with enhanced broadband network access, are supposed to be delivered over the same network infrastructure. In this PhD Thesis, a holistic view of Network and Cloud Computing resources, based on the recent innovations introduced by Software Defined Networking (SDN), is proposed as the solution for designing an end-to-end multi-layer, multi-technology, and multi-domain cloud and transport network management architecture, capable of offering end-to-end services from the DC networks to customers' access networks and of virtualizing network resources, allowing new ways of slicing the network resources for the forthcoming 5G deployments. 
The first contribution of this PhD Thesis deals with the design and validation of SDN-based network orchestration architectures capable of improving on the current solutions for the management and control of multi-layer, multi-domain backbone transport networks. These problems have been assessed and progressively solved by different control and management architectures, which have been designed and evaluated in real evaluation environments. One of the major findings of this work has been the need to develop a common information model for transport network management, capable of describing the resources and services of multi-layer networks. In this line, the Control Orchestration Protocol (COP) has been proposed as a first contribution towards a standard management interface based on the main principles of SDN. Furthermore, this PhD Thesis introduces a novel architecture capable of coordinating the management of IT computing resources together with inter- and intra-DC networks. The provisioning and migration of virtual machines, together with the dynamic reconfiguration of the network, have been successfully demonstrated in a feasible timescale. Moreover, a resource optimization engine is introduced in the architecture to run optimization algorithms capable of solving allocation problems such as the optimal deployment of Virtual Machine Graphs over different DC locations while minimizing the allocation of inter-DC network resources. A baseline solution to this problem, along with blocking probability results for different network loads, is also presented. The third major contribution is the result of the previous two. With a converged cloud and network infrastructure controlled and operated jointly, the holistic view of the network allows the on-demand provisioning of network slices consisting of dedicated network and cloud resources over a distributed DC infrastructure interconnected by an optical transport network. 
The last chapters of this thesis discuss the management and orchestration of 5G slices based on the control and management components designed in the previous chapters. The design of one of the first network slicing architectures and the deployment of a fully operational 5G network slice in a real testbed is one of the major contributions of this PhD Thesis. Postprint (published version)

    A survey on architectures and energy efficiency in Data Center Networks

    Data Center Networks (DCNs) are attracting growing interest from both academia and industry as they strive to keep pace with the exponential growth in cloud computing and enterprise networks. Modern DCNs face two main challenges: scalability and cost-effectiveness. The architecture of a DCN directly impacts its scalability, while its cost is largely driven by its power consumption. In this paper, we conduct a detailed survey of the most recent advances and research activities in DCNs, with a special focus on the architectural evolution of DCNs and their energy efficiency. The paper provides a qualitative categorization of existing DCN architectures into switch-centric and server-centric topologies, as well as their design technologies. Energy efficiency in data centers is discussed in detail, with a survey of existing techniques in energy savings, green data centers, and renewable energy approaches. Finally, we outline potential future research directions in DCNs.