Search CORE

820 research outputs found

Photonic Interconnection Networks for Exascale Computers

Author: Duro Gómez José
Publication venue: 'Universitat Politecnica de Valencia'
Publication date: 24/05/2021
Field of study

[ES] En los últimos años, distintos proyectos alrededor del mundo se han centrado en el diseño de supercomputadores capaces de alcanzar la meta de la computación a exascala, con el objetivo de soportar la ejecución de aplicaciones de gran importancia para la sociedad en diversos campos como el de la salud, la inteligencia artificial, etc. Teniendo en cuenta la creciente tendencia de la potencia computacional en cada generación de supercomputadores, este objetivo se prevee accesible en los próximos años. Alcanzar esta meta requiere abordar diversos retos en el diseño y desarrollo del sistema. Uno de los principales es conseguir unas comunicaciones rápidas y eficientes entre el inmenso número de nodos de computo y los sitemas de memoria. La tecnología fotónica proporciona ciertas ventajas frente a las redes eléctricas, como un mayor ancho de banda en los enlaces, un mayor paralelismo a nivel de comunicaciones gracias al DWDM o una mejor gestión del cableado gracias a su reducido tamaño. En la tesis se ha desarrollado un estudio de viabilidad y desarrollo de redes de interconexión haciendo uso de la tecnología fotónica para los futuros sistemas a exaescala dentro del proyecto europeo ExaNeSt. En primer lugar, se ha realizado un análisis y caracterización de aplicaciones exaescala. Este análisis se ha utilizado para conocer el comportamiento y requisitos de red que presentan las aplicaciones, y con ello guiarnos en el diseño de la red del sistema. El análisis considera tres parámetros: la distribución de mensajes en base a su tamaño y su tipo, el consumo de ancho de banda requerido a lo largo de la ejecución y la matriz de comunicación espacial entre los nodos. El estudio revela la necesidad de una red eficiente y rápida, debido a que la mayoría de las comunaciones se realizan en burst y con mensajes de un tamaño medio inferior a 50KB. A continuación, la tesis se centra en identificar los principales elementos que diferencian las redes fotónicas de las eléctricas. Identificamos una secuencia de pasos en el diseño de un simulador, ya sea haciéndolo desde cero con tecnología fotónica o adaptando un simulador de redes eléctricas existente para modelar la fotónica. Después se han realizado dos estudios de rendimiento y comparativas entre las actuales redes eléctricas y distintas configuraciones de redes fotónicas utilizando topologías clásicas. En el primer estudio, realizado tanto con tráfico sintético como con trazas de ExaNeSt en un toro, fat tree y dragonfly, se observa como la tecnología fotónica supone una clara mejora respecto a la eléctrica. Además, el estudio muestra que el parámetro que más afecta al rendimiento es el ancho de banda del canal fotónico. El segundo estudio muestra el comportamiento y rendimiento de aplicaciones reales en simulaciones a gran escala en una topología jellyfish. En este estudio se confirman las conclusiones obtenidas en el anterior, revelando además que la tecnología fotónica permite reducir la complejidad de algunas topologías, y por ende, el coste de la red. En los estudios realizados se ha observado una baja utilización de la red debido a que las topologías utilizadas para redes eléctricas no aprovechan las características que proporciona la tecnología fotónica. Por ello, se ha propuesto Segment Switching, una estrategia de conmutación orientada a reducir la longitud de las rutas mediante el uso de buffers intermedios. Los resultados experimentales muestran que cada topología tiene sus propios requerimientos. En el caso del toro, el mayor rendimiento se obtiene con un mayor número de buffers en la red. En el fat tree el parámetro más importante es el tamaño del buffer, obteniendo unas prestaciones similares una configuración con buffers en todos los switches que la que los ubica solo en el nivel superior. En resumen, esta tesis estudia el uso de la tecnología fotónica para las redes de sistemas a exascala y propone aprovechar[CA] Els darrers anys, múltiples projectes de recerca a tot el món s'han centrat en el disseny de superordinadors capaços d'assolir la barrera de computació exascala, amb l'objectiu de donar suport a l'execució d'aplicacions importants per a la nostra societat, com ara salut, intel·ligència artificial, meteorologia, etc. Segons la tendència creixent en la potència de càlcul en cada generació de superordinadors, es preveu assolir aquest objectiu en els propers anys. No obstant això, assolir aquest objectiu requereix abordar diferents reptes importants en el disseny i desenvolupament del sistema. Un dels principals és aconseguir comunicacions ràpides i eficients entre l'enorme nombre de nodes computacionals i els sistemes de memòria. La tecnologia fotònica proporciona diversos avantatges respecte a les xarxes elèctriques actuals, com ara un major ample de banda als enllaços, un major paral·lelisme de la xarxa gràcies a DWDM o una millor gestió del cable a causa de la seva mida molt més xicoteta. En la tesi, s'ha desenvolupat un estudi de viabilitat i desenvolupament de xarxes d'interconnexió mitjançant tecnologia fotònica per a futurs sistemes exascala dins del projecte europeu ExaNeSt. En primer lloc, s'ha dut a terme un estudi de caracterització d'aplicacions exascala dels requisits de xarxa. Els resultats de l'anàlisi ajuden a entendre els requisits de xarxa de les aplicacions exascale i, per tant, ens guien en el disseny de la xarxa del sistema. Aquesta anàlisi considera tres paràmetres principals: la distribució dels missatges en funció de la seva mida i tipus, el consum d'ample de banda requerit durant tota l'execució i els patrons de comunicació espacial entre els nodes. L'estudi revela la necessitat d'una xarxa d'interconnexió ràpida i eficient, ja que la majoria de comunicacions consisteixen en ràfegues de transmissions, cadascuna amb una mida mitjana de missatge de 50 KB. A continuació, la tesi se centra a identificar els principals elements que diferencien les xarxes fotòniques de les elèctriques. Identifiquem una seqüència de passos en el disseny i implementació d'un simulador: tractar la tecnologia fotònica des de zero o per ampliar un simulador de xarxa elèctrica existent per modelar la fotònica. Després, es presenten dos estudis principals de comparació de rendiment entre xarxes elèctriques i diferents configuracions de xarxes fotòniques mitjançant topologies clàssiques. En el primer estudi, realitzat tant amb trànsit sintètic com amb traces d'ExaNeSt en un toro, fat tree i dragonfly, vam trobar que la tecnologia fotònica representa una millora notable respecte a la tecnologia elèctrica. A més, l'estudi mostra que el paràmetre que més afecta el rendiment és l'amplada de banda del canal fotònic. Aquest darrer estudi analitza el rendiment d'aplicacions reals en simulacions a gran escala en una topologia jellyfish. Els resultats d'aquest estudi corroboren les conclusions obtingudes en l'anterior, revelant també que la tecnologia fotònica permet reduir la complexitat d'algunes topologies i, per tant, el cost de la xarxa. En els estudis anteriors ens adonem que la xarxa estava infrautilitzada principalment perquè les topologies estudiades per a xarxes elèctriques no aprofiten les característiques proporcionades per la tecnologia fotònica. Per aquest motiu, proposem Segment Switching, una estratègia de commutació destinada a reduir la longitud de les rutes mitjançant la implementació de memòries intermèdies en nodes intermedis al llarg de la ruta. Els resultats experimentals mostren que cadascuna de les topologies estudiades presenta diferents requisits de memòria intermèdia. Per al toro, com més gran siga el nombre de memòries intermèdies a la xarxa, major serà el rendiment. Per al fat tree, el paràmetre clau és la mida de la memòria intermèdia, aconseguint un rendiment similar tant amb una configuració amb memòria intermèdia en tots els co[EN] In the last recent years, multiple research projects around the world have focused on the design of supercomputers able to reach the exascale computing barrier, with the aim of supporting the execution of important applications for our society, such as health, artificial intelligence, meteorology, etc. According to the growing trend in the computational power in each supercomputer generation, this objective is expected to be reached in the coming years. However, achieving this goal requires addressing distinct major challenges in the design and development of the system. One of the main ones is to achieve fast and efficient communications between the huge number of computational nodes and the memory systems. Photonics technology provides several advantages over current electrical networks, such as higher bandwidth in the links, greater network parallelism thanks to DWDM, or better cable management due to its much smaller size. In this thesis, a feasibility study and development of interconnection networks have been developed using photonics technology for future exascale systems within the European project ExaNeSt. First, a characterization study of exascale applications from the network requirements has been carried out. The results of the analysis help understand the network requirements of exascale applications, and thereby guide us in the design of the system network. This analysis considers three main parameters: the distribution of the messages based on their size and type, the required bandwidth consumption throughout the execution, and the spatial communication patterns between the nodes. The study reveals the need for a fast and efficient interconnection network, since most communications consist of bursts of transmissions, each with an average message size of 50 KB. Next, this dissertation concentrates on identifying the main elements that differentiate photonic networks from electrical ones. We identify a sequence of steps in the design and implementation of a simulator either i) dealing with photonic technology from scratch or ii) to extend an existing electrical network simulator in order to model photonics. After that, two main performance comparison studies between electrical networks and different configurations of photonic networks are presented using classical topologies. In the former study, carried out with both synthetic traffic and traces of ExaNeSt in a torus, fat tree and dragonfly, we found that photonic technology represents a noticeable improvement over electrical technology. Furthermore, the study shows that the parameter that most affects the performance is the bandwidth of the photonic channel. The latter study analyzes performance of real applications in large-scale simulations in a jellyfish topology. The results of this study corroborates the conclusions obtained in the previous, also revealing that photonic technology allows reducing the complexity of some topologies, and therefore, the cost of the network. In the previous studies we realize that the network was underutilized mainly because the studied topologies for electrical networks do not take advantage of the features provided by photonic technology. For this reason, we propose Segment Switching, a switching strategy aimed at reducing the length of the routes by implementing buffers at intermediate nodes along the path. Experimental results show that each of the studied topologies presents different buffering requirements. For the torus, the higher the number of buffers in the network, the higher the performance. For the fat tree, the key parameter is the buffer size, achieving similar performance a configuration with buffers on all switches that locating buffers only at the top level. In summary, this thesis studies the use of photonic technology for networks of exascale systems, and proposes to take advantage of the characteristics of this technology in current electrical network topologies.This thesis has been conceived from the work carried out by Polytechnic University of Valencia in the ExaNeSt European projectDuro Gómez, J. (2021). Photonic Interconnection Networks for Exascale Computers [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/166796TESI

RiuNet

Energy-Efficient Interconnection Networks for High-Performance Computing

Author: Zahn Felix
Publication venue
Publication date: 01/01/2020
Field of study

In recent years, energy has become one of the most important factors for de- signing and operating large scale computing systems. This is particularly true in high-performance computing, where systems often consist of thousands of nodes. Especially after the end of Dennard’s scaling, the demand for energy- proportionality in components, where energy is depending linearly on utilization, increases continuously. As the main contributor to the overall power consumption, processors have received the main attention so far. The increasing energy proportionality of processors, however, shifts the focus to other components such as interconnection networks. Their share of the overall power consumption is expected to increase to 20% or more while other components further increase their efficiency in the near future. Hence, it is crucial to improve energy proportionality in interconnection networks likewise to reduce overall power and energy consumption. To facilitate these attempts, this work provides comprehensive studies about energy saving in interconnection networks at different levels. First, interconnection networks differ fundamentally from other components in their underlying technology. To gain a deeper understanding of these differences and to identify targets for energy savings, this work provides a detailed power analysis of current network hardware. Furthermore, various applications at different scales are analyzed regarding their communication patterns and locality properties. The findings show that communication makes up only a small fraction of the execution time and networks are actually idling most of the time. Another observation is that point-to-point communication often only occurs within various small subsets of all participants, which indicates that a coordinated mapping could further decrease network traffic. Based on these studies, three different energy-saving policies are designed, which all differ in their implementation and focus. Then, these policies are evaluated in an event-based, power-aware network simulator. While two policies that operate completely local at link level, enable significant energy savings of more than 90% in most analyses, the hybrid one does not provide further benefits despite significant additional design effort. Additionally, these studies include network design parameters, such as transition time between different link configurations, as well as the three most common topologies in supercomputing systems. The final part of this work addresses the interactions of congestion management and energy-saving policies. Although both network management strategies aim for different goals and use opposite approaches, they complement each other and can increase energy efficiency in all studies as well as improve the performance overhead as opposed to plain energy saving

Heidelberger Dokumentenserver

Silicon Photonic Flex-LIONS for Bandwidth-Reconfigurable Optical Interconnects

Author: Fotouhi P
Liu G
Lu H
Proietti R
Werner S
Xiao X
Yoo S.J.B
Zhang Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

This paper reports the first experimental demonstration of silicon photonic (SiPh) Flex-LIONS, a bandwidth-reconfigurable SiPh switching fabric based on wavelength routing in arrayed waveguide grating routers (AWGRs) and space switching. Compared with the state-of-the-art bandwidth-reconfigurable switching fabrics, Flex-LIONS architecture exhibits 21× less number of switching elements and 2.9× lower on-chip loss for 64 ports, which indicates significant improvements in scalability and energy efficiency. System experimental results carried out with an 8-port SiPh Flex-LIONS prototype demonstrate error-free one-to-eight multicast interconnection at 25 Gb/s and bandwidth reconfiguration from 25 Gb/s to 100 Gb/s between selected input and output ports. Besides, benchmarking simulation results show that Flex-LIONS can provide a 1.33× reduction in packet latency and >1.5× improvements in energy efficiency when replacing the core layer switches of Fat-Tree topologies with Flex-LIONS. Finally, we discuss the possibility of scaling Flex-LIONS up to N = 1024 ports (N = M × W) by arranging M^2 W-port Flex-LIONS in a Thin-CLOS architecture using W wavelengths

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Modeling and Analysis of the Performance of Exascale Photonic Networks

Author: Duro-Gómez José
Gómez Requena María Engracia
Pascual Pérez José Antonio
Petit Martí Salvador Vicente
Sahuquillo Borrás Julio
Publication venue: 'Wiley'
Publication date: 01/01/2018
Field of study

"This is the peer reviewed version of the following article: Duro, José, Jose A. Pascual, Salvador Petit, Julio Sahuquillo, and María E. Gómez. 2018. Modeling and Analysis of the Performance of Exascale Photonic Networks. Concurrency and Computation: Practice and Experience 31 (21). Wiley. doi:10.1002/cpe.4773, which has been published in final form at https://doi.org/10.1002/cpe.4773. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving."[EN] Photonics technology has become a promising and viable alternative for both on-chip and off-chip interconnection networks of future Exascale systems. Nevertheless, this technology is not mature enough yet in this context, so research efforts focusing on photonic networks are still required to achieve realistic suitable network implementations. In this regard, system-level photonic network simulators can help guide designers to assess the multiple design choices. Most current research is done on electrical network simulators, whose components work widely different from photonics components. In this work, we summarize and compare the working behavior of both technologies which includes the use of optical routers, wavelength-division multiplexing and circuit switching among others. After implementing them into a well-known simulation framework, an extensive simulation study has been carried out using realistic photonic network configurations with synthetic and realistic traffic. Experimental results show that, compared to electrical networks, optical networks can reduce the execution time of the studied real workloads in almost one order of magnitude. Our study also reveals that the photonic configuration highly impacts on the network performance, being the bandwidth per channel and the message length the most important parameters.This work was supported by the ExaNeSt project, funded by the European Union's Horizon 2020 Research and Innovation Program under grant 671553, and by the Spanish Ministerio de Economía y Competitividad (MINECO) and Plan E funds under grant TIN2015-66972-C5-1-R. Pascual was supported by a HiPEAC Collaboration Grant.Duro-Gómez, J.; Pascual Pérez, JA.; Petit Martí, SV.; Sahuquillo Borrás, J.; Gómez Requena, ME. (2019). Modeling and Analysis of the Performance of Exascale Photonic Networks. Concurrency and Computation Practice and Experience. 31(21):1-12. https://doi.org/10.1002/cpe.4773S1123121Top500 website. Accessed January2018.Kodi, A. K., Neel, B., & Brantley, W. C. (2014). Photonic Interconnects for Exascale and Datacenter Architectures. IEEE Micro, 34(5), 18-30. doi:10.1109/mm.2014.62Rumley, S., Nikolova, D., Hendry, R., Li, Q., Calhoun, D., & Bergman, K. (2015). Silicon Photonics for Exascale Systems. Journal of Lightwave Technology, 33(3), 547-562. doi:10.1109/jlt.2014.2363947Shacham, A., Bergman, K., & Carloni, L. P. (2008). Photonic Networks-on-Chip for Future Generations of Chip Multiprocessors. IEEE Transactions on Computers, 57(9), 1246-1260. doi:10.1109/tc.2008.78Batten, C., Joshi, A., Orcutt, J., Khilo, A., Moss, B., Holzwarth, C. W., … Asanovic, K. (2009). Building Many-Core Processor-to-DRAM Networks with Monolithic CMOS Silicon Photonics. IEEE Micro, 29(4), 8-21. doi:10.1109/mm.2009.60WernerS NavaridasJ LujánM.Designing low‐power low‐latency networks‐on‐chip by optimally combining electrical and optical links. Paper presented at: 23rd IEEE International Symposium on High Performance Computer Architecture (HPCA);2016;Austin TX.PucheJ LechagoS PetitS GómezME SahuquilloJ.Accurately modeling a photonic NoC in a detailed CMP simulation framework. Paper presented at: 2016 International Conference on High Performance Computing & Simulation (HPCS);2016;Innsbruck Austria.ChenG ChenH HaurylauM et al.On‐chip copper‐based vs. optical interconnects: delay uncertainty latency power and bandwidth density comparative predictions. Paper presented at: 2006 International Interconnect Technology Conference;2006;Burlingame CA.KatevenisM ChrysosN MarazakisM et al.The ExaNeSt project: Interconnects storage and packaging for exascale systems. Paper presented at: 2016 Euromicro Conference on Digital System Design (DSD);2016;Limassol Cyprus.ConcattoC PascualJA NavaridasJ et al.A CAM-Free Exascalable HPC Router for Low-Energy Communications. Paper presented at: 31st International Conference on Architecture of Computing Systems (ARCS);2018.DuanG‐H FedeliJ‐M KeyvaniniaS ThomsonD et al.10 Gb/s integrated tunable hybrid III‐V/si laser and silicon Mach‐Zehnder modulator. Paper presented at: European Conference and Exhibition on Optical Communications;2012;Amsterdam The Netherlands.DuanGH JanyC Le LiepvreAL et al.Integrated hybrid III‐V/si laser and transmitter. Paper presented at: 2012 International Conference on Indium Phosphide and Related Materials;2012;Santa Barbara CA.Soref, R., & Bennett, B. (1987). Electrooptical effects in silicon. IEEE Journal of Quantum Electronics, 23(1), 123-129. doi:10.1109/jqe.1987.1073206Liu, A., Liao, L., Rubin, D., Nguyen, H., Ciftcioglu, B., Chetrit, Y., … Paniccia, M. (2007). High-speed optical modulation based on carrier depletion in a silicon waveguide. Optics Express, 15(2), 660. doi:10.1364/oe.15.000660Thomson, D. J., Gardes, F. Y., Hu, Y., Mashanovich, G., Fournier, M., Grosse, P., … Reed, G. T. (2011). High contrast 40Gbit/s optical modulation in silicon. Optics Express, 19(12), 11507. doi:10.1364/oe.19.011507Bergman, K., Carloni, L. P., Biberman, A., Chan, J., & Hendry, G. (2014). Photonic Network-on-Chip Design. Integrated Circuits and Systems. doi:10.1007/978-1-4419-9335-9Dong, P., Chen, L., Xie, C., Buhl, L. L., & Chen, Y.-K. (2012). 50-Gb/s silicon quadrature phase-shift keying modulator. Optics Express, 20(19), 21181. doi:10.1364/oe.20.021181DongP LiuX SethumadhavanC et al.224‐Gb/s PDM‐16‐QAM modulator and receiver based on silicon photonic integrated circuits. Paper presented at: Optical Fiber Communication Conference/National Fiber Optic Engineers Conference;2013;Anaheim CA.Navaridas, J., Miguel-Alonso, J., Pascual, J. A., & Ridruejo, F. J. (2011). Simulating and evaluating interconnection networks with INSEE. Simulation Modelling Practice and Theory, 19(1), 494-515. doi:10.1016/j.simpat.2010.08.008Lu, L., Zhao, S., Zhou, L., Li, D., Li, Z., Wang, M., … Chen, J. (2016). 16 × 16 non-blocking silicon optical switch based on electro-optic Mach-Zehnder interferometers. Optics Express, 24(9), 9295. doi:10.1364/oe.24.009295DuroJ PetitS SahuquilloJ GómezME.Modeling a photonic network for exascale computing. Paper presented at: 2017 International Conference on High Performance Computing & Simulation (HPCS);2017;Genoa Italy.Xi, K., Kao, Y.-H., & Chao, H. J. (2012). A Petabit Bufferless Optical Switch for Data Center Networks. Optical Interconnects for Future Data Center Networks, 135-154. doi:10.1007/978-1-4614-4630-9_8KimJ DallyWJ ScottS AbtsD.Technology‐driven highly‐scalable dragonfly topology. Paper presented at: 35th International Symposium on Computer Architecture (ISCA);2008;Beijing China.Essiambre, R.-J., & Tkach, R. W. (2012). Capacity Trends and Limits of Optical Communication Networks. Proceedings of the IEEE, 100(5), 1035-1055. doi:10.1109/jproc.2012.2182970Temprana, E., Myslivets, E., Kuo, B. P.-P., Liu, L., Ataie, V., Alic, N., & Radic, S. (2015). Overcoming Kerr-induced capacity limit in optical fiber transmission. Science, 348(6242), 1445-1448. doi:10.1126/science.aab1781Springel, V. (2005). The cosmological simulation code gadget-2. Monthly Notices of the Royal Astronomical Society, 364(4), 1105-1134. doi:10.1111/j.1365-2966.2005.09655.xPlimpton, S. (1995). Fast Parallel Algorithms for Short-Range Molecular Dynamics. Journal of Computational Physics, 117(1), 1-19. doi:10.1006/jcph.1995.1039Ben‐ItzhakY ZahaviE CidonI KolodnyA.HNOCS: Modular open‐source simulator for heterogeneous NoCs. Paper presented at: 2012 International Conference on Embedded Computer Systems (SAMOS);2012;Samos Greece.HossainH AhmedM Al‐NayeemA IslamTZ AkbarMM.Gpnocsim‐a general purpose simulator for network‐on‐chip. Paper presented at: 2007 International Conference on Information and Communication Technology;2007;Dhaka Bangladesh.JainL Al‐HashimiB GaurMS LaxmiV NarayananA.NIRGAM: A simulator for NoC interconnect routing and application modeling. Paper presented at: Design Automation and Test in Europe Conference;2007;Nice France.ChanJ HendryG BibermanA BergmanK CarloniLP.PhoenixSim: A simulator for physical‐layer analysis of chip‐scale photonic interconnection networks. In: Proceedings of the Conference on Design Automation and Test in Europe;2010;Dresden Germany.RumleyS BahadoriM WenK NikolovaD BergmanK.PhoenixSim: crosslayer design and modeling of silicon photonic interconnects. In: Proceedings of the 1st International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems;2016;Prague Czech Republic.VargaA HornigR.An overview of the OMNeT++ simulation environment. In: Proceedings of the 1st International Conference on Simulation Tools and Techniques for Communications Networks and Systems & Workshops;2008;Marseille France.SunC ChenCHO KurianG et al.DSENT‐a tool connecting emerging photonics with electronics for opto‐electronic networks‐on‐chip modeling. Paper presented at: 2012 IEEE/ACM Sixth International Symposium on Networks‐on‐Chip;2012;Copenhagen Denmark.Ma, X., Yu, J., Hua, X., Wei, C., Huang, Y., Yang, L., … Yang, J. (2014). LioeSim: A Network Simulator for Hybrid Opto-Electronic Networks-on-Chip Analysis. Journal of Lightwave Technology, 32(22), 4301-4310. doi:10.1109/jlt.2014.2356515KahngAB LiB PehL‐S SamadiK.ORION 2.0: a fast and accurate NoC power and area model for early‐stage design space exploration. In: Proceedings of the Conference on Design Automation and Test in Europe;2009;Nice France.Chan, J., Hendry, G., Bergman, K., & Carloni, L. P. (2011). Physical-Layer Modeling and System-Level Design of Chip-Scale Photonic Interconnection Networks. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 30(10), 1507-1520. doi:10.1109/tcad.2011.215715

RiuNet

The University of Manchester - Institutional Repository

RAMP: A Flat Nanosecond Optical Network and MPI Operations for Distributed Deep Learning Systems

Author: Benjamin Joshua
Ottino Alessandro
Zervas Georgios
Publication venue
Publication date: 28/11/2022
Field of study

Distributed deep learning (DDL) systems strongly depend on network performance. Current electronic packet switched (EPS) network architectures and technologies suffer from variable diameter topologies, low-bisection bandwidth and over-subscription affecting completion time of communication and collective operations. We introduce a near-exascale, full-bisection bandwidth, all-to-all, single-hop, all-optical network architecture with nanosecond reconfiguration called RAMP, which supports large-scale distributed and parallel computing systems (12.8~Tbps per node for up to 65,536 nodes). For the first time, a custom RAMP-x MPI strategy and a network transcoder is proposed to run MPI collective operations across the optical circuit switched (OCS) network in a schedule-less and contention-less manner. RAMP achieves 7.6-171

\times

speed-up in completion time across all MPI operations compared to realistic EPS and OCS counterparts. It can also deliver a 1.3-16

\times

and 7.8-58

\times

reduction in Megatron and DLRM training time respectively} while offering 42-53

\times

and 3.3-12.4

\times

improvement in energy consumption and cost respectively

arXiv.org e-Print Archive

UCL Discovery

EXPLORING PHOTONIC BENEA SWITCHING FABRICS FOR FUTURE HPC AND DATACENTRES

Author: Kynigos Markos
Publication venue
Publication date: 01/08/2022
Field of study

The University of Manchester - Institutional Repository

Efficient Intra-Rack Resource Disaggregation for HPC Using Co-Packaged DWDM Photonics

Author: Arafa Yehia
Badawy Abdel Hameed
Bergman Keren
Cook Brandon
Dai Liang Yuan
Glick Madeleine
Michelogiannakis George
Shalf John
Wang Yuyang
Publication venue
Publication date: 17/07/2023
Field of study

The diversity of workload requirements and increasing hardware heterogeneity in emerging high performance computing (HPC) systems motivate resource disaggregation. Resource disaggregation allows compute and memory resources to be allocated individually as required to each workload. However, it is unclear how to efficiently realize this capability and cost-effectively meet the stringent bandwidth and latency requirements of HPC applications. To that end, we describe how modern photonics can be co-designed with modern HPC racks to implement flexible intra-rack resource disaggregation and fully meet the bit error rate (BER) and high escape bandwidth of all chip types in modern HPC racks. Our photonic-based disaggregated rack provides an average application speedup of 11% (46% maximum) for 25 CPU and 61% for 24 GPU benchmarks compared to a similar system that instead uses modern electronic switches for disaggregation. Using observed resource usage from a production system, we estimate that an iso-performance intra-rack disaggregated HPC system using photonics would require 4x fewer memory modules and 2x fewer NICs than a non-disaggregated baseline.Comment: 15 pages, 12 figures, 4 tables. Published in IEEE Cluster 202

arXiv.org e-Print Archive

Recommended from our members

Reconfigurable Optically Interconnected Systems

Author: Shen Yiwen
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2020
Field of study

With the immense growth of data consumption in today's data centers and high-performance computing systems driven by the constant influx of new applications, the network infrastructure supporting this demand is under increasing pressure to enable higher bandwidth, latency, and flexibility requirements. Optical interconnects, able to support high bandwidth wavelength division multiplexed signals with extreme energy efficiency, have become the basis for long-haul and metro-scale networks around the world, while photonic components are being rapidly integrated within rack and chip-scale systems. However, optical and photonic interconnects are not a direct replacement for electronic-based components. Rather, the integration of optical interconnects with electronic peripherals allows for unique functionalities that can improve the capacity, compute performance and flexibility of current state-of-the-art computing systems. This requires physical layer methodologies for their integration with electronic components, as well as system level control planes that incorporates the optical layer characteristics. This thesis explores various network architectures and the associated control plane, hardware infrastructure, and other supporting software modules needed to integrate silicon photonics and MEMS based optical switching into conventional datacom network systems ranging from intra-data center and high-performance computing systems to the metro-scale layer networks between data centers. In each of these systems, we demonstrate dynamic bandwidth steering and compute resource allocation capabilities to enable significant performance improvements. The key accomplishments of this thesis are as follows. In Part 1, we present high-performance computing network architectures that integrate silicon photonic switches for optical bandwidth steering, enabling multiple reconfigurable topologies that results in significant system performance improvements. As high-performance systems rely on increased parallelism by scaling up to greater numbers of processor nodes, communication between these nodes grows rapidly and the interconnection network becomes a bottleneck to the overall performance of the system. It has been observed that many scientific applications operating on high-performance computing systems cause highly skewed traffic over the network, congesting only a small percentage of the total available links while other links are underutilized. This mismatch of the traffic and the bandwidth allocation of the physical layer network presents the opportunity to optimize the bandwidth resource utilization of the system by using silicon photonic switches to perform bandwidth steering. This allows the individual processors to perform at their maximum compute potential and thereby improving the overall system performance. We show various testbeds that integrates both microring resonator and Mach-Zehnder based silicon photonic switches within Dragonfly and Fat-Tree topology networks built with conventional equipment, and demonstrate 30-60% reduction in execution time of real high-performance benchmark applications. Part 2 presents a flexible network architecture and control plane that enables autonomous bandwidth steering and IT resource provisioning capabilities between metro-scale geographically distributed data centers. It uses a software-defined control plane to autonomously provision both network and IT resources to support different quality of service requirements and optimizes resource utilization under dynamically changing load variations. By actively monitoring both the bandwidth utilization of the network and CPU or memory resources of the end hosts, the control plane autonomously provisions background or dynamic connections with different levels of quality of service using optical MEMS switching, as well as initializing live migrations of virtual machines to consolidate or distribute workload. Together these functionalities provide flexibility and maximize efficiency in processing and transferring data, and enables energy and cost savings by scaling down the system when resources are not needed. An experimental testbed of three data center nodes was built to demonstrate the feasibility of these capabilities. Part 3 presents Lightbridge, a communications platform specifically designed to provide a more seamless integration between processor nodes and an optically switched network. It addresses some of the crucial issues faced by the works presented in the previous chapters related to optical switching. When optical switches perform switching operations, they change the physical topology of the network, and they lack the capability to buffer packets, resulting in certain optical circuits being unavailable. This prompts the question of whether it is safe to transmit packets by end hosts at any given time. Lightbridge was developed to coordinate switching and routing of optical circuits across the network, by having the processors gain information about the current state of the optical network before transmitting packets, and being able to buffer packets when the optical circuit is not available. This part describes details of Lightbridge which is constituted by a loadable Linux kernel module along with other supporting modifications to the Linux kernel in order to achieve the necessary functionalities

Columbia University Academic Commons

Non-minimal adaptive routing for efficient interconnection networks

Author: Benito Hoz Mariano
Publication venue
Publication date: 16/10/2020
Field of study

RESUMEN: La red de interconexión es un concepto clave de los sistemas de computación paralelos. El primer aspecto que define una red de interconexión es su topología. Habitualmente, las redes escalables y eficientes en términos de coste y consumo energético tienen bajo diámetro y se basan en topologías que encaran el límite de Moore y en las que no hay diversidad de caminos mínimos. Una vez definida la topología, quedando implícitamente definidos los límites de rendimiento de la red, es necesario diseñar un algoritmo de enrutamiento que se acerque lo máximo posible a esos límites y debido a la ausencia de caminos mínimos, este además debe explotar los caminos no mínimos cuando el tráfico es adverso. Estos algoritmos de enrutamiento habitualmente seleccionan entre rutas mínimas y no mínimas en base a las condiciones de la red. Las rutas no mínimas habitualmente se basan en el algoritmo de balanceo de carga propuesto por Valiant, esto implica que doblan la longitud de las rutas mínimas y por lo tanto, la latencia soportada por los paquetes se incrementa. En cuanto a la tecnología, desde su introducción en entornos HPC a principios de los años 2000, Ethernet ha sido usado en un porcentaje representativo de los sistemas. Esta tesis introduce una implementación realista y competitiva de una red escalable y sin pérdidas basada en dispositivos de red Ethernet commodity, considerando topologías de bajo diámetro y bajo consumo energético y logrando un ahorro energético de hasta un 54%. Además, propone un enrutamiento sobre la citada arquitectura, en adelante QCN-Switch, el cual selecciona entre rutas mínimas y no mínimas basado en notificaciones de congestión explícitas. Una vez implementada la decisión de enrutar siguiendo rutas no mínimas, se introduce un enrutamiento adaptativo en fuente capaz de adaptar el número de saltos en las rutas no mínimas. Este enrutamiento, en adelante ACOR, es agnóstico de la topología y mejora la latencia en hasta un 28%. Finalmente, se introduce un enrutamiento dependiente de la topología, en adelante LIAN, que optimiza el número de saltos de las rutas no mínimas basado en las condiciones de la red. Los resultados de su evaluación muestran que obtiene una latencia cuasi óptima y mejora el rendimiento de algoritmos de enrutamiento actuales reduciendo la latencia en hasta un 30% y obteniendo un rendimiento estable y equitativo.ABSTRACT: Interconnection network is a key concept of any parallel computing system. The first aspect to define an interconnection network is its topology. Typically, power and cost-efficient scalable networks with low diameter rely on topologies that approach the Moore bound in which there is no minimal path diversity. Once the topology is defined, the performance bounds of the network are determined consequently, so a suitable routing algorithm should be designed to accomplish as much as possible of those limits and, due to the lack of minimal path diversity, it must exploit non-minimal paths when the traffic pattern is adversarial. These routing algorithms usually select between minimal and non-minimal paths based on the network conditions, where the non-minimal paths are built according to Valiant load-balancing algorithm. This implies that these paths double the length of minimal ones and then the latency supported by packets increases. Regarding the technology, from its introduction in HPC systems in the early 2000s, Ethernet has been used in a significant fraction of the systems. This dissertation introduces a realistic and competitive implementation of a scalable lossless Ethernet network for HPC environments considering low-diameter and low-power topologies. This allows for up to 54% power savings. Furthermore, it proposes a routing upon the cited architecture, hereon QCN-Switch, which selects between minimal and non-minimal paths per packet based on explicit congestion notifications instead of credits. Once the miss-routing decision is implemented, it introduces two mechanisms regarding the selection of the intermediate switch to develop a source adaptive routing algorithm capable of adapting the number of hops in the non-minimal paths. This routing, hereon ACOR, is topology-agnostic and improves average latency in all cases up to 28%. Finally, a topology-dependent routing, hereon LIAN, is introduced to optimize the number of hops in the non-minimal paths based on the network live conditions. Evaluations show that LIAN obtains almost-optimal latency and outperforms state-of-the-art adaptive routing algorithms, reducing latency by up to 30.0% and providing stable throughput and fairness.This work has been supported by the Spanish Ministry of Education, Culture and Sports under grant FPU14/02253, the Spanish Ministry of Economy, Industry and Competitiveness under contracts TIN2010-21291-C02-02, TIN2013-46957-C2-2-P, and TIN2013-46957-C2-2-P (AEI/FEDER, UE), the Spanish Research Agency under contract PID2019-105660RBC22/AEI/10.13039/501100011033, the European Union under agreements FP7-ICT-2011- 7-288777 (Mont-Blanc 1) and FP7-ICT-2013-10-610402 (Mont-Blanc 2), the University of Cantabria under project PAR.30.P072.64004, and by the European HiPEAC Network of Excellence through an internship grant supported by the European Union’s Horizon 2020 research and innovation program under grant agreement No. H2020-ICT-2015-687689

UCrea

LIONS: An AWGR-Based Low-Latency Optical Switch for High-Performance Computing and Data Centers

Author: Akella V
Nitta CJ
Proietti R
Ye XH
Yin YW
Yoo SJB
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

This paper discusses the architecture of an arrayed waveguide grating router (AWGR)-based low-latency interconnect optical network switch called LIONS, and its different loopback buffering schemes. A proof of concept is demonstrated with a 4 x 4 experimental testbed. A simulator was developed to model the LIONS architecture and was validated by comparing experimentally obtained statistics such as average end-to-end latency with the results produced by the simulator. Considering the complexity and cost in implementing loopback buffers in LIONS, we propose an all-optical negative acknowledgement (AO-NACK) architecture in order to remove the need for loopback buffers. Simulation results for LIONS with AO-NACK architecture and distributed loopback buffer architecture are compared with the performance of the flattened butterfly electrical switching network

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)