9,162 research outputs found

    LAPSES: A Recipe for High-Performance Adaptive Router Design

    Earlier research has shown that adaptive routing can improve network performance. However, it has not received adequate attention in commercial routers, mainly because of the additional hardware complexity and the perceived cost and performance degradation that may result from it. These concerns can be mitigated if one can design a cost-effective router that supports adaptive routing. This paper proposes a three-step recipe for cost-effective, high-performance pipelined adaptive router design: Look-Ahead routing, intelligent Path Selection, and an Economic Storage implementation, together called the LAPSES approach. The first step, look-ahead routing, removes a pipeline stage from the router by making table lookup and arbitration concurrent. Next, three new traffic-sensitive path selection heuristics (LRU, LFU and MAX-CREDIT) are proposed to select one of the available alternate paths. Finally, two techniques for reducing the routing table size of the adaptive router are presented, called meta-table routing and economical storage. The proposed economical storage needs a routing table with only 9 and 27 entries for two- and three-dimensional meshes, respectively. All these design ideas are evaluated on a 16 × 16 mesh network via simulation. A fully adaptive algorithm and various traffic patterns are used to examine the performance benefits. Performance results show that the look-ahead design and the path selection heuristics boost network performance, while the economical storage approach turns out to be an ideal choice in comparison to the full-table and meta-table options. We believe the router resulting from these three design enhancements can make adaptive routing a viable choice for interconnects.
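The 9 and 27 table sizes quoted in the abstract equal 3^2 and 3^3, which is what one gets if the table is indexed by the sign (negative, zero, positive) of the destination offset in each dimension. The sketch below is a minimal illustration under that assumption; the abstract does not state the indexing scheme. The names `build_sign_table`, `candidate_ports` and `select_max_credit` are hypothetical, and the MAX-CREDIT rule is read here as "prefer the candidate output with the most flow-control credits", which is also an assumption.

```python
from itertools import product


def sign(x: int) -> int:
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def build_sign_table(dims: int) -> dict:
    """Build a table with one entry per combination of offset signs (3**dims entries).

    Each entry lists the minimal-path output ports as (dimension, direction)
    pairs. An empty list means the packet has reached its destination.
    """
    table = {}
    for signs in product((-1, 0, 1), repeat=dims):
        table[signs] = [(d, s) for d, s in enumerate(signs) if s != 0]
    return table


def candidate_ports(table: dict, current: tuple, dest: tuple) -> list:
    """Look up the admissible output ports for a packet at `current` heading to `dest`."""
    key = tuple(sign(d - c) for c, d in zip(current, dest))
    return table[key]


def select_max_credit(ports: list, credits: dict):
    """Assumed MAX-CREDIT rule: pick the candidate port with the most credits.

    Ties go to the first candidate. Returns None when there are no candidates.
    """
    if not ports:
        return None
    return max(ports, key=lambda p: credits.get(p, 0))


table_2d = build_sign_table(2)
ports = candidate_ports(table_2d, current=(1, 1), dest=(3, 0))
chosen = select_max_credit(ports, {(0, 1): 2, (1, -1): 5})
```

Under these assumptions the table size depends only on the number of dimensions, not on the mesh size, which is consistent with the same 9-entry table serving the 16 × 16 mesh used in the evaluation.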

    Quarc: a high-efficiency network on-chip architecture

    The novel Quarc NoC architecture, inspired by the Spidergon scheme, is introduced as a NoC architecture that is highly efficient in performing collective communication operations, including broadcast and multicast. The efficiency of the Quarc architecture is achieved by balancing the traffic, which results from modifications applied to the topology and the routing elements of the Spidergon NoC. This paper provides an ASIC implementation of both architectures using UMC's 0.13 µm CMOS technology and presents an analysis and comparison of the cost and performance of the Quarc and Spidergon NoCs.

    Optimisation of Mobile Communication Networks - OMCO NET

    The mini conference “Optimisation of Mobile Communication Networks” focuses on advanced methods for search and optimisation applied to wireless communication networks. It is sponsored by the Research & Enterprise Fund of Southampton Solent University. The conference strives to widen knowledge of advanced search methods capable of optimising wireless communication networks. The aim is to provide a forum for the exchange of recent knowledge, new ideas and trends in this progressive and challenging area. The conference will popularise new successful approaches for resolving hard tasks such as minimisation of transmit power, cooperative and optimal routing

    Exploring Adaptive Implementation of On-Chip Networks

    As technology geometries have shrunk to the deep submicron regime, the communication delay and power consumption of global interconnections in high-performance Multi-Processor Systems-on-Chip (MPSoCs) are becoming a major bottleneck. The Network-on-Chip (NoC) architecture paradigm, based on a modular packet-switched mechanism, can address many of the on-chip communication issues, such as the performance limitations of long interconnects and the integration of a large number of Processing Elements (PEs) on a chip. The choice of routing protocol and NoC structure can have a significant impact on performance and power consumption in on-chip networks. In addition, building a high-performance, area- and energy-efficient on-chip network for multicore architectures requires a novel on-chip router that allows a larger network to be integrated on a single die with reduced power consumption. On top of that, network interfaces are employed to decouple computation resources from communication resources, to provide synchronization between them, and to achieve backward compatibility with existing IP cores. Three adaptive routing algorithms are presented as part of this thesis. The first is a congestion-aware adaptive routing algorithm for 2D mesh NoCs that does not support multicast (one-to-many) traffic, while the other two are adaptive routing models supporting both unicast (one-to-one) and multicast traffic. A streamlined on-chip router architecture is also presented for avoiding congested areas in 2D mesh NoCs by employing efficient input and output selection. The output selection uses an adaptive routing algorithm based on the congestion condition of neighboring routers, while the input selection allows packets to be serviced from each input port according to its congestion level.
Moreover, in order to increase memory parallelism and bring compatibility with existing IP cores in network-based multiprocessor architectures, adaptive network interface architectures are presented that use multiple SDRAMs which can be accessed simultaneously. In addition, a smart memory controller is integrated in the adaptive network interface to improve memory utilization and reduce both memory and network latencies. Three-Dimensional Integrated Circuits (3D ICs) have been emerging as a viable candidate to achieve better performance and package density compared to traditional 2D ICs. In addition, combining the benefits of 3D IC and NoC schemes provides a significant performance gain for 3D architectures. In recent years, inter-layer communication across multiple stacked layers (vertical channel) has attracted a lot of interest. In this thesis, a novel adaptive pipeline bus structure is proposed for inter-layer communication to improve performance by reducing the delay and complexity of traditional bus arbitration. In addition, two mesh-based topologies for 3D architectures are introduced to mitigate the inter-layer footprint and power dissipation on each layer with a small performance penalty.
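The input and output selection described in this abstract can be illustrated with a small sketch. It is not the thesis's algorithm: the congestion metric (a per-neighbor integer such as buffer occupancy), the rule of serving the fullest input queue first, and the function names `select_output` and `select_input` are all assumptions made for illustration.

```python
def select_output(candidates: list, neighbor_congestion: dict) -> str:
    """Among the admissible output ports, pick the one whose neighbor reports least congestion.

    Ties go to the first candidate in the list.
    """
    return min(candidates, key=lambda port: neighbor_congestion[port])


def select_input(input_queues: dict):
    """Pick the non-empty input port with the longest queue (assumed congestion level).

    Returns None when every input queue is empty.
    """
    waiting = {port: len(queue) for port, queue in input_queues.items() if queue}
    if not waiting:
        return None
    return max(waiting, key=waiting.get)


# Hypothetical router state: congestion values reported by neighboring routers
# and packets queued at each input port.
congestion = {"east": 3, "north": 1, "west": 0, "south": 2}
out_port = select_output(["east", "north"], congestion)
in_port = select_input({"west": ["p1", "p2"], "south": ["p3"], "local": []})
```

Restricting `select_output` to a list of admissible candidates keeps the choice within whatever set of ports the underlying routing function allows, so the congestion heuristic only breaks ties among legal paths.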

    POWAR: Power-Aware Routing in HPC Networks with On/Off Links

    To save energy in HPC interconnection networks, a common proposal is to switch idle links into a low-power mode after a certain time without any transmission, as the IEEE Energy Efficient Ethernet standard proposes. Extending the low-power mode mechanism, we propose POWer-Aware Routing (POWAR), a simple power-aware routing and selection function for fat-tree and torus networks. POWAR adapts the number of network links that can be used to the network load, obtaining large energy savings in the network (55%-65%) and in the entire system (9%-10%) with negligible performance overhead.
    This work has been supported by the Spanish MINECO and European Commission (FEDER funds) under project TIN2015-66972-C5-1-R. Francisco J. Andujar has been partially funded by the Spanish MICINN and by the ERDF program of the European Union: PCAS Project (TIN2017-88614-R), CAPAP-H6 (TIN2016-81840-REDT), and Junta de Castilla y Leon FEDER Grant VA082P17 (PROPHET Project).
    Andújar-Muñoz, FJ.; Coll, S.; Alonso Díaz, M.; López Rodríguez, PJ.; Martínez-Rubio, J. (2019). POWAR: Power-Aware Routing in HPC Networks with On/Off Links. ACM Transactions on Architecture and Code Optimization. 15(4):1-22. https://doi.org/10.1145/3293445

    Low-Memory Techniques for Routing and Fault-Tolerance on the Fat-Tree Topology

    Nowadays, PC clusters are considered an efficient alternative for building supercomputers in which thousands of computing nodes are connected through an interconnection network. The interconnection network must be designed carefully, since it has a great influence on overall system performance. Two of the main design parameters of interconnection networks are topology and routing. The topology defines how the network elements are connected to each other and to the computing nodes. Routing, in turn, defines the paths that packets follow through the network. Performance has traditionally been the main metric for evaluating interconnection networks. Today, however, two additional metrics must be considered: cost and fault tolerance. Interconnection networks must scale in cost as well as in performance. That is, they must not only maintain their throughput as the network grows, but must do so without excessively increasing their cost. Moreover, as the number of nodes in cluster machines increases, the interconnection network must grow accordingly. This increase in the number of network elements raises the probability of faults, so fault tolerance is practically mandatory for current interconnection networks. This thesis focuses on the fat-tree topology, since it is one of the most commonly used topologies in clusters. The aim of this thesis is to exploit its particular characteristics to provide fault tolerance and a routing algorithm able to balance the network load while offering a good trade-off between performance and cost.
    Gómez Requena, C. (2010). Low-Memory Techniques for Routing and Fault-Tolerance on the Fat-Tree Topology [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8856

    Quarc: a novel network-on-chip architecture

    This paper introduces the Quarc NoC, a novel NoC architecture inspired by the Spidergon NoC. The Quarc scheme significantly outperforms the Spidergon NoC by balancing the traffic, which results from modifications applied to the topology and the routing elements. The proposed architecture is highly efficient in performing collective communication operations, including broadcast and multicast. We present the topology, routing discipline and switch architecture for the Quarc NoC and demonstrate its performance with results obtained from discrete event simulations.