3 research outputs found

    Routing on the Channel Dependency Graph: A New Approach to Deadlock-Free, Destination-Based, High-Performance Routing for Lossless Interconnection Networks

    In the pursuit of ever-increasing compute power, and with Moore's law slowly coming to an end, high-performance computing has started to scale out to larger systems. Alongside the increasing system size, the interconnection network is growing to accommodate and connect tens of thousands of compute nodes. These networks have a large influence on the total cost, application performance, energy consumption, and overall system efficiency of the supercomputer. Unfortunately, state-of-the-art routing algorithms, which define the packet paths through the network, do not utilize this important resource efficiently. Topology-aware routing algorithms become increasingly inapplicable due to irregular topologies, which are either irregular by design or, most often, the result of hardware failures. Exchanging faulty network components potentially requires whole-system downtime, further increasing the cost of the failure. This management approach becomes more and more impractical due to the scale of today's networks and the accompanying steady decrease of the mean time between failures. Alternative methods of operating and maintaining these high-performance interconnects, in terms of both hardware and software management, are necessary to mitigate the negative effects experienced by scientific applications executed on the supercomputer. However, existing topology-agnostic routing algorithms either suffer from poor load balancing or are not bounded in the number of virtual channels needed to resolve deadlocks in the routing tables. Using the fail-in-place strategy, a well-established method for storage systems to repair only critical component failures, is a feasible solution for current and future HPC interconnects as well as other large-scale installations such as data center networks. However, an appropriate combination of topology and routing algorithm is required to minimize the throughput degradation for the entire system.
This thesis contributes a network simulation toolchain to facilitate the process of finding a suitable combination, either during system design or while the system is in operation. On top of this foundation, a key contribution is a novel scheduling-aware routing, which reduces fault-induced throughput degradation while improving overall network utilization. The scheduling-aware routing performs frequent property-preserving routing updates to optimize the path balancing for simultaneously running batch jobs. The increased deployment of lossless interconnection networks, in conjunction with fail-in-place modes of operation and topology-agnostic, scheduling-aware routing algorithms, necessitates new solutions to the routing-deadlock problem. Therefore, this thesis further advances the state of the art by introducing a novel concept of routing on the channel dependency graph, which allows the design of a universally applicable destination-based routing capable of optimizing the path balancing without exceeding a given number of virtual channels, which are a common hardware limitation. This disruptive innovation enables implicit deadlock-avoidance during path calculation, instead of solving both problems separately as all previous solutions
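For readers unfamiliar with the term, the channel dependency graph mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis's routing algorithm. It only shows the classical after-the-fact check that a set of fixed routes is deadlock-free when the graph of "channel A is followed by channel B" dependencies has no cycle. The function names `build_cdg` and `has_cycle`, the route format (lists of node IDs), and the two example route sets are assumptions made for illustration.

```python
def build_cdg(routes):
    """Build a channel dependency graph from fixed routes.

    Each route is a list of node IDs (hypothetical input format).
    A channel is a directed link (u, v); the graph gets an edge
    c1 -> c2 whenever some route uses c2 immediately after c1.
    """
    cdg = {}
    for path in routes:
        channels = list(zip(path, path[1:]))
        for c in channels:
            cdg.setdefault(c, set())
        for c1, c2 in zip(channels, channels[1:]):
            cdg[c1].add(c2)
    return cdg


def has_cycle(cdg):
    """Iterative depth-first search; True if the graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {c: WHITE for c in cdg}
    for start in cdg:
        if color[start] != WHITE:
            continue
        color[start] = GREY
        stack = [(start, iter(cdg[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if color[nxt] == GREY:
                    return True  # back edge: cyclic channel dependency
                if color[nxt] == WHITE:
                    color[nxt] = GREY
                    stack.append((nxt, iter(cdg[nxt])))
                    break
            else:
                # All successors explored without finding a cycle.
                color[node] = BLACK
                stack.pop()
    return False


# Assumed example 1: two-hop clockwise routes on a 4-node ring.
ring_routes = [[0, 1, 2], [1, 2, 3], [2, 3, 0], [3, 0, 1]]
# Assumed example 2: routes on a 4-node line, which cannot wrap around.
line_routes = [[0, 1, 2], [1, 2, 3], [2, 1, 0], [3, 2, 1]]

ring_deadlock_possible = has_cycle(build_cdg(ring_routes))
line_deadlock_possible = has_cycle(build_cdg(line_routes))
```

In this sketch the ring routes produce a cyclic dependency, while the line routes do not. Resolving such cycles typically means moving some routes to another virtual channel, which is the resource the abstract describes as limited in hardware.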

    A communication-driven routing technique for application-specific NoCs

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10766-010-0159-9
Networks on Chip (NoCs) have been shown to be an efficient solution to the complex on-chip communication problems derived from the increasing number of processor cores. One of the key issues in the design of NoCs is the reduction of both area and power dissipation. As a result, two-dimensional meshes have become the preferred topology, since they offer low and constant link delay. Unfortunately, manufacturing defects or even run-time failures often make the resulting topology irregular, preventing the use of traditional routing algorithms. This scenario shows the need for topology-agnostic routing algorithms that provide a valid routing solution when applied to any topology. This paper proposes a new communication-driven routing technique that optimizes network performance for application-specific NoCs. This technique combines a flexible, topology-agnostic routing algorithm with a communication-aware mapping technique that matches the traffic generated by the application with the available network bandwidth. Since the mapping technique can be pruned as needed to fit either quality function values or time constraints, this technique can be adapted to different computational costs. The evaluation results show that it significantly improves network performance in terms of both latency and power consumption.
This work has been jointly supported by the Spanish MICINN, the European Commission FEDER funds, and the University of Valencia under grants Consolider-Ingenio 2010 CSD2006-00046, TIN2009-14475-C04-04, and V_SEGLES_PIE.
Tornero, R.; Orduña Huertas, JM.; Mejia, A.; Flich Cardo, J.; Duato Marín, JF. (2011). A communication-driven routing technique for application-specific NoCs. International Journal of Parallel Programming. 39(3):357-374.
https://doi.org/10.1007/s10766-010-0159-9