604 research outputs found

    Routing on the Channel Dependency Graph: A New Approach to Deadlock-Free, Destination-Based, High-Performance Routing for Lossless Interconnection Networks

    In the pursuit of ever-increasing compute power, and with Moore's law slowly coming to an end, high-performance computing has started to scale out to larger systems. Alongside the increasing system size, the interconnection network is growing to accommodate and connect tens of thousands of compute nodes. These networks have a large influence on the total cost, application performance, energy consumption, and overall system efficiency of the supercomputer. Unfortunately, state-of-the-art routing algorithms, which define the packet paths through the network, do not utilize this important resource efficiently. Topology-aware routing algorithms are becoming increasingly inapplicable because of irregular topologies, which are either irregular by design or, most often, the result of hardware failures. Exchanging faulty network components potentially requires whole-system downtime, further increasing the cost of the failure. This management approach is becoming more and more impractical due to the scale of today's networks and the accompanying steady decrease of the mean time between failures. Alternative methods of operating and maintaining these high-performance interconnects, in terms of both hardware and software management, are necessary to mitigate the negative effects experienced by scientific applications executed on the supercomputer. However, existing topology-agnostic routing algorithms either suffer from poor load balancing or are not bounded in the number of virtual channels needed to resolve deadlocks in the routing tables. Using the fail-in-place strategy, a well-established method for storage systems in which only critical component failures are repaired, is a feasible solution for current and future HPC interconnects as well as other large-scale installations such as data center networks. However, an appropriate combination of topology and routing algorithm is required to minimize the throughput degradation for the entire system. 
This thesis contributes a network simulation toolchain to facilitate the process of finding a suitable combination, either during system design or while the system is in operation. On top of this foundation, a key contribution is a novel scheduling-aware routing, which reduces fault-induced throughput degradation while improving overall network utilization. The scheduling-aware routing performs frequent property-preserving routing updates to optimize the path balancing for simultaneously running batch jobs. The increased deployment of lossless interconnection networks, in conjunction with fail-in-place modes of operation and topology-agnostic, scheduling-aware routing algorithms, necessitates new solutions to the routing-deadlock problem. Therefore, this thesis further advances the state of the art by introducing a novel concept of routing on the channel dependency graph, which allows the design of a universally applicable destination-based routing capable of optimizing the path balancing without exceeding a given number of virtual channels, which are a common hardware limitation. This disruptive innovation enables implicit deadlock-avoidance during path calculation, instead of solving both problems separately as all previous solutions
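
The channel dependency graph mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's routing algorithm: it only builds the graph for a fixed set of paths and applies the classical check that an acyclic channel dependency graph implies deadlock freedom. The sketch assumes a single virtual channel per link, and the names `build_cdg`, `has_cycle`, `ring_paths`, and `line_paths` are hypothetical.

```python
from collections import defaultdict


def build_cdg(paths):
    """Build a channel dependency graph (CDG) from routed paths.

    Assumption: each path is a list of node names and every directed
    link (u, v) is one channel (a single virtual channel per link).
    A dependency c1 -> c2 exists when some path uses c2 right after c1.
    """
    cdg = defaultdict(set)
    for path in paths:
        channels = list(zip(path, path[1:]))
        for channel in channels:
            cdg[channel]  # register the channel even without dependencies
        for c1, c2 in zip(channels, channels[1:]):
            cdg[c1].add(c2)
    return dict(cdg)


def has_cycle(graph):
    """Return True if the directed graph contains a cycle (iterative DFS)."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in graph}
    for start in graph:
        if colour[start] != WHITE:
            continue
        colour[start] = GREY
        stack = [(start, iter(graph[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if colour[nxt] == GREY:
                    return True  # back edge: cyclic dependency
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                colour[node] = BLACK  # all successors explored
                stack.pop()
    return False


# Hypothetical example: two-hop clockwise routes on a 4-node ring
# create a cyclic dependency; routes along a line do not.
ring_paths = [["A", "B", "C"], ["B", "C", "D"], ["C", "D", "A"], ["D", "A", "B"]]
line_paths = [["A", "B", "C"], ["B", "C", "D"]]
ring_deadlock_possible = has_cycle(build_cdg(ring_paths))
line_deadlock_possible = has_cycle(build_cdg(line_paths))
```

In this sketch a cyclic result means the routes would need additional virtual channels or different paths to break the cycle, which is the resource limit the abstract refers to.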

    InfiniBand-Based Mechanism to Enhance Multipath QoS in MANETs

    In Mobile Ad-hoc Networks (MANETs), the continuous changes in topology and the large amounts of data exchanged across the network make it difficult for a single routing algorithm to route data efficiently between nodes. MANETs usually suffer from high packet loss rates and high link failure rates, which also makes it difficult to exchange data in an effective and reliable fashion. These challenges usually increase congestion on some links while other links are almost free. In this thesis, we propose a novel mechanism to enhance QoS in multipath routing protocols in MANETs based on the InfiniBand (IB) QoS architecture. The basic idea of our approach is to enhance the path balancing to reduce congestion on overloaded links. This mechanism has enabled us to give critical applications higher priority when routing their packets across the network, to manage frequent connections and disconnections effectively and thus help reduce link failures and packet loss rates, and to reduce the overall power consumption as a consequence of the previous gains. We have tested the scheme on the IBMGTSim simulator and achieved significant improvements in QoS parameters compared to two well-known routing protocols: AODV and AOMDV.
    There is a type of network, called a "MANET", in which all components are mobile devices without any infrastructure. In this type of network, the devices cooperate autonomously to determine the routes between them. Because they are mobile, the devices compute more than one route instead of a single route to reduce the probability of transmission failure: if one route fails, the other routes remain intact.
    In addition, because the programs and services provided by these devices differ in importance, there is what is called "Quality of Service", in which the user prioritizes how programs and services consume the available resources. The common approach is for the user to set limits on the network usage rate of less important programs so that more resources remain available to the more important ones. This solution has many problems in this type of network, because the characteristics of the routes are unknown and not fixed, and they may be, or may change to, values below the limits set for the unimportant programs. The less important programs and services then become equal to the more important ones, which means that quality of service fails. Through our search for solutions and our study of different types of networks, we found a form of quality of service in the type of network called InfiniBand, where quality of service is applied by changing the number of packets sent by the programs: the devices send a larger number of packets belonging to important programs than packets belonging to less important programs. This is done using queues, where packets from important programs are placed in a different queue from the one that holds packets from unimportant programs. This solution has two important advantages. First, it does not affect the traditional method and can be used alongside it. Second, unlike the traditional method, the new method is not affected by the characteristics of the computed route or by changes in them, since the ratio of the number of packets stays the same no matter how the routes and their characteristics differ. After applying this approach, we found an improvement in transmission efficiency of up to 18% in delivery quality and 10% in arrival speed, while quality of service did not fail as it did with the traditional method.
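
The queue-based prioritisation described in this abstract, where important traffic is sent in a fixed ratio to less important traffic, can be sketched as a weighted round-robin over two queues. This is a simplified stand-in under stated assumptions, not the InfiniBand arbitration tables or the thesis's mechanism; the function name `weighted_round_robin`, the queue names, and the 3:1 weights are hypothetical.

```python
from collections import deque


def weighted_round_robin(queues, weights):
    """Yield packets from several queues in a fixed ratio.

    Assumption: in each round, up to weights[name] packets are taken
    from queues[name]; an empty queue simply gives up its share.
    """
    while any(queues.values()):
        for name, weight in weights.items():
            queue = queues[name]
            for _ in range(weight):
                if not queue:
                    break
                yield queue.popleft()


# Hypothetical traffic: six packets each from an important and a
# less important application, served in a 3:1 ratio.
queues = {
    "high": deque(f"H{i}" for i in range(1, 7)),
    "low": deque(f"L{i}" for i in range(1, 7)),
}
weights = {"high": 3, "low": 1}
send_order = list(weighted_round_robin(queues, weights))
```

Because the ratio is expressed in packets per round rather than as a bandwidth cap, it stays the same when the capacity of the chosen route changes, which is the property the abstract highlights.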

    Adaptive Routing Strategies for Modern High Performance Networks

    Today’s scalable high-performance applications heavily depend on the bandwidth characteristics of their communication patterns. Contemporary multi-stage interconnection networks suffer from network contention which might decrease application performance. Our experiments show that the effective bisection bandwidth of a non-blocking 512-node Clos network is as low as 38% if the network is routed statically. In this paper, we propose and analyze different adaptive routing schemes for those networks. We chose Myrinet/MX to implement our proposed routing schemes. Our best adaptive routing scheme is able to increase the effective bisection bandwidth to 77% for 512 nodes and 100% for smaller node counts. Thus, we show that our proposed adaptive routing schemes are able to improve network throughput significantly.

    Scalable Routing Methods for Low-Latency Interconnection Networks (Teichien sogo ketsugomo no tame no sukeraburuna rutingu shuho)


    On the Potential of NoC Virtualization for Multicore Chips


    Software-based fault-tolerant routing algorithm in multidimensional networks

    Massively parallel computing systems are being built with hundreds or thousands of components such as nodes, links, memories, and connectors. The failure of a component in such systems will not only reduce the computational power but also alter the network's topology. The software-based fault-tolerant routing algorithm is a popular routing scheme for achieving fault tolerance in networks. This algorithm was initially proposed only for two-dimensional networks (Suh et al., 2000). Since higher-dimensional networks have been widely employed in many contemporary massively parallel systems, this paper proposes an approach to extend this routing scheme to these indispensable higher-dimensional networks. Deadlock and livelock freedom and the performance of the presented algorithm have been investigated for networks with different dimensionality and various fault regions. Furthermore, performance results have been presented through simulation experiments.

    On quantifying fault patterns of the mesh interconnect networks

    One of the key issues in the design of Multiprocessor Systems-on-Chip (MP-SoCs), multicomputers, and peer-to-peer networks is the development of an efficient communication network that provides high throughput and low latency and can survive the failure of individual components. Generally, the faulty components may be coalesced into fault regions, which are classified into convex and concave shapes. In this paper, we propose a mathematical solution for counting the number of common fault patterns in a 2-D mesh interconnect network, including both convex (|-shape, | |-shape, ý-shape) and concave (L-shape, U-shape, T-shape, +-shape, H-shape) regions. The results presented in this paper, which have been validated through simulation experiments, can play a key role in the performance analysis of fault-tolerant routing algorithms and in measuring a network's fault tolerance, expressed as the probability of a disconnection.