110 research outputs found
New Fault Tolerant Multicast Routing Techniques to Enhance Distributed-Memory Systems Performance
Distributed-memory systems are a key to achieve high performance computing and the most favorable architectures used in advanced research problems. Mesh connected multicomputer are one of the most popular architectures that have been implemented in many distributed-memory systems. These systems must support communication operations efficiently to achieve good performance. The wormhole switching technique has been widely used in design of distributed-memory systems in which the packet is divided into small flits. Also, the multicast communication has been widely used in distributed-memory systems which is one source node sends the same message to several destination nodes. Fault tolerance refers to the ability of the system to operate correctly in the presence of faults. Development of fault tolerant multicast routing algorithms in 2D mesh networks is an important issue. This dissertation presents, new fault tolerant multicast routing algorithms for distributed-memory systems performance using wormhole routed 2D mesh. These algorithms are described for fault tolerant routing in 2D mesh networks, but it can also be extended to other topologies. These algorithms are a combination of a unicast-based multicast algorithm and tree-based multicast algorithms. These algorithms works effectively for the most commonly encountered faults in mesh networks, f-rings, f-chains and concave fault regions. It is shown that the proposed routing algorithms are effective even in the presence of a large number of fault regions and large size of fault region. These algorithms are proved to be deadlock-free. Also, the problem of fault regions overlap is solved. Four essential performance metrics in mesh networks will be considered and calculated; also these algorithms are a limited-global-information-based multicasting which is a compromise of local-information-based approach and global-information-based approach. Data mining is used to validate the results and to enlarge the sample. The proposed new multicast routing techniques are used to enhance the performance of distributed-memory systems. Simulation results are presented to demonstrate the efficiency of the proposed algorithms
Recommended from our members
A survey of routing techniques in store-and-forward and wormhole interconnects.
This paper presents an overview of algorithms for directing messages through networks of varying topology. These are commonly referred to as routing algorithms in the literature that is presented. In addition to providing background on networking terminology and router basics, the paper explains the issues of deadlock and livelock as they apply to routing. After this, there is a discussion of routing algorithms for both store-and-forward and wormhole-switched networks. The paper covers both algorithms that do and do not adapt to conditions in the network. Techniques targeting structured as well as irregular topologies are discussed. Following this, strategies for routing in the presence of faulty nodes and links in the network are described
Network-on-Chip
Limitations of bus-based interconnections related to scalability, latency, bandwidth, and power consumption for supporting the related huge number of on-chip resources result in a communication bottleneck. These challenges can be efficiently addressed with the implementation of a network-on-chip (NoC) system. This book gives a detailed analysis of various on-chip communication architectures and covers different areas of NoCs such as potentials, architecture, technical challenges, optimization, design explorations, and research directions. In addition, it discusses current and future trends that could make an impactful and meaningful contribution to the research and design of on-chip communications and NoC systems
Cost Effective Routing Implementations for On-chip Networks
Arquitecturas de múltiples núcleos como multiprocesadores (CMP) y soluciones multiprocesador para sistemas dentro del chip (MPSoCs) actuales se basan en la eficacia de las redes dentro del chip (NoC) para la comunicación entre los diversos núcleos. Un diseño eficiente de red dentro del chip debe ser escalable y al mismo tiempo obtener valores ajustados de área, latencia y consumo de energÃa. Para diseños de red dentro del chip de propósito general se suele usar topologÃas de malla 2D ya que se ajustan a la distribución del chip. Sin embargo, la aparición de nuevos retos debe ser abordada por los diseñadores. Una mayor probabilidad de defectos de fabricación, la necesidad de un uso optimizado de los recursos para aumentar el paralelismo a nivel de aplicación o la necesidad de técnicas eficaces de ahorro de energÃa, puede ocasionar patrones de irregularidad en las topologÃas. Además, el soporte para comunicación colectiva es una caracterÃstica buscada para abordar con eficacia las necesidades de comunicación de los protocolos de coherencia de caché. En estas condiciones, un encaminamiento eficiente de los mensajes se convierte en un reto a superar.
El objetivo de esta tesis es establecer las bases de una nueva arquitectura para encaminamiento distribuido basado en lógica que es capaz de adaptarse a cualquier topologÃa irregular derivada de una estructura de malla 2D, proporcionando asà una cobertura total para cualquier caso resultado de soportar los retos mencionados anteriormente. Para conseguirlo, en primer lugar, se parte desde una base, para luego analizar una evolución de varios mecanismos, y finalmente llegar a una implementación, que abarca varios módulos para alcanzar el objetivo mencionado anteriormente. De hecho, esta última implementación tiene por nombre eLBDR (effective Logic-Based Distributed Routing). Este trabajo cubre desde el primer mecanismo, LBDR, hasta el resto de mecanismos que han surgido progresivamente.Rodrigo MocholÃ, S. (2010). Cost Effective Routing Implementations for On-chip Networks [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8962Palanci
Doctor of Philosophy
dissertationOver the last decade, cyber-physical systems (CPSs) have seen significant applications in many safety-critical areas, such as autonomous automotive systems, automatic pilot avionics, wireless sensor networks, etc. A Cps uses networked embedded computers to monitor and control physical processes. The motivating example for this dissertation is the use of fault- tolerant routing protocol for a Network-on-Chip (NoC) architecture that connects electronic control units (Ecus) to regulate sensors and actuators in a vehicle. With a network allowing Ecus to communicate with each other, it is possible for them to share processing power to improve performance. In addition, networked Ecus enable flexible mapping to physical processes (e.g., sensors, actuators), which increases resilience to Ecu failures by reassigning physical processes to spare Ecus. For the on-chip routing protocol, the ability to tolerate network faults is important for hardware reconfiguration to maintain the normal operation of a system. Adding a fault-tolerance feature in a routing protocol, however, increases its design complexity, making it prone to many functional problems. Formal verification techniques are therefore needed to verify its correctness. This dissertation proposes a link-fault-tolerant, multiflit wormhole routing algorithm, and its formal modeling and verification using two different methodologies. An improvement upon the previously published fault-tolerant routing algorithm, a link-fault routing algorithm is proposed to relax the unrealistic node-fault assumptions of these algorithms, while avoiding deadlock conservatively by appropriately dropping network packets. This routing algorithm, together with its routing architecture, is then modeled in a process-algebra language LNT, and compositional verification techniques are used to verify its key functional properties. As a comparison, it is modeled using channel-level VHDL which is compiled to labeled Petri-nets (LPNs). Algorithms for a partial order reduction method on LPNs are given. An optimal result is obtained from heuristics that trace back on LPNs to find causally related enabled predecessor transitions. Key observations are made from the comparison between these two verification methodologies
Efficient mechanisms to provide fault tolerance in interconnection networks for pc clusters
Actualmente, los clusters de PC son un alternativa rentable a los computadores paralelos.
En estos sistemas, miles de componentes (procesadores y/o discos duros) se conectan a través de redes de interconexión de altas prestaciones.
Entre las tecnologÃas de red actualmente disponibles para construir clusters, InfiniBand (IBA) ha emergido como un nuevo estándar de interconexión para clusters.
De hecho, ha sido adoptado por muchos de los sistemas más potentes construidos actualmente (lista top500).
A medida que el número de nodos aumenta en estos sistemas, la red de interconexión también crece.
Junto con el aumento del número de componentes la probabilidad de averÃas aumenta dramáticamente, y asÃ, la tolerancia a fallos en el sistema en general, y de la red de interconexión en particular, se convierte en una necesidad.
Desafortunadamente, la mayor parte de las estrategias de encaminamiento tolerantes a fallos propuestas para los computadores masivamente paralelos no pueden ser aplicadas porque el encaminamiento y las transiciones de canal virtual son deterministas en IBA, lo que impide que los paquetes eviten los fallos.
Por lo tanto, son necesarias nuevas estrategias para tolerar fallos.
Por ello, esta tesis se centra en proporcionar los niveles adecuados de tolerancia a fallos a los clusters de PC, y en particular a las redes IBA.
En esta tesis proponemos y evaluamos varios mecanismos adecuados para las redes de interconexión para clusters.
El primer mecanismo para proporcionar tolerancia a fallos en IBA (al que nos referimos como encaminamiento tolerante a fallos basado en transiciones; TFTR) consiste en usar varias rutas disjuntas entre cada par de nodos origen-destino y seleccionar la ruta apropiada en el nodo fuente usando el mecanismo APM proporcionado por IBA.
Consiste en migrar las rutas afectadas por el fallo a las rutas alternativas sin fallos.
Sin embargo, con este fin, es necesario un algoritmo eficiente de encaminamiento capaz de proporcionar suficientesMontañana Aliaga, JM. (2008). Efficient mechanisms to provide fault tolerance in interconnection networks for pc clusters [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/2603Palanci
The Effect Of Hot Spots On The Performance Of Mesh--Based Networks
Direct network performance is affected by different design parameters which include number of virtual channels, number of ports, routing algorithm, switching technique, deadlock handling technique, packet size, and buffer size. Another factor that affects network performance is the traffic pattern. In this thesis, we study the effect of hotspot traffic on system performance. Specifically, we study the effect of hotspot factor, hotspot number, and hot spot location on the performance of mesh-based networks. Simulations are run on two network topologies, both the mesh and torus. We pay more attention to meshes because they are widely used in commercial machines. Comparisons between oblivious wormhole switching and chaotic packet switching are reported. Overall packet switching proved to be more efficient in terms of throughput when compared to wormhole switching. In the case of uniform random traffic, it is shown that the differences between chaotic and oblivious routing are indistinguishable. Networks with low number of hotspots show better performance. As the number of hotspots increases network latency tends to increase. It is shown that when the hotspot factor increases, performance of packet switching is better than that of wormhole switching. It is also shown that the location of hotspots affects network performance particularly with the oblivious routers since their achieved latencies proved to be more vulnerable to changes in the hotspot location. It is also shown that the smaller the size of the network the earlier network saturation occurs. Further, it is shown that the chaos router’s adaptivity is useful in this case. Finally, for tori, performance is not greatly affected by hotspot presence. This is mostly due to the symmetric nature of tori
- …