    CLEX: Yet Another Supercomputer Architecture?

    We propose the CLEX supercomputer topology and routing scheme. We prove that CLEX can utilize a constant fraction of the total bandwidth for point-to-point communication, at delays proportional to the sum of the number of intermediate hops and the maximum physical distance between any two nodes. Moreover, by applying an asymmetric bandwidth assignment to the links, all-to-all communication can be realized $(1+o(1))$-optimally with regard to both bandwidth and delays. This is achieved at node degrees of $n^{\varepsilon}$, for an arbitrarily small constant $\varepsilon \in (0,1]$. In contrast, these results are impossible in any network featuring constant or polylogarithmic node degrees. Through simulation, we assess the benefits of an implementation of the proposed communication strategy. Our results indicate that, for a million processors, CLEX can increase bandwidth utilization and reduce average routing path length by factors of at least 10 and 5, respectively, in comparison to a torus network. Furthermore, the CLEX communication scheme features several other properties, such as deadlock-freedom, inherent fault-tolerance, and canonical partition into smaller subsystems.

    ON VULNERABILITY MEASURES OF NETWORKS

    As links and nodes of interconnection networks are exposed to failures, one of the most important features of a practical network design is fault tolerance. Vulnerability measures of communication networks are discussed, including connectivities, fault diameters, and measures based on the Hosoya-Wiener polynomial. An upper bound for the edge fault diameter of product graphs is proved.

    The edge fault-diameter of Cartesian graph bundles

    A Cartesian graph bundle is a generalization of a graph covering and a Cartesian graph product. Let $G$ be a $k_G$-edge-connected graph and let $\bar{D}_c(G)$ be the largest diameter of subgraphs of $G$ obtained by deleting $c < k_G$ edges. We prove that $\bar{D}_{a+b+1}(G) \le \bar{D}_a(F) + \bar{D}_b(B) + 1$ if $G$ is a graph bundle with fibre $F$ over base $B$, $a < k_F$, and $b < k_B$. As an auxiliary result, we prove that the edge-connectivity of a graph bundle $G$ is at least $k_F + k_B$.
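
    The definition of the edge fault-diameter above can be illustrated with a small brute-force sketch. This is not the paper's method: it simply enumerates every set of c deleted edges and takes the largest resulting diameter, which is only practical for tiny graphs. The function names and the 5-cycle example are assumptions chosen for illustration.

```python
from collections import deque
from itertools import combinations


def diameter(nodes, edges):
    """Return the diameter of an undirected graph, or None if it is disconnected."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    best = 0
    for source in nodes:
        # Breadth-first search gives hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < len(nodes):
            return None  # some node is unreachable
        best = max(best, max(dist.values()))
    return best


def edge_fault_diameter(nodes, edges, c):
    """Largest diameter over all subgraphs obtained by deleting c edges (brute force)."""
    worst = 0
    for removed in combinations(edges, c):
        remaining = [e for e in edges if e not in removed]
        d = diameter(nodes, remaining)
        if d is None:
            raise ValueError("c must be smaller than the edge-connectivity")
        worst = max(worst, d)
    return worst


# Hypothetical example: the 5-cycle, whose edge-connectivity is 2.
cycle_nodes = list(range(5))
cycle_edges = [(i, (i + 1) % 5) for i in range(5)]
```

    For the 5-cycle, deleting no edge leaves diameter 2, while deleting any single edge leaves a path on five nodes, so the edge fault-diameter for c = 1 is 4.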

    A Resilience Index for Process Safety Analysis

    Presentation. Qualitative risk analysis focuses on applying methods to prevent accidents in diverse process plants. The number resulting from the QRA says nothing about a system's ability to recover if a safety-related upset occurs in the process. Hence, a resilience study is required to provide this additional process-safety information. The resilience index is defined as the proportion of safety-related upsets from which the system is successfully recovered. Failure to recover depends on the type and quality of safety barriers, i.e. technology, but also on organizational principles. In this work, Monte Carlo simulation is carried out to obtain quantitative resilience estimates. These results provide a means to compare processes from a more general safety point of view.
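
    A minimal sketch of how such an index could be estimated by Monte Carlo simulation follows. The abstract does not describe the authors' model, so everything here is an assumption for illustration: each upset is met by independent safety barriers with fixed success probabilities, and recovery succeeds if at least one barrier works. The function name and parameters are hypothetical.

```python
import random


def estimate_resilience_index(barrier_success_probs, n_upsets, seed=0):
    """Estimate the fraction of simulated upsets from which the system recovers.

    Assumption (not from the source): barriers act independently, and a single
    working barrier is enough to recover from an upset.
    """
    rng = random.Random(seed)  # seeded for reproducible estimates
    recovered = 0
    for _ in range(n_upsets):
        # One Bernoulli trial per barrier; stop at the first barrier that works.
        if any(rng.random() < p for p in barrier_success_probs):
            recovered += 1
    return recovered / n_upsets


# Hypothetical process with two barriers (success probabilities 0.9 and 0.5).
index = estimate_resilience_index([0.9, 0.5], n_upsets=20_000)
```

    Under these assumptions the exact value is 1 - (1 - 0.9)(1 - 0.5) = 0.95, so the estimate should land close to that figure and can be compared across processes with different barrier sets.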

    Improving Scalability and Usability of Parallel Runtime Environments for High Availability and High Performance Systems

    The number of processors embedded in high performance computing platforms is growing daily to solve larger and more complex problems. Hence, parallel runtime environments have to support and adapt to underlying platforms that require scalability and fault management in increasingly dynamic environments. This dissertation aims to analyze, understand and improve the state-of-the-art mechanisms for managing highly dynamic, large-scale applications. It demonstrates that new scalable and fault-tolerant topologies, combined with rerouting techniques, yield parallel runtime environments that can efficiently and reliably deliver sets of information to a large number of processes. Several important graph properties are provided to illustrate the theoretical capability of these topologies in terms of both scalability and fault-tolerance: reasonable degree, regularity, low diameter, symmetry, low cost factor, low message traffic density, optimal connectivity, low fault-diameter and strong resilience. The dissertation builds a communication framework based on these topologies to support parallel runtime environments. Such a framework can handle multiple types of messages, e.g., unicast, multicast, broadcast and all-gather. Additionally, the communication framework has been formally verified to work in both normal and failure circumstances without creating any of the common problems such as broadcast storms, deadlocks and non-progress cycles.