62 research outputs found

    Partial multinode broadcast and partial exchange algorithms for d-dimensional meshes

    Caption title. "Revision of January 1992." Includes bibliographical references (p. 24-26). Supported by NSF (NSF-ECS-8519058) and ARO (DAAL03-86-K-0171). By Emmanouel A. Varvarigos and Dimitri P. Bertsekas

    New fault-tolerant routing algorithms for k-ary n-cube networks

    The interconnection network is one of the most crucial components in a multicomputer, as it greatly influences the overall system performance. Networks belonging to the family of k-ary n-cubes (e.g., tori and hypercubes) have been widely adopted in practical machines due to their desirable properties, including a low diameter, symmetry, regularity, and the ability to exploit the communication locality found in many real-world parallel applications. A routing algorithm specifies how a message selects a path from source to destination, and has a great impact on network performance. Routing in fault-free networks has been extensively studied in the past. As the network size scales up, the probability of processor and link failure also increases. It is therefore essential to design fault-tolerant routing algorithms that allow messages to reach their destinations even in the presence of faulty components (links and nodes). Although many fault-tolerant routing algorithms have been proposed for common multicomputer networks, e.g. hypercubes and meshes, little research has been devoted to developing fault-tolerant routing for well-known instances of k-ary n-cubes, such as 2- and 3-dimensional tori. Previous work on fault-tolerant routing has focused on designing algorithms with strict conditions imposed on the number of faulty components (nodes and links) or their locations in the network. Most existing fault-tolerant routing algorithms have assumed that a node knows either only the status of its neighbours (a local-information-based model) or the status of all nodes (a global-information-based model). The main challenge is to devise a simple and efficient way of representing limited global fault information that allows optimal or near-optimal fault-tolerant routing. This thesis proposes two new limited-global-information-based fault-tolerant routing algorithms for k-ary n-cubes, namely the unsafety vectors and probability vectors algorithms. While the first algorithm uses a deterministic approach, which has been widely employed by other existing algorithms, the second is the first to use probability-based fault-tolerant routing. These two algorithms have two important advantages over those already existing in the relevant literature. Both algorithms ensure fault tolerance under relaxed assumptions regarding the number of faulty components and their locations in the network. Furthermore, the new algorithms are more general in that they can easily be adapted to different topologies, including those that belong to the family of k-ary n-cubes (e.g. tori and hypercubes) and those that do not (e.g., generalised hypercubes and meshes). Since very little work has considered fault-tolerant routing in k-ary n-cubes, this study compares the relative performance merits of the two proposed algorithms, the unsafety and probability vectors, on these networks. The results reveal that for a practical number of faulty nodes, both algorithms achieve good performance levels. However, the probability vectors algorithm has the advantage of being simpler to implement. Since previous research has focused mostly on the hypercube, this study adapts the new algorithms to the hypercube in order to conduct a comparative study against the recently proposed safety vectors algorithm.
    Results from extensive simulation experiments demonstrate that our algorithms exhibit performance superior to the safety vectors algorithm in terms of reachability (the chance of a message reaching its destination), deviation from optimality (the average difference between the minimum distance and the actual routing distance), and looping (the chance of a message looping continuously in the network without reaching its destination).
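    The abstract describes the unsafety vectors and probability vectors algorithms only at a high level. Purely as an illustrative sketch (not the thesis implementation; the names route_step and prob and the probability estimates are hypothetical), the fragment below shows how a probability-vector style decision could be taken at a node of a k-ary n-cube: among the non-faulty neighbours lying on a minimal path, forward to the one with the highest estimated chance of reaching the destination.

```python
# Hypothetical sketch of a probability-vector style routing step in a k-ary
# n-cube (torus); names and probability estimates are illustrative only.

def minimal_steps(a, b, k):
    """Ring directions (+1, -1, or both) that shorten the distance from a to b."""
    fwd = (b - a) % k
    bwd = (a - b) % k
    if fwd < bwd:
        return (+1,)
    if bwd < fwd:
        return (-1,)
    return (+1, -1)  # both directions are equidistant on the ring

def route_step(current, dest, faulty, prob, k):
    """Choose the next hop for a message at `current` heading to `dest`.

    faulty : set of node tuples believed to be faulty
    prob   : dict mapping a node to an estimated probability of reaching dest
    Returns the chosen neighbour, or None if every minimal neighbour is faulty
    (a complete algorithm would then consider misrouting).
    """
    n = len(current)
    candidates = []
    for dim in range(n):
        if current[dim] == dest[dim]:
            continue  # no remaining offset in this dimension
        for step in minimal_steps(current[dim], dest[dim], k):
            nxt = list(current)
            nxt[dim] = (current[dim] + step) % k
            nxt = tuple(nxt)
            if nxt not in faulty:
                candidates.append(nxt)
    if not candidates:
        return None
    # Forward along the minimal link whose downstream node looks most reachable.
    return max(candidates, key=lambda node: prob.get(node, 0.5))
```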

    Minimum-time multidrop broadcast

    The multidrop communication model assumes that a message originated by a sender is sent along a path in a network and is communicated to each site along that path. In the presence of several concurrent senders, we require that the transmission paths be vertex-disjoint. The time analysis of such communication includes both start-up time and drop-off time terms. We determine the minimum time required to broadcast a message under this communication model in several classes of graphs.
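    The abstract does not spell out the cost model beyond naming start-up and drop-off terms. Under the assumed reading that each sender pays one start-up term plus one drop-off term per site visited on its vertex-disjoint path, a toy completion-time calculation might look as follows (all parameter names are hypothetical):

```python
# Toy multidrop timing model: one start-up term per sender, one drop-off term
# per site visited; with vertex-disjoint paths used concurrently, the slowest
# path determines the broadcast completion time.

def multidrop_time(path_lengths, t_startup, t_drop):
    """path_lengths: number of drop-off sites on each concurrent path."""
    return max(t_startup + sites * t_drop for sites in path_lengths)

# Example: three disjoint paths covering 4, 6 and 5 sites respectively.
print(multidrop_time([4, 6, 5], t_startup=10.0, t_drop=1.5))  # 19.0
```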

    Fault-tolerant adaptive and minimal routing in mesh-connected multicomputers using extended safety levels


    A Message Scheduling Scheme for All-to-All Personalized Communication on Ethernet Switched Clusters


    Performance evaluation of distributed crossbar switch hypermesh

    The interconnection network is one of the most crucial components in any multicomputer, as it greatly influences the overall system performance. Several recent studies have suggested that hypergraph networks, such as the Distributed Crossbar Switch Hypermesh (DCSH), exhibit superior topological and performance characteristics over many traditional graph networks, e.g. k-ary n-cubes. Previous work on the DCSH has focused on issues related to implementation and performance comparisons with existing networks. These comparisons have so far been confined to deterministic routing and unicast (one-to-one) communication. Using analytical models validated through simulation experiments, this thesis extends that analysis to include adaptive routing and broadcast communication. The study concentrates on wormhole switching, which has been widely adopted in practical multicomputers thanks to its low buffering requirement and the reduced dependence of latency on distance under low traffic. Adaptive routing has recently been proposed as a means of improving network performance, but while the comparative evaluation of adaptive and deterministic routing has been widely reported in the literature, the focus has been on graph networks. The first part of this thesis deals with adaptive routing, developing an analytical model to measure latency in the DCSH, which is then used throughout the rest of the work for performance comparisons. An investigation of different routing algorithms in this network is also presented. Conventional k-ary n-cubes have been the underlying topology of contemporary multicomputers, but it is only recently that adaptive routing has been incorporated into such systems. The thesis studies the relative performance merits of the DCSH and k-ary n-cubes under an adaptive routing strategy. The analysis takes into consideration real-world factors, such as router complexity and the bandwidth constraints imposed by implementation technology. However, in any network, the routing of unicast messages is not the only factor in traffic control. In many situations (for example, parallel iterative algorithms, memory update and invalidation procedures in shared-memory systems, and global notification of network errors), there is a significant requirement for broadcast traffic. The DCSH, by virtue of its use of hypergraph links, can implement broadcast operations particularly efficiently. The second part of the thesis examines how DCSH and k-ary n-cube performance is affected by the presence of a broadcast traffic component. In general, these studies demonstrate that because of their relatively high diameter, k-ary n-cubes perform poorly when message lengths are short. This is consistent with earlier, more simplistic analyses which led to the proposal of the express-cube, an enhancement of the basic k-ary n-cube structure that provides additional express channels, allowing messages to bypass groups of nodes along their paths. The final part of the thesis investigates whether this "partial bypassing" can compete with the "total bypassing" capability provided inherently by the DCSH topology.
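    The thesis derives its own analytical latency model for the DCSH; as background only, the sketch below uses the generic no-load wormhole-switching latency estimate (a per-hop router delay plus a single serialization term for the message length), which is enough to see why a low-diameter topology helps most when messages are short. Parameter names are illustrative.

```python
# Generic no-load wormhole latency estimate (not the thesis's analytical model):
# hop count contributes only a per-router delay, while the message length adds
# a single serialization term at the end of the pipeline.

def wormhole_latency(hops, message_flits, t_router, t_flit):
    """hops: routers traversed; message_flits: message length in flits;
    t_router: per-hop switching delay; t_flit: channel cycle time per flit."""
    return hops * t_router + message_flits * t_flit

# Short messages: the hop-count term dominates, so a low-diameter hypergraph
# network (few hops) beats a higher-diameter k-ary n-cube (many hops).
print(wormhole_latency(hops=3,  message_flits=32, t_router=2, t_flit=1))  # 38
print(wormhole_latency(hops=12, message_flits=32, t_router=2, t_flit=1))  # 56
```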

    Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing

    Book of Abstracts of CSC14, edited by Bora Uçar. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the École Normale Supérieure de Lyon, France, from 21 to 23 July 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 Workshop's focus was on combinatorial mathematics and algorithms in high-performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks, and eight poster presentations. All three invited talks focused on two fields of research: randomized algorithms for numerical linear algebra, and network analysis. The contributed talks and posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program "Investissements d'Avenir" ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM.

    A Cross-Layer Study of the Scheduling Problem

    This thesis is inspired by the need to study and understand the interdependence between the transmission powers and rates in an interference network, and how these two relate to the outcome of scheduled transmissions. A commonly used criterion that relates these two parameters is the Signal to Interference plus Noise Ratio (SINR). Under this criterion a transmission is successful if the SINR exceeds a threshold. The fact that this threshold is an increasing function of the transmission rate gives rise to a fundamental trade-off regarding the amount of time-sharing that must be permitted for optimal performance in accessing the wireless channel. In particular, it is not immediate whether more concurrent activations at lower rates would yield better performance than fewer concurrent activations at higher rates. Naturally, the balance depends on the performance objective under consideration. Analyzing this fundamental trade-off under a variety of performance objectives has been the main steering impetus of this thesis. We start by considering single-hop, static networks comprising a set of always-backlogged sources, each multicasting traffic to its corresponding destinations. We study the problem of joint scheduling and rate control under two performance objectives, namely sum throughput maximization and proportional fairness. Under sum throughput maximization, we observe that the optimal policy always activates the multicast source that sustains the highest rate. Under proportional fairness, we explicitly characterize the optimal policy under the assumption that the rate control and scheduling decisions are restricted to activating either a single source at any given time or all of them simultaneously. In the sequel, we extend our results in four ways, namely we (i) turn our focus to time-varying wireless networks, (ii) assume policies that have access to only a, perhaps inaccurate, estimate of the current channel state, (iii) consider a broader class of utility functions, and finally (iv) permit all possible rate control and scheduling actions. We introduce an online, gradient-based algorithm for a fading environment that selects the transmission rates at every decision instant, having access to only an estimate of the current channel state, so that the total user utility is maximized. In the event that more than one rate allocation is optimal, the introduced algorithm selects the one that minimizes the sum of transmission powers. We show that this algorithm is optimal among all algorithms that do not have access to a better estimate of the current channel state. Next, we turn our attention to the minimum-length scheduling problem, i.e., instead of a system with saturated sources, we assume that each network source has a finite amount of data traffic to deliver to its corresponding destination in minimum time. We consider both networks with time-invariant and time-varying channels under unicast traffic. In the time-invariant (or static) network case, we map the problem of finding a schedule of minimum length to finding a shortest path on a Directed Acyclic Graph (DAG). In the time-varying network case, we map the corresponding problem to a stochastic shortest path problem and provide an optimal solution through stochastic control methods. Finally, instead of considering a system where sources are always backlogged or have a finite amount of data traffic, we focus on bursty traffic.
    Our objective is to characterize the stable throughput region of a multi-hop network with a set of commodities of anycast traffic. We introduce a joint scheduling and routing policy that has access to only an estimate of the channel state, and we characterize the stable throughput region of the network under this policy. We also show that the introduced policy is optimal with respect to maximizing the stable throughput region of the network within a broad class of stationary, non-stationary, and anticipative policies.
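    To make the SINR criterion concrete: a set of concurrently activated links is feasible only if every link's SINR exceeds the threshold demanded by its chosen rate, and that threshold grows with the rate. The check below is an illustrative sketch under a Shannon-style rate-to-threshold assumption (the names and the threshold formula are assumptions, not the thesis's exact model); it exhibits the trade-off above, since activating more links raises interference and so lowers the rates each link can sustain.

```python
# Illustrative SINR feasibility check for a set of concurrently active links.
# Assumption: a link at rate r (over unit bandwidth) needs SINR >= 2**r - 1.

def sinr_feasible(powers, gains, noise, rates, bandwidth=1.0):
    """powers[i]  : transmit power of link i
    gains[i][j]   : channel gain from transmitter i to receiver j
    noise         : receiver noise power
    rates[i]      : target rate of link i
    Returns True if all links can be active simultaneously at their rates."""
    n = len(powers)
    for j in range(n):
        signal = powers[j] * gains[j][j]
        interference = sum(powers[i] * gains[i][j] for i in range(n) if i != j)
        sinr = signal / (noise + interference)
        threshold = 2.0 ** (rates[j] / bandwidth) - 1.0
        if sinr < threshold:
            return False
    return True
```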