    New fault-tolerant routing algorithms for k-ary n-cube networks

    The interconnection network is one of the most crucial components in a multicomputer, as it greatly influences the overall system performance. Networks belonging to the family of k-ary n-cubes (e.g., tori and hypercubes) have been widely adopted in practical machines due to their desirable properties, including a low diameter, symmetry, regularity, and the ability to exploit the communication locality found in many real-world parallel applications. A routing algorithm specifies how a message selects a path from source to destination, and has a great impact on network performance. Routing in fault-free networks has been extensively studied in the past. As the network size scales up, the probability of processor and link failure also increases. It is therefore essential to design fault-tolerant routing algorithms that allow messages to reach their destinations even in the presence of faulty components (links and nodes). Although many fault-tolerant routing algorithms have been proposed for common multicomputer networks, e.g. hypercubes and meshes, little research has been devoted to developing fault-tolerant routing for well-known versions of k-ary n-cubes, such as 2- and 3-dimensional tori. Previous work on fault-tolerant routing has focused on designing algorithms with strict conditions imposed on the number of faulty components (nodes and links) or their locations in the network. Most existing fault-tolerant routing algorithms have assumed that a node knows either only the status of its neighbours (such a model is called local-information-based) or the status of all nodes (global-information-based). The main challenge is to devise a simple and efficient way of representing limited global fault information that allows optimal or near-optimal fault-tolerant routing. This thesis proposes two new limited-global-information-based fault-tolerant routing algorithms for k-ary n-cubes, namely the unsafety vectors and probability vectors algorithms. 
While the first algorithm uses a deterministic approach, which has been widely employed by other existing algorithms, the second algorithm is the first that uses probability-based fault-tolerant routing. These two algorithms have two important advantages over those already existing in the relevant literature. Both algorithms ensure fault tolerance under relaxed assumptions regarding the number of faulty components and their locations in the network. Furthermore, the new algorithms are more general in that they can easily be adapted to different topologies, including those that belong to the family of k-ary n-cubes (e.g. tori and hypercubes) and those that do not (e.g., generalised hypercubes and meshes). Since very little work has considered fault-tolerant routing in k-ary n-cubes, this study compares the relative performance merits of the two proposed algorithms, the unsafety and probability vectors, on these networks. The results reveal that for a practical number of faulty nodes, both algorithms achieve good performance levels. However, the probability vectors algorithm has the advantage of being simpler to implement. Since previous research has focused mostly on the hypercube, this study adapts the new algorithms to the hypercube in order to conduct a comparative study against the recently proposed safety vectors algorithm. Results from extensive simulation experiments demonstrate that our algorithms exhibit superior performance in terms of reachability (chances of a message reaching its destination), deviation from optimality (average difference between minimum distance and actual routing distance), and looping (chances of a message continuously looping in the network without reaching destination) to the safety vectors
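
The deviation-from-optimality metric named in this abstract (average difference between minimum distance and actual routing distance) can be made concrete with a small sketch. This is an illustration only, not the thesis's simulator: the function names, the tuple representation of node addresses, and the assumption of bidirectional (wrap-around) links are all hypothetical choices made here.

```python
def torus_distance(src, dst, k):
    """Minimal hop count between two nodes of a k-ary n-cube.

    Assumes bidirectional links with wrap-around in every dimension;
    nodes are tuples of n coordinates, each in range(k).
    """
    if len(src) != len(dst):
        raise ValueError("source and destination must have the same dimension")
    total = 0
    for a, b in zip(src, dst):
        delta = abs(a - b) % k
        # Go whichever way round the ring is shorter.
        total += min(delta, k - delta)
    return total


def deviation_from_optimality(routes, k):
    """Average of (hops actually taken - minimal distance).

    `routes` is a list of (src, dst, hops_taken) for delivered messages.
    """
    if not routes:
        raise ValueError("no delivered messages to average over")
    extras = [hops - torus_distance(src, dst, k) for src, dst, hops in routes]
    return sum(extras) / len(extras)


if __name__ == "__main__":
    # Two delivered messages in a 4-ary 2-cube (a 4x4 torus):
    # the first took a 2-hop detour, the second was routed minimally.
    sample = [((0, 0), (3, 1), 4), ((0, 0), (1, 1), 2)]
    print(deviation_from_optimality(sample, 4))
```

With k = 2 the same distance function reduces to the Hamming distance, which is the hypercube case the abstract uses for its comparison.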

    Fault-tolerant adaptive and minimal routing in mesh-connected multicomputers using extended safety levels

    New Fault Tolerant Multicast Routing Techniques to Enhance Distributed-Memory Systems Performance

    Distributed-memory systems are key to achieving high-performance computing and are the most favoured architectures for advanced research problems. Mesh-connected multicomputers are among the most popular architectures and have been implemented in many distributed-memory systems. These systems must support communication operations efficiently to achieve good performance. The wormhole switching technique, in which a packet is divided into small flits, has been widely used in the design of distributed-memory systems. Multicast communication, in which one source node sends the same message to several destination nodes, has also been widely used in such systems. Fault tolerance refers to the ability of the system to operate correctly in the presence of faults. Developing fault-tolerant multicast routing algorithms for 2D mesh networks is an important issue. This dissertation presents new fault-tolerant multicast routing algorithms to enhance the performance of distributed-memory systems that use wormhole-routed 2D meshes. The algorithms are described for fault-tolerant routing in 2D mesh networks, but they can also be extended to other topologies. They combine a unicast-based multicast algorithm with tree-based multicast algorithms. They work effectively for the most commonly encountered fault models in mesh networks: f-rings, f-chains and concave fault regions. It is shown that the proposed routing algorithms remain effective even in the presence of a large number of fault regions and of large fault regions. The algorithms are proved to be deadlock-free, and the problem of overlapping fault regions is solved. Four essential performance metrics in mesh networks are considered and calculated. The algorithms use limited-global-information-based multicasting, which is a compromise between the local-information-based and global-information-based approaches. Data mining is used to validate the results and to enlarge the sample. 
The proposed new multicast routing techniques are used to enhance the performance of distributed-memory systems. Simulation results are presented to demonstrate the efficiency of the proposed algorithms

    Performance analysis of wormhole routing in multicomputer interconnection networks

    Perhaps the most critical component in determining the ultimate performance potential of a multicomputer is its interconnection network, the hardware fabric supporting communication among individual processors. The message latency and throughput of such a network are affected by many factors of which topology, switching method, routing algorithm and traffic load are the most significant. In this context, the present study focuses on a performance analysis of k-ary n-cube networks employing wormhole switching, virtual channels and adaptive routing, a scenario of especial interest to current research. This project aims to build upon earlier work in two main ways: constructing new analytical models for k-ary n-cubes, and comparing the performance merits of cubes of different dimensionality. To this end, some important topological properties of k-ary n-cubes are explored initially; in particular, expressions are derived to calculate the number of nodes at/within a given distance from a chosen centre. These results are important in their own right but their primary significance here is to assist in the construction of new and more realistic analytical models of wormhole-routed k-ary n-cubes. An accurate analytical model for wormhole-routed k-ary n-cubes with adaptive routing and uniform traffic is then developed, incorporating the use of virtual channels and the effect of locality in the traffic pattern. New models are constructed for wormhole k-ary n-cubes, with the ability to simulate behaviour under adaptive routing and non-uniform communication workloads, such as hotspot traffic, matrix-transpose and digit-reversal permutation patterns. The models are equally applicable to unidirectional and bidirectional k-ary n-cubes and are significantly more realistic than any in use up to now. With this level of accuracy, the effect of each important network parameter on the overall network performance can be investigated in a more comprehensive manner than before. 
Finally, k-ary n-cubes of different dimensionality are compared using the new models. The comparison takes account of various traffic patterns and implementation costs, using both pin-out and bisection bandwidth as metrics. Networks with both normal and pipelined channels are considered. While previous similar studies have only taken account of network channel costs, our model incorporates router costs as well, thus generating more realistic results. In fact, the results of this work differ markedly from those yielded by earlier studies which assumed deterministic routing and uniform traffic, illustrating the importance of using accurate models to conduct such analyses
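
The topological property mentioned in this abstract, the number of nodes at or within a given distance from a chosen centre, can be computed numerically as a cross-check on closed-form expressions. The sketch below is not the thesis's derivation: it assumes a bidirectional k-ary n-cube, and the function names are hypothetical. It counts nodes per distance in a single ring of k nodes and then combines n such rings by convolution, since the distance in a k-ary n-cube is the sum of the per-dimension ring distances.

```python
from itertools import accumulate


def ring_profile(k):
    """counts[d] = number of nodes at distance d from a fixed node in a
    bidirectional ring of k nodes."""
    counts = [0] * (k // 2 + 1)
    for offset in range(k):
        counts[min(offset, k - offset)] += 1
    return counts


def nodes_at_distance(k, n):
    """profile[d] = number of nodes at distance exactly d from a chosen
    centre in a bidirectional k-ary n-cube."""
    ring = ring_profile(k)
    profile = [1]  # zero dimensions: only the centre itself
    for _ in range(n):
        nxt = [0] * (len(profile) + len(ring) - 1)
        for i, a in enumerate(profile):
            for j, b in enumerate(ring):
                nxt[i + j] += a * b
        profile = nxt
    return profile


def nodes_within_distance(k, n):
    """Cumulative version: entry d counts nodes at distance <= d."""
    return list(accumulate(nodes_at_distance(k, n)))


if __name__ == "__main__":
    print(nodes_at_distance(2, 3))      # binary 3-cube (hypercube)
    print(nodes_within_distance(4, 2))  # 4x4 torus
```

By symmetry the choice of centre does not matter, and the per-distance counts always sum to k**n, which gives a quick sanity check.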

    Parallelizing Timed Petri Net simulations

    The possibility of using parallel processing to accelerate the simulation of Timed Petri Nets (TPN's) was studied. It was recognized that complex system development tools often transform system descriptions into TPN's or TPN-like models, which are then simulated to obtain information about system behavior. Viewed this way, it was important that the parallelization of TPN's be as automatic as possible, to admit the possibility of the parallelization being embedded in the system design tool. Later years of the grant were devoted to examining the problem of joint performance and reliability analysis, to explore whether both types of analysis could be accomplished within a single framework. In this final report, the results of our studies are summarized. We believe that the problem of parallelizing TPN's automatically for MIMD architectures has been almost completely solved for a large and important class of problems. Our initial investigations into joint performance/reliability analysis are two-fold; it was shown that Monte Carlo simulation, with importance sampling, offers promise of joint analysis in the context of a single tool, and methods for the parallel simulation of general Continuous Time Markov Chains, a model framework within which joint performance/reliability models can be cast, were developed. However, very much more work is needed to determine the scope and generality of these approaches. The results obtained in our two studies, future directions for this type of work, and a list of publications are included

    The 2nd Conference of PhD Students in Computer Science

    Selected topics in robotics for space exploration

    Papers and abstracts included represent both formal presentations and experimental demonstrations at the Workshop on Selected Topics in Robotics for Space Exploration which took place at NASA Langley Research Center, 17-18 March 1993. The workshop was cosponsored by the Guidance, Navigation, and Control Technical Committee of the NASA Langley Research Center and the Center for Intelligent Robotic Systems for Space Exploration (CIRSSE) at RPI, Troy, NY. Participation was from industry, government, and other universities with close ties to either Langley Research Center or to CIRSSE. The presentations were very broad in scope with attention given to space assembly, space exploration, flexible structure control, and telerobotics

    A Hardware Verification Methodology for an Interconnection Network with fast Process Synchronization

    Shrinking process node sizes allow more and more functionality to be integrated into a single chip design. At the same time, the mask costs of manufacturing a new chip increase steadily. Industry can absorb this cost increase by selling more chips. Furthermore, innovative new chip designs carry a higher risk. Industry therefore changes only small parts of a chip design between generations to minimize its risk. Thus, innovative new chip designs can only be realized by research institutes, which do not face the same cost restrictions and market pressure as industry. One such innovative research project is EXTOLL, which is developed by the Computer Architecture Group of the University of Heidelberg. It is a new interconnection network for high-performance computing that targets the problems of existing, commercially available interconnection networks. EXTOLL is optimized for high bandwidth, low latency, and a high message rate. Low latency and a high message rate in particular are becoming more important for modern interconnection networks. As networks grow in size, the same computational problem is distributed across more nodes. This leads to a smaller data granularity and to more, smaller messages that have to be transported by the interconnection network. This thesis addresses the problem of smaller messages in the interconnection network. It develops a new network protocol that is optimized for small messages and reduces the protocol overhead required to send them. Furthermore, growing network sizes introduce a reliability problem, which the efficient network protocol developed here also addresses. The smaller data granularity also increases the need for efficient barrier synchronization. This thesis develops such a hardware barrier synchronization, using a new approach that integrates the barrier functionality into the interconnection network. 
The mask costs of manufacturing an ASIC make it difficult for a research institute to build one. A research institute cannot afford a re-spin because of the costs, so there is pressure to get the design right the first time. One way to avoid a re-spin is functional verification prior to submission. A complete and comprehensive verification methodology is developed for the EXTOLL interconnection network. Thanks to its structured approach, the functional verification can be carried out with limited resources in a short time frame. Additionally, the verification methodology can support different target technologies for the design with very little overhead

    Efficient Passive Clustering and Gateways selection MANETs

    Passive clustering does not employ control packets to collect topological information in ad hoc networks. In our proposal, we avoid frequent changes in the cluster architecture caused by repeated election and re-election of cluster heads and gateways. Our primary objective has been to make passive clustering more practical by employing an optimal number of gateways and reducing the number of rebroadcast packets

    Interrupt-generating active data objects

    An investigation is presented into an interrupt-generating object model which is designed to reduce the effort of programming distributed memory multicomputer networks. The object model is aimed at the natural modelling of problem domains in which a number of concurrent entities interrupt one another as they lay claim to shared resources. The proposed computational model provides for the safe encapsulation of shared data, and incorporates inherent arbitration for simultaneous access to the data. It supplies a predicate triggering mechanism for use in conditional synchronization and as an alternative mechanism to polling. Linguistic support for the proposal requires a novel form of control structure which is able to interface sensibly with interrupt-generating active data objects. The thesis presents the proposal as an elemental language structure, with axiomatic guarantees which enforce safety properties and aid in program proving. The established theory of CSP is used to reason about the object model and its interface. An overview is presented of a programming language called HUL, whose semantics reflect the proposed computational model. Using the syntax of HUL, the application of the interrupt-generating active data object is illustrated. A range of standard concurrent problems is presented to demonstrate the properties of the interrupt-generating computational model. Furthermore, the thesis discusses implementation considerations which enable the model to be mapped precisely onto multicomputer networks, and which sustain the abstract programming level provided by the interrupt-generating active data object in the wider programming structures of HUL