80 research outputs found

    Near-optimal broadcast in all-port wormhole-routed hypercubes using error-correcting codes

    A new broadcasting method is presented for hypercubes with a wormhole routing mechanism. The communication model assumed allows an n-dimensional hypercube to have at most n concurrent I/O communications along its ports. It assumes a distance insensitivity of (n + 1) with no intermediate reception capability for the nodes. The approach is based on determining a set of nodes, called stations, in the hypercube. Once the stations are identified, node-disjoint paths are formed from the source to all stations. The broadcast is accomplished by first sending the message to all stations, which then inform the rest of the nodes. To establish node-disjoint paths between the source node and all stations, we introduce a new routing strategy. We prove that multicasting can be done in one routing step as long as the number of destination nodes is at most n in an n-dimensional hypercube. The number of broadcasting steps using our routing is equal to or smaller than that obtained in an earlier work; this number is optimal for all hypercube dimensions n ≤ 12, except for n = 10.

    I/O embedding and broadcasting in star interconnection networks

    The issues of communication between a host or central controller and the processors in large interconnection networks are very important and have been studied in the past by several researchers. A plethora of problems arise when processors are asked to exchange information on parallel computers whose processors are interconnected according to a specific topology. In robust networks, it is desirable at times to send (receive) data/control information to (from) all the processors in minimal time. This type of communication is commonly referred to as broadcasting. To speed up broadcasting in a given network without modifying its topology, certain processors called stations can be specified to act as relay agents. In this thesis, broadcasting issues in a star-based interconnection network are studied. The model adopted assumes all-port communication and a wormhole switching mechanism. Initially, the problem treated is one of finding the minimum number of stations required to cover all the nodes in the star graph with i-adjacency. We consider 1-, 2-, and 3-adjacencies and determine the upper bound on the number of stations required to cover the nodes in each case. After deriving the number of stations, two algorithms are designed to broadcast the messages first from the host to the stations, and then from the stations to the remaining nodes. In addition, a Binary-based Algorithm is designed to allow routing in the network by working directly on the binary labels assigned to the star graph. No look-up table is consulted during routing, and a minimum number of bits is used to represent a node label. At the end, the thesis sheds light on another algorithm for routing using parallel paths in the star network.

    Analysis of algorithms for online routing and scheduling in networks

    We study situations in which an algorithm must make decisions about how to best route and schedule data transfer requests in a communication network before each transfer leaves its source. For some situations, such as those requiring quality of service guarantees, this is essential. For other situations, doing work in advance can simplify decisions in transit and increase the speed of the network. In order to reflect realistic scenarios, we require that our algorithms be online, or make their decisions without knowing future requests. We measure the efficiency of an online algorithm by its competitive ratio, which is the maximum ratio, over all request sequences, of the cost of the online algorithm's solution to that of an optimal solution constructed by knowing all the requests in advance. We identify and study two distinct variations of this general problem. In the first, data transfer requests are permanent virtual circuit requests in a circuit-switched network and the goal is to minimize the network congestion caused by the route assignment. In the second variation, data transfer requests are packets in a packet-switched network and the goal is to minimize the makespan of the schedule, or the time that the last packet reaches its destination. We present new lower bounds on the competitive ratio of any online algorithm with respect to both network congestion and makespan. We consider two greedy online algorithms for permanent virtual circuit routing on arbitrary networks with unit capacity links, and prove both lower and upper bounds on their competitive ratios. While these greedy algorithms are not optimal, they can be expected to perform well in many circumstances and require less time to make a decision, when compared to a previously discovered asymptotically optimal online algorithm. For the online packet routing and scheduling problem, we consider an algorithm which simply assigns to each packet a priority based upon its arrival time. No packet is delayed by another packet with a lower priority. We analyze the competitive ratio of this algorithm on linear array, tree, and ring networks.
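
    The competitive ratio described in this abstract can be written compactly. The notation below is an assumption introduced here for illustration, not taken from the thesis: ALG(σ) is the cost of the online algorithm on request sequence σ, and OPT(σ) is the cost of an optimal offline solution on the same sequence. The abstract speaks of a maximum; a supremum is used here because the set of request sequences may be infinite.

```latex
\[
  c(\mathrm{ALG}) \;=\; \sup_{\sigma} \frac{\mathrm{ALG}(\sigma)}{\mathrm{OPT}(\sigma)}
\]
```

    Under this reading, the cost is network congestion in the virtual circuit setting and makespan in the packet scheduling setting.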

    Optical control plane: theory and algorithms

    In this thesis we propose a novel way to achieve global network information dissemination in which some wavelengths are reserved exclusively for global control information exchange. We study the routing and wavelength assignment problem for the special communication pattern of non-blocking all-to-all broadcast in WDM optical networks. We provide efficient solutions to reduce the number of wavelengths needed for non-blocking all-to-all broadcast, in the absence of wavelength converters, for network information dissemination. We adopt an approach in which we consider all nodes to be tap-and-continue capable, thus studying light-trees rather than lightpaths. To the best of our knowledge, this thesis is the first to consider “tap-and-continue” capable nodes in the context of conflict-free all-to-all broadcast. The problem of all-to-all broadcast using individual lightpaths has been proven to be an NP-complete problem [6]. We provide optimal RWA solutions for conflict-free all-to-all broadcast for some particular cases of regular topologies, namely the ring, the torus and the hypercube. We make an important contribution on hypercube decomposition into edge-disjoint structures. We also present near-optimal polynomial-time solutions for the general case of arbitrary topologies. Furthermore, we apply for the first time the “cactus” representation of all minimum edge-cuts of graphs with arbitrary topologies to the problem of all-to-all broadcast in optical networks. Using this representation recursively we obtain near-optimal results for the number of wavelengths needed by the non-blocking all-to-all broadcast. The second part of this thesis focuses on the more practical case of multi-hop RWA for non-blocking all-to-all broadcast in the presence of Optical-Electrical-Optical conversion. We propose two simple but efficient multi-hop RWA models. In addition to reducing the number of wavelengths we also concentrate on reducing the number of optical receivers, another important optical resource. We analyze these models on the ring and the hypercube, as special cases of regular topologies. Lastly, we develop a good upper bound on the number of wavelengths in the case of non-blocking multi-hop all-to-all broadcast on networks with arbitrary topologies and offer a heuristic algorithm to achieve it. We propose a novel network partitioning method based on “virtual perfect matching” for use in the RWA heuristic algorithm.

    Efficient embedding of virtual hypercubes in irregular WDM optical networks

    This thesis addresses one of the important issues in designing future WDM optical networks. Such networks are expected to employ an all-optical control plane for dissemination of network state information. It has recently been suggested that an efficient control plane will require non-blocking communication infrastructure and routing techniques. However, the irregular nature of most WDM networks does not lend itself to efficient non-blocking communications. It has recently been shown that hypercubes offer some very efficient non-blocking solutions for all-to-all broadcast operations, which would be very attractive for control plane implementation. Such results can be utilized by embedding virtual structures in the physical network and routing according to the properties of the virtual architecture. We emphasize the hypercube due to its proven usefulness. In this thesis we propose three efficient heuristic methods for embedding a virtual hypercube in an irregular host network such that each node in the host network is either a hypercube node or a neighbor of a hypercube node. The latter is called a “satellite” or “secondary” node. These schemes follow a step-by-step procedure for the embedding and for finding the physical path implementation of the virtual links while attempting to optimize certain metrics such as the number of wavelengths on each link and the average length of virtual link mappings. We have designed software that takes the adjacency list of an irregular topology as input and provides the adjacency list of a hypercube embedded in the original network. We executed this software on a number of irregular networks with different connectivities and compared the behavior of each of the three algorithms. The algorithms are compared with respect to their performance in trying to optimize several metrics. We also compare our algorithms to an existing algorithm from the literature.

    Quarc: an architecture for efficient on-chip communication

    The exponential downscaling of the feature size has enforced a paradigm shift from computation-based design to communication-based design in system on chip development. Buses, the traditional communication architecture in systems on chip, are incapable of addressing the increasing bandwidth requirements of future large systems. Networks on chip have emerged as an interconnection architecture offering unique solutions to the technological and design issues related to communication in future systems on chip. The transition from buses as a shared medium to networks on chip as a segmented medium has given rise to new challenges in the system on chip realm. By leveraging the shared nature of the communication medium, buses have been highly efficient in delivering multicast communication. The segmented nature of networks, however, prevents networks on chip from delivering multicast messages as efficiently. Relying on extensive research on multicast communication in parallel computers, several network on chip architectures have offered mechanisms to perform the operation, while conforming to the resource constraints of the network on chip paradigm. Multicast communication in the majority of these networks on chip is implemented by establishing a connection between the source and all multicast destinations before the message transmission commences. Establishing the connections incurs an overhead and is therefore undesirable, particularly in latency-sensitive services such as cache coherence. To address high performance multicast communication, this research presents Quarc, a novel network on chip architecture. The Quarc architecture targets an area-efficient, low power, high performance implementation. The thesis covers a detailed representation of the building blocks of the architecture, including topology, router and network interface. The cost and performance comparison of the Quarc architecture against other network on chip architectures reveals that the Quarc architecture is highly efficient. Moreover, the thesis introduces novel performance models of complex traffic patterns, including multicast and quality of service-aware communication.

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called SYSTEM-ON-CHIP (SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC). MPSoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With a number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how to best provide on-chip communication resources is clearly felt. NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions:
    • The design of the NoC architecture needs to strike the best tradeoff among performance, features and the tight area and power constraints of the on-chip domain.
    • Simulation and verification infrastructure must be put in place to explore, validate and optimize the NoC performance.
    • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions.
    • Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs to assess their suitability for next-generation designs and their area and power costs.
    This dissertation performs a design space exploration of network-on-chip architectures, in order to point out the trade-offs associated with the design of each individual network building block and with the design of the network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with themselves and with early network-on-chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and to point out the challenges that lie ahead in order to make this new interconnect technology come true. Among the latter, technology-related challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy, in particular leakage power dissipation and the containment of process variations and of their effects. The achievement of the above objectives was enabled by means of a NoC simulation environment for cycle-accurate modelling and simulation and by means of a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout.

    Performance analysis of wormhole switched interconnection networks with virtual channels and finite buffers

    An efficient interconnection network that provides high bandwidth and low latency interprocessor communication is critical to harness fully the computational power of large-scale multicomputers. K-ary n-cube networks have been widely adopted in contemporary multicomputers due to their desirable properties. As such, the present study focuses on a performance analysis of K-ary n-cubes employing wormhole switching, virtual channels, and adaptive routing. The objective of this dissertation is twofold: to examine the performance of these networks, and to compare the performance merits of various topologies under different working conditions, by means of analytical modelling. Most existing analytical models reported in the literature have used a method originally proposed by Dally to capture the effects of virtual channels on network performance. This method is based on a Markov chain and it has been shown that its prediction accuracy degrades as traffic increases. Moreover, these studies have also constrained the buffer capacity to a single flit per channel, a simplifying assumption that has often been invoked to ease the derivation of the analytical models. Motivated by these observations, the first part of this research proposes a new method for modelling virtual channels, based on an M/G/1 queue. Owing to the generality of this method, Dally's method is shown to be a special case when the message service time is exponentially distributed. The second part of this research uses theoretical results of queuing systems to relax the single-flit buffer assumption. New analytical models are then proposed to capture the effects of deploying arbitrary size buffers on the performance of deterministic and adaptive routing algorithms. Simulation experiments reveal that results from the proposed analytical models are in close agreement with those obtained through simulation. Building on these new analytical models, the third part of this research compares the relative performance merits of K-ary n-cubes under different operating conditions, in the presence of finite size buffers and multiple virtual channels. Namely, the analysis first revisits the relative performance merits of the well-known 2D torus, 3D torus and hypercube under different implementation constraints. The analysis has then been extended to investigate the performance impact of arranging the total buffer space, allocated to a physical channel, into multiple virtual channels. Finally, the performance of adaptive routing has been compared to that of deterministic routing. While previous similar studies have only taken account of channel and router costs, the present analysis incorporates different intra-router delays as well, and thus generates more realistic results. In fact, the results of this research differ notably from those reported in previous studies, illustrating the sensitivity of such studies to the level of detail, degree of accuracy and the realism of the assumptions adopted.
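
    The abstract does not give the thesis's virtual-channel model, so the sketch below is not that model. It only illustrates the general relationship the abstract relies on: in the textbook M/G/1 queue (Pollaczek-Khinchine formula), an exponentially distributed service time reduces the result to the M/M/1 case. The function and parameter names are hypothetical and chosen for this example.

```python
def mg1_mean_wait(arrival_rate, mean_service, second_moment_service):
    """Mean time spent waiting in queue (excluding service) for an M/G/1 queue.

    Textbook Pollaczek-Khinchine formula: W_q = lambda * E[S^2] / (2 * (1 - rho)),
    where rho = lambda * E[S] is the utilisation.
    """
    rho = arrival_rate * mean_service
    if not 0 <= rho < 1:
        raise ValueError("queue is unstable: utilisation must be in [0, 1)")
    return arrival_rate * second_moment_service / (2.0 * (1.0 - rho))


def mm1_mean_wait(arrival_rate, service_rate):
    """Mean time spent waiting in queue for an M/M/1 queue: rho / (mu - lambda)."""
    rho = arrival_rate / service_rate
    if not 0 <= rho < 1:
        raise ValueError("queue is unstable: utilisation must be in [0, 1)")
    return rho / (service_rate - arrival_rate)


if __name__ == "__main__":
    lam, mu = 0.5, 1.0  # illustrative rates, not taken from the thesis

    # Exponential service with rate mu: E[S] = 1/mu and E[S^2] = 2/mu^2.
    exponential = mg1_mean_wait(lam, 1.0 / mu, 2.0 / mu**2)

    # Deterministic service of the same mean: E[S^2] = (1/mu)^2.
    deterministic = mg1_mean_wait(lam, 1.0 / mu, 1.0 / mu**2)

    print("M/G/1 with exponential service:", exponential)
    print("M/M/1 closed form:             ", mm1_mean_wait(lam, mu))
    print("M/G/1 with fixed service time: ", deterministic)
```

    With exponential service the two functions agree, while a fixed service time of the same mean halves the queueing delay, which shows why a general service-time distribution can capture cases that an exponential assumption cannot.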

    All optical multicasting in wavelength routing mesh networks with power considerations: design and operation

    Wavelength-routing Wavelength Division Multiplexing (WDM) networks are optical networks that support all-optical services. They have become the most appealing candidate for wide area backbone networks. Their huge available bandwidth provides the solution for the exponential growth in traffic demands that is due to the increase in the number of users and the surge of more bandwidth-intensive network applications and services. A sizable fraction of these applications and services are of a multi-point nature. Therefore, supporting multicast service in this network environment is critical. The all-optical support of various services has advantages, which include making the signal transparent to its content. Nevertheless, all-optical operational support comes with an associated cost and new issues that make this problem very challenging. In this thesis, we investigate the power-related issues for supporting multicast service in the optical domain, referred to as All-Optical Multicasting (AOM). Our study treats these issues from two networking contexts, namely, Network Provisioning and Connection Provisioning. We propose a number of optimal and heuristic solutions with a unique objective function for each context. In this regard, the objective function for the network provisioning problem is to reduce the network cost, while the solutions for the connection provisioning problem aim to reduce the connection blocking ratio. The optimal formulations are inherently non-linear. However, we introduce novel methods for linearizing them and formulate the problems as Mixed Integer Linear Programs. Also, the design of the heuristic solutions takes into account various optimization factors, which results in efficient heuristics that can quickly produce solutions that are relatively close to their optimal counterparts, as shown in the numerical results we present.

    Parallelizing Timed Petri Net simulations

    The possibility of using parallel processing to accelerate the simulation of Timed Petri Nets (TPN's) was studied. It was recognized that complex system development tools often transform system descriptions into TPN's or TPN-like models, which are then simulated to obtain information about system behavior. Viewed this way, it was important that the parallelization of TPN's be as automatic as possible, to admit the possibility of the parallelization being embedded in the system design tool. Later years of the grant were devoted to examining the problem of joint performance and reliability analysis, to explore whether both types of analysis could be accomplished within a single framework. In this final report, the results of our studies are summarized. We believe that the problem of parallelizing TPN's automatically for MIMD architectures has been almost completely solved for a large and important class of problems. Our initial investigations into joint performance/reliability analysis are two-fold: it was shown that Monte Carlo simulation, with importance sampling, offers promise of joint analysis in the context of a single tool, and methods were developed for the parallel simulation of general Continuous Time Markov Chains, a model framework within which joint performance/reliability models can be cast. However, much more work is needed to determine the scope and generality of these approaches. The results obtained in our two studies, future directions for this type of work, and a list of publications are included.