1,300 research outputs found

    Multistage Switching Architectures for Software Routers

    Software routers based on personal computer (PC) architectures are becoming an important alternative to proprietary and expensive network devices. However, software routers suffer from many limitations of the PC architecture, including, among others, limited bus and central processing unit (CPU) bandwidth, high memory access latency, limited scalability in terms of number of network interface cards, and lack of resilience mechanisms. Multistage PC-based architectures can be an interesting alternative since they permit us to i) increase the performance of single software routers, ii) scale router size, iii) distribute packet manipulation and control functionality, iv) recover from single-component failures, and v) incrementally upgrade router performance. We propose a specific multistage architecture, exploiting PC-based routers as switching elements, to build a high-speed, large-size, scalable, and reliable software router. A small-scale prototype of the multistage router is currently up and running in our labs, and performance evaluation is under way.

    Symmetric rearrangeable networks and algorithms

    A class of symmetric rearrangeable nonblocking networks has been considered in this thesis. A particular focus of this thesis is on Benes networks built with 2 x 2 switching elements. Symmetric rearrangeable networks built with larger switching elements have also been considered. New applications of these networks are found in the areas of System on Chip (SoC) and Network on Chip (NoC). Deterministic routing algorithms used in NoC applications suffer from low scalability and slow execution time. On the other hand, faster algorithms are blocking and thus limit throughput. This will be an acceptable trade-off for many applications where achieving "wire speed" on the on-chip network would require extensive optimisation of the attached devices. In this thesis I designed an algorithm that has much lower blocking probabilities than other suboptimal algorithms but a much faster execution time than deterministic routing algorithms. The suboptimal method uses the looping algorithm in its outermost stages and then, in the two distinct subnetworks deeper in the switch, uses a fast but suboptimal path search method to find available paths. The worst-case time complexity of this new routing method is O(N log N) using a single processor, which matches the best known results reported in the literature. Disruption of the ongoing communications in this class of networks during rearrangements is an open issue. In this thesis I explored a modification of the topology of these networks which gives rise to what are termed repackable networks. A repackable topology allows rearrangements of paths without intermittently losing connectivity by breaking the existing communication paths momentarily. The repackable network structure proposed in this thesis is efficient in its use of hardware when compared to other proposals in the literature.
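The looping step mentioned above for the outermost stage can be illustrated with a minimal sketch. This is the textbook looping idea for a Benes network of 2 x 2 elements, not the thesis's own implementation; the function name `loop_outer_stage` and the 0/1 encoding of the upper/lower subnetwork are assumptions made for illustration, and a full permutation on an even number of ports is assumed.

```python
def loop_outer_stage(perm):
    """Assign each input to subnetwork 0 (upper) or 1 (lower).

    perm[i] is the output requested by input i (a full permutation,
    even length). Inputs 2k and 2k+1 share an input switch, outputs
    2k and 2k+1 share an output switch, so each pair must be split
    across the two subnetworks.
    """
    n = len(perm)
    inv = [0] * n
    for i, o in enumerate(perm):
        inv[o] = i                      # inv[o] = input that targets output o

    sub = [None] * n
    for start in range(n):
        i = start
        while sub[i] is None:           # follow one constraint loop
            sub[i] = 0                  # route this input through the upper half
            j = inv[perm[i] ^ 1]        # input feeding the partner output
            sub[j] = 1                  # ... so it must use the lower half
            i = j ^ 1                   # j's switch partner must use the upper half
    return sub
```

For example, `loop_outer_stage([3, 7, 4, 0, 2, 6, 1, 5])` yields one assignment in which every input switch and every output switch has one connection in each subnetwork; the same step would then be applied recursively inside each half in a fully deterministic router.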
As most of the deterministic algorithms designed for Benes networks implement a permutation of all inputs to find the routing tags for the requested input-output pairs, I proposed a new algorithm that can work for partial permutations. If the network load is defined as ρ, the mean number of active inputs in a partial permutation is m = ρN, where N is the network size. This new method is based on mapping the network stages into a set of sub-matrices and then determining the routing tags for each pair of requests by populating the cells of the sub-matrices without creating a blocking state. Overall, the serial time complexity of this method is O(N log N) when all N inputs are active and O(m log N) with m < N active inputs. With minor modification to the serial algorithm, this method can be made to work in the parallel domain. The time complexity of this routing algorithm on a parallel machine with N completely connected processors is O(log^2 N). With m active requests the time complexity goes down to O(log m log N), which is better than the O(log^2 m + log N) reported in the literature for 2^{0.5((log^2 N - 4 log N)^{0.5} - log N)} <= ρ <= 1. I also designed multistage symmetric rearrangeable networks using larger switching elements and implemented a new routing algorithm for these classes of networks. The network topology and routing algorithms presented in this thesis should allow large-scale networks of modest cost, with low setup times and moderate blocking rates, to be constructed. Such switching networks will be required to meet the bandwidth requirements of future communication networks.
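The load range quoted above follows from comparing the two parallel bounds with m = ρN. The sketch below evaluates that comparison numerically; it assumes base-2 logarithms and N >= 16 (so the square root is real), and the function names are hypothetical, chosen only for this illustration.

```python
import math


def load_threshold(N):
    """Lower end of the load range, 2^{0.5((log^2 N - 4 log N)^{0.5} - log N)}.

    Assumes base-2 logarithms and N >= 16.
    """
    L = math.log2(N)
    return 2 ** (0.5 * (math.sqrt(L * L - 4 * L) - L))


def new_bound(m, N):
    """log m * log N, the bound stated for m active requests."""
    return math.log2(m) * math.log2(N)


def prior_bound(m, N):
    """log^2 m + log N, the bound attributed to the literature."""
    return math.log2(m) ** 2 + math.log2(N)


if __name__ == "__main__":
    N = 1024
    print("threshold load:", load_threshold(N))
    for m in (256, 512, 1024):
        print(m, new_bound(m, N), prior_bound(m, N))
```

Setting x = log m and L = log N, the condition xL < x^2 + L is a quadratic in x whose larger root gives the threshold; substituting x = log ρ + L recovers the expression in the abstract.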

    Modeling of Topologies of Interconnection Networks based on Multidimensional Multiplicity

    Modern SoCs are becoming more complex with the integration of heterogeneous components (IPs). A high-performance interconnection medium is therefore required to handle this complexity. NoCs address this need, enabling the integration of more IPs into the SoC with increased performance. These NoCs are based on the concept of interconnection networks used to connect parallel machines. In response to the MARTE RFP of the OMG, a notation of multidimensional multiplicity has been proposed which makes it possible to model repetitive structures and topologies. This report presents a modeling methodology based on this notation that can be used to model a family of interconnection networks called Delta networks, which in turn can be used for the construction of NoCs.

    A performance model of multicast communication in wormhole-routed networks on-chip

    Collective communication operations form a part of overall traffic in most applications running on platforms employing direct interconnection networks. This paper presents a novel analytical model to compute the communication latency of multicast, a widely used collective communication operation. The novelty of the model lies in its ability to predict the latency of multicast communication in wormhole-routed architectures employing an asynchronous multi-port router scheme. The model is applied to the Quarc NoC, and its validity is verified by comparing the model predictions against the results obtained from a discrete-event simulator developed using OMNeT++.

    A performance model of communication in the quarc NoC

    Networks on-chip (NoCs) have emerged as a promising communication medium for future MPSoC development. To serve this purpose, NoCs have to be able to efficiently exchange all types of traffic, including collective communications, at a reasonable cost. The Quarc NoC is introduced as a NoC which is highly efficient in performing collective communication operations such as broadcast and multicast. This paper presents an introduction to the Quarc scheme and an analytical model to compute the average message latency in the architecture. To validate the model, we compare its latency predictions against the results obtained from discrete-event simulations.

    A Complexity Analysis of Smart Pixel Switching Nodes for Photonic Extended Generalized Shuffle Switching Networks

    This paper studies the architectural tradeoffs found in the use of smart pixels for nodes within photonic switching interconnection networks. The particular networks of interest within the analysis are strictly nonblocking extended generalized shuffle (EGS) networks. Several performance metrics are defined for the analysis, and the effect of node size on these metrics is studied. Optimum node sizes are defined for each of the performance metrics, and system-level limitations are identified.

    Quarc: a novel network-on-chip architecture

    This paper introduces the Quarc NoC, a novel NoC architecture inspired by the Spidergon NoC. The Quarc scheme significantly outperforms the Spidergon NoC by balancing the traffic, which results from the modifications applied to the topology and the routing elements. The proposed architecture is highly efficient in performing collective communication operations including broadcast and multicast. We present the topology, routing discipline and switch architecture for the Quarc NoC and demonstrate the performance with the results obtained from discrete-event simulations.

    Crosstalk-Free Scheduling Algorithms for Routing in Optical Multistage Interconnection Networks

    Multistage Interconnection Networks (MINs) have been used in telecommunication networks for many years. Significant advances in optical technology have prompted the optical implementation of MINs as an important optical switching topology to meet the ever-increasing demands of high-performance computing communication applications for high channel bandwidth and low communication latency. However, dealing with electro-optic switches instead of electronic switches poses its own challenges, introduced by optics itself. Limited by the properties of optical signals, optical MINs (OMINs) introduce optical crosstalk as a result of coupling two signals within each switching element. Therefore, it is not possible to route more than one message simultaneously, without optical crosstalk, over a switching element in an OMIN. Reducing the effect of optical crosstalk has been a challenging issue, considering the trade-offs between performance and hardware and software complexity. To solve optical crosstalk, many scheduling algorithms have been proposed for routing in OMINs based on a solution called the time domain approach, which divides the N optical inputs into several groups such that crosstalk-free connections can be established. The objective of the research presented in this thesis is to propose a solution that can further optimize and improve the performance of message scheduling for routing in the optical Omega network. Based on the Zero algorithms, a Modified Zero algorithm is developed to achieve a crosstalk-free version of the algorithm. Then, the Fast Zero (FastZ) algorithm is proposed, which uses a new concept called the symmetric Conflict Matrix (sCM) as a pre-scheduling technique. Extending the FastZ algorithm, three further algorithms, called FastRLP, BRLP and FastBRLP, are developed to achieve different performance goals.
Lastly, a comparison is made through simulation between all algorithms developed in this research and previous Zero-based algorithms as well as traditional heuristic algorithms, since equal routing results can be obtained from all of them. In simulation, the FastZ, BRLP and FastBRLP algorithms show the best results when the average execution time is considered. The FastRLP and FastBRLP algorithms, on the other hand, show the best results when the average number of passes is considered. It is shown in this thesis that the new approach achieves by far the best performance among all the algorithms tested in this research.
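The time domain approach described in this abstract can be illustrated with a minimal sketch: messages are split into passes so that no two messages in the same pass traverse the same switching element of an Omega network. This is a plain greedy grouping written for illustration, not the Zero, FastZ, or RLP-based algorithms of the thesis; the function names and the `(source, destination)` tuple representation are assumptions.

```python
def switches_on_path(src, dst, n):
    """Switch index visited at each stage of an N = 2**n Omega network.

    Uses destination-tag routing: after stage s the link address is the
    source shifted left by s+1 bits with the top s+1 destination bits
    filled in. A 2 x 2 switch owns two adjacent links.
    """
    mask = (1 << n) - 1
    path = []
    for stage in range(n):
        link = ((src << (stage + 1)) & mask) | (dst >> (n - 1 - stage))
        path.append(link >> 1)
    return path


def conflicts(a, b, n):
    """True if messages a and b share a switch at any stage."""
    pa = switches_on_path(a[0], a[1], n)
    pb = switches_on_path(b[0], b[1], n)
    return any(x == y for x, y in zip(pa, pb))


def greedy_passes(messages, n):
    """Place each message in the first pass where it causes no crosstalk."""
    passes = []
    for msg in messages:
        for group in passes:
            if not any(conflicts(msg, other, n) for other in group):
                group.append(msg)
                break
        else:
            passes.append([msg])
    return passes
```

For an 8 x 8 network, `greedy_passes([(i, i) for i in range(8)], 3)` splits the identity permutation into several passes, because inputs such as 0 and 4 already meet in the same first-stage switch after the shuffle.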

    Quarc: a high-efficiency network on-chip architecture

    The novel Quarc NoC architecture, inspired by the Spidergon scheme, is introduced as a NoC architecture that is highly efficient in performing collective communication operations including broadcast and multicast. The efficiency of the Quarc architecture is achieved by balancing the traffic, which results from the modifications applied to the topology and the routing elements of the Spidergon NoC. This paper provides an ASIC implementation of both architectures using UMC's 0.13 μm CMOS technology and presents an analysis and comparison of the cost and performance of the Quarc and the Spidergon NoCs.