30 research outputs found

    Topologies for Optical Interconnection Networks Based on the Optical Transpose Interconnection System

    Get PDF
    International audienceMany results exist in the literature describing technological and theoretical advances in optical network topologies and design. However, an essential effort has yet to be done in linking those results together. In this paper, we propose a step in this direction, by giving optical layouts for several graph-theoretical topologies studied in the literature, using the Optical Transpose Interconnection System (OTIS) architecture. These topologies include the family of Partitioned Optical Passive Star (POPS) and stack-Kautz networks as well as a generalization of the Kautz and de Bruijn digraphs

    Theoretical Aspects of the Optical Transpose Interconnecting System Architecture

    Get PDF
    National audienceAn attractive way of implementing efficient local interconnection networks is to use the Optical Transpose Interconnecting System (OTIS) architecture proposed in [8]. This system allows to optically interconnect some set of processors in a Free Space of Optical Interconnections (FSOI). BrieïŹ‚y, it consists of two lenslet arrays allowing a large number of optical interconnections from a set of transmitters to a set of receivers. Note that the OTIS architecture is indeed a three dimension (3D) one but it can always be modeled in a 2D-space. We ïŹrst try to provide ways of determining if a given general network admits an OTIS layout or not. Then we study particular case of regular and symmetric networks. Finally we show that classical topologies like de Bruijn, Kautz and complete digraphs admit an OTIS2D -layout. At the end, the results obtained for the OTIS2D model are applied to the OTIS3D case

    OTIS-Based Multi-Hop Multi-OPS Lightwave Networks

    Get PDF
    International audienceAdvances in optical technology, such as low loss Optical Passive Star couplers (OPS) and the possibility of building tunable optical transmitters and receivers have increased the interest for multiprocessor architectures based on lightwave networks because of the vast bandwidth available. Many research have been done at both technological and theoretical level. An essential effort has to be done in linking those results. In this paper we propose optical designs for two multi-OPS networks: the single-hop POPS network and the multi-hop stack-Kautz network; using the Optical Transpose Interconnecting System (OTIS) architecture, from the Optoelectronic Computing Group of UCSD. In order to achieve our result, we also provide the optical design of a generalization of the Kautz digraph, using OTIS

    Topologies for Optical Interconnection Networks Based on the Optical Transpose Interconnection System

    Get PDF
    International audienceMany results exist in the literature describing technological and theoretical advances in optical network topologies and design. However, an essential effort has yet to be done in linking those results together. In this paper, we propose a step in this direction, by giving optical layouts for several graph-theoretical topologies studied in the literature, using the Optical Transpose Interconnection System (OTIS) architecture. These topologies include the family of Partitioned Optical Passive Star (POPS) and stack-Kautz networks as well as a generalization of the Kautz and de Bruijn digraphs

    Aspects of practical implementations of PRAM algorithms

    Get PDF
    The PRAM is a shared memory model of parallel computation which abstracts away from inessential engineering details. It provides a very simple architecture independent model and provides a good programming environment. Theoreticians of the computer science community have proved that it is possible to emulate the theoretical PRAM model using current technology. Solutions have been found for effectively interconnecting processing elements, for routing data on these networks and for distributing the data among memory modules without hotspots. This thesis reviews this emulation and the possibilities it provides for large scale general purpose parallel computation. The emulation employs a bridging model which acts as an interface between the actual hardware and the PRAM model. We review the evidence that such a scheme crn achieve scalable parallel performance and portable parallel software and that PRAM algorithms can be optimally implemented on such practical models. In the course of this review we presented the following new results: 1. Concerning parallel approximation algorithms, we describe an NC algorithm for finding an approximation to a minimum weight perfect matching in a complete weighted graph. The algorithm is conceptually very simple and it is also the first NC-approximation algorithm for the task with a sub-linear performance ratio. 2. Concerning graph embedding, we describe dense edge-disjoint embeddings of the complete binary tree with n leaves in the following n-node communication networks: the hypercube, the de Bruijn and shuffle-exchange networks and the 2-dimcnsional mesh. In the embeddings the maximum distance from a leaf to the root of the tree is asymptotically optimally short. The embeddings facilitate efficient implementation of many PRAM algorithms on networks employing these graphs as interconnection networks. 3. Concerning bulk synchronous algorithmics, we describe scalable transportable algorithms for the following three commonly required types of computation; balanced tree computations. Fast Fourier Transforms and matrix multiplications

    Some studies on the multi-mesh architecture.

    Get PDF
    In this thesis, we have reported our investigations on interconnection network architectures based on the idea of a recently proposed multi-processor architecture, Multi-Mesh network. This includes the development of a new interconnection architecture, study of its topological properties and a proposal for implementing Multi-Mesh using optical technology. We have presented a new network topology, called the 3D Multi-Mesh (3D MM) that is an extension of the Multi-Mesh architecture [DDS99]. This network consists of n3 three-dimensional meshes (termed as 3D blocks), each having n3 processors, interconnected in a suitable manner so that the resulting topology is 6-regular with n6 processors and a diameter of only 3n. We have shown that the connectivity of this network is 6. We have explored an algorithm for point-to-point communication on the 3D MM. It is expected that this architecture will enable more efficient algorithm mapping compared to existing architectures. We have also proposed some implementation of the multi-mesh avoiding the electronic bottleneck due to long copper wires for communication between some processors. Our implementation considers a number of realistic scenarios based on hybrid (optical and electronic) communication. One unique feature of this investigation is our use of WDM wavelength routing and the protection scheme. We are not aware of any implementation of interconnection networks using these techniques.Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .A32. Source: Masters Abstracts International, Volume: 43-03, page: 0868. Adviser: Subir Bandyopadhyay. Thesis (M.Sc.)--University of Windsor (Canada), 2004

    Simulation Of Multi-core Systems And Interconnections And Evaluation Of Fat-Mesh Networks

    Get PDF
    Simulators are very important in computer architecture research as they enable the exploration of new architectures to obtain detailed performance evaluation without building costly physical hardware. Simulation is even more critical to study future many-core architectures as it provides the opportunity to assess currently non-existing computer systems. In this thesis, a multiprocessor simulator is presented based on a cycle accurate architecture simulator called SESC. The shared L2 cache system is extended into a distributed shared cache (DSC) with a directory-based cache coherency protocol. A mesh network module is extended and integrated into SESC to replace the bus for scalable inter-processor communication. While these efforts complete an extended multiprocessor simulation infrastructure, two interconnection enhancements are proposed and evaluated. A novel non-uniform fat-mesh network structure similar to the idea of fat-tree is proposed. This non-uniform mesh network takes advantage of the average traffic pattern, typically all-to-all in DSC, to dedicate additional links for connections with heavy traffic (e.g., near the center) and fewer links for lighter traffic (e.g., near the periphery). Two fat-mesh schemes are implemented based on different routing algorithms. Analytical fat-mesh models are constructed by presenting the expressions for the traffic requirements of personalized all-to-all traffic. Performance improvements over the uniform mesh are demonstrated in the results from the simulator. A hybrid network consisting of one packet switching plane and multiple circuit switching planes is constructed as the second enhancement. The circuit switching planes provide fast paths between neighbors with heavy communication traffic. A compiler technique that abstracts the symbolic expressions of benchmarks' communication patterns can be used to help facilitate the circuit establishment

    Interconnection networks for parallel and distributed computing

    Get PDF
    Parallel computers are generally either shared-memory machines or distributed- memory machines. There are currently technological limitations on shared-memory architectures and so parallel computers utilizing a large number of processors tend tube distributed-memory machines. We are concerned solely with distributed-memory multiprocessors. In such machines, the dominant factor inhibiting faster global computations is inter-processor communication. Communication is dependent upon the topology of the interconnection network, the routing mechanism, the flow control policy, and the method of switching. We are concerned with issues relating to the topology of the interconnection network. The choice of how we connect processors in a distributed-memory multiprocessor is a fundamental design decision. There are numerous, often conflicting, considerations to bear in mind. However, there does not exist an interconnection network that is optimal on all counts and trade-offs have to be made. A multitude of interconnection networks have been proposed with each of these networks having some good (topological) properties and some not so good. Existing noteworthy networks include trees, fat-trees, meshes, cube-connected cycles, butterflies, Möbius cubes, hypercubes, augmented cubes, k-ary n-cubes, twisted cubes, n-star graphs, (n, k)-star graphs, alternating group graphs, de Bruijn networks, and bubble-sort graphs, to name but a few. We will mainly focus on k-ary n-cubes and (n, k)-star graphs in this thesis. Meanwhile, we propose a new interconnection network called augmented k-ary n- cubes. The following results are given in the thesis.1. Let k ≄ 4 be even and let n ≄ 2. Consider a faulty k-ary n-cube Q(^k_n) in which the number of node faults f(_n) and the number of link faults f(_e) are such that f(_n) + f(_e) ≀ 2n - 2. We prove that given any two healthy nodes s and e of Q(^k_n), there is a path from s to e of length at least k(^n) - 2f(_n) - 1 (resp. k(^n) - 2f(_n) - 2) if the nodes s and e have different (resp. the same) parities (the parity of a node Q(^k_n) in is the sum modulo 2 of the elements in the n-tuple over 0, 1, ∙∙∙ , k - 1 representing the node). Our result is optimal in the sense that there are pairs of nodes and fault configurations for which these bounds cannot be improved, and it answers questions recently posed by Yang, Tan and Hsu, and by Fu. Furthermore, we extend known results, obtained by Kim and Park, for the case when n = 2.2. We give precise solutions to problems posed by Wang, An, Pan, Wang and Qu and by Hsieh, Lin and Huang. In particular, we show that Q(^k_n) is bi-panconnected and edge-bipancyclic, when k ≄ 3 and n ≄ 2, and we also show that when k is odd, Q(^k_n) is m-panconnected, for m = (^n(k - 1) + 2k - 6’ / ‘_2), and (k -1) pancyclic (these bounds are optimal). We introduce a path-shortening technique, called progressive shortening, and strengthen existing results, showing that when paths are formed using progressive shortening then these paths can be efficiently constructed and used to solve a problem relating to the distributed simulation of linear arrays and cycles in a parallel machine whose interconnection network is Q(^k_n) even in the presence of a faulty processor.3. We define an interconnection network AQ(^k_n) which we call the augmented k-ary n-cube by extending a k-ary n-cube in a manner analogous to the existing extension of an n-dimensional hypercube to an n-dimensional augmented cube. We prove that the augmented k-ary n-cube Q(^k_n) has a number of attractive properties (in the context of parallel computing). For example, we show that the augmented k-ary n-cube Q(^k_n) - is a Cayley graph (and so is vertex-symmetric); has connectivity 4n - 2, and is such that we can build a set of 4n - 2 mutually disjoint paths joining any two distinct vertices so that the path of maximal length has length at most max{{n- l)k- (n-2), k + 7}; has diameter [(^k) / (_3)] + [(^k - 1) /( _3)], when n = 2; and has diameter at most (^k) / (_4) (n+ 1), for n ≄ 3 and k even, and at most [(^k)/ (_4) (n + 1) + (^n) / (_4), for n ^, for n ≄ 3 and k odd.4. We present an algorithm which given a source node and a set of n - 1 target nodes in the (n, k)-star graph S(_n,k) where all nodes are distinct, builds a collection of n - 1 node-disjoint paths, one from each target node to the source. The collection of paths output from the algorithm is such that each path has length at most 6k - 7, and the algorithm has time complexity O(k(^3)n(^4))
    corecore