5 research outputs found

Interconnection networks for parallel and distributed computing

Author: Xiang Yonghong
Publication date: 01/01/2008

Parallel computers are generally either shared-memory machines or distributed-memory machines. There are currently technological limitations on shared-memory architectures and so parallel computers utilizing a large number of processors tend to be distributed-memory machines. We are concerned solely with distributed-memory multiprocessors. In such machines, the dominant factor inhibiting faster global computations is inter-processor communication. Communication is dependent upon the topology of the interconnection network, the routing mechanism, the flow control policy, and the method of switching. We are concerned with issues relating to the topology of the interconnection network. The choice of how we connect processors in a distributed-memory multiprocessor is a fundamental design decision. There are numerous, often conflicting, considerations to bear in mind. However, there does not exist an interconnection network that is optimal on all counts and trade-offs have to be made. A multitude of interconnection networks have been proposed, with each of these networks having some good (topological) properties and some not so good. Existing noteworthy networks include trees, fat-trees, meshes, cube-connected cycles, butterflies, Möbius cubes, hypercubes, augmented cubes, k-ary n-cubes, twisted cubes, n-star graphs, (n, k)-star graphs, alternating group graphs, de Bruijn networks, and bubble-sort graphs, to name but a few. We will mainly focus on k-ary n-cubes and (n, k)-star graphs in this thesis. Meanwhile, we propose a new interconnection network called augmented k-ary n-cubes. The following results are given in the thesis.

1. Let $k \ge 4$ be even and let $n \ge 2$. Consider a faulty k-ary n-cube $Q_n^k$ in which the number of node faults $f_n$ and the number of link faults $f_e$ are such that $f_n + f_e \le 2n - 2$. We prove that given any two healthy nodes $s$ and $e$ of $Q_n^k$, there is a path from $s$ to $e$ of length at least $k^n - 2f_n - 1$ (resp. $k^n - 2f_n - 2$) if the nodes $s$ and $e$ have different (resp. the same) parities (the parity of a node in $Q_n^k$ is the sum modulo 2 of the elements in the n-tuple over $\{0, 1, \ldots, k - 1\}$ representing the node). Our result is optimal in the sense that there are pairs of nodes and fault configurations for which these bounds cannot be improved, and it answers questions recently posed by Yang, Tan and Hsu, and by Fu. Furthermore, we extend known results, obtained by Kim and Park, for the case when $n = 2$.

2. We give precise solutions to problems posed by Wang, An, Pan, Wang and Qu and by Hsieh, Lin and Huang. In particular, we show that $Q_n^k$ is bi-panconnected and edge-bipancyclic, when $k \ge 3$ and $n \ge 2$, and we also show that when $k$ is odd, $Q_n^k$ is m-panconnected, for $m = \frac{n(k - 1) + 2k - 6}{2}$, and $(k - 1)$-pancyclic (these bounds are optimal). We introduce a path-shortening technique, called progressive shortening, and strengthen existing results, showing that when paths are formed using progressive shortening then these paths can be efficiently constructed and used to solve a problem relating to the distributed simulation of linear arrays and cycles in a parallel machine whose interconnection network is $Q_n^k$, even in the presence of a faulty processor.

3. We define an interconnection network $AQ_n^k$, which we call the augmented k-ary n-cube, by extending a k-ary n-cube in a manner analogous to the existing extension of an n-dimensional hypercube to an n-dimensional augmented cube. We prove that the augmented k-ary n-cube $AQ_n^k$ has a number of attractive properties (in the context of parallel computing). For example, we show that $AQ_n^k$: is a Cayley graph (and so is vertex-symmetric); has connectivity $4n - 2$, and is such that we can build a set of $4n - 2$ mutually disjoint paths joining any two distinct vertices so that the path of maximal length has length at most $\max\{(n - 1)k - (n - 2),\ k + 7\}$; has diameter $\lfloor k/3 \rfloor + \lceil (k - 1)/3 \rceil$, when $n = 2$; and has diameter at most $\frac{k}{4}(n + 1)$, for $n \ge 3$ and $k$ even, and at most $\frac{k}{4}(n + 1) + \frac{n}{4}$, for $n \ge 3$ and $k$ odd.

4. We present an algorithm which, given a source node and a set of $n - 1$ target nodes in the (n, k)-star graph $S_{n,k}$ where all nodes are distinct, builds a collection of $n - 1$ node-disjoint paths, one from each target node to the source. The collection of paths output from the algorithm is such that each path has length at most $6k - 7$, and the algorithm has time complexity $O(k^3 n^4)$.
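As a rough illustration of the parity notion and the path-length bound in result 1, the following Python sketch may help. It is not taken from the thesis: the function names are hypothetical, and the adjacency rule (two nodes are joined when they differ by 1 modulo k in exactly one coordinate) is the standard definition of the k-ary n-cube, which the abstract itself does not spell out.

```python
from itertools import product


def neighbors(node, k):
    """Neighbours of a node in the k-ary n-cube (assumed standard definition):
    change exactly one coordinate by +1 or -1 modulo k."""
    result = set()
    for i, x in enumerate(node):
        for delta in (1, -1):
            result.add(node[:i] + ((x + delta) % k,) + node[i + 1:])
    return result


def parity(node):
    """Parity as defined in the abstract: sum of the coordinates modulo 2."""
    return sum(node) % 2


def guaranteed_path_length(k, n, node_faults, link_faults, s, e):
    """Lower bound quoted in result 1: k^n - 2*f_n - 1 if s and e have
    different parities, k^n - 2*f_n - 2 if they have the same parity."""
    if k < 4 or k % 2 != 0 or n < 2:
        raise ValueError("the bound is stated for even k >= 4 and n >= 2")
    if node_faults + link_faults > 2 * n - 2:
        raise ValueError("the bound requires f_n + f_e <= 2n - 2")
    slack = 1 if parity(s) != parity(e) else 2
    return k ** n - 2 * node_faults - slack


k, n = 4, 2
nodes = list(product(range(k), repeat=n))

# For even k, every link joins nodes of opposite parity (the graph is bipartite).
assert all(parity(u) != parity(v) for u in nodes for v in neighbors(u, k))

# Fault-free case, endpoints of different parity: a path through all 16 nodes.
print(guaranteed_path_length(k, n, 0, 0, (0, 0), (0, 1)))  # 15
```

The bipartiteness check suggests why parity appears in the bound: when k is even, a path alternates between the two parity classes, so its achievable length depends on the parities of its endpoints.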

Durham e-Theses

Quarc: an architecture for efficient on-chip communication

Author: Moadeli Mahmoud
Publication date: 01/01/2010

The exponential downscaling of the feature size has forced a paradigm shift from computation-based design to communication-based design in system on chip development. Buses, the traditional communication architecture in systems on chip, are incapable of addressing the increasing bandwidth requirements of future large systems. Networks on chip have emerged as an interconnection architecture offering unique solutions to the technological and design issues related to communication in future systems on chip. The transition from buses as a shared medium to networks on chip as a segmented medium has given rise to new challenges in the system on chip realm. By leveraging the shared nature of the communication medium, buses have been highly efficient in delivering multicast communication. The segmented nature of networks, however, prevents networks on chip from delivering multicast messages as efficiently. Relying on extensive research on multicast communication in parallel computers, several network on chip architectures have offered mechanisms to perform the operation while conforming to the resource constraints of the network on chip paradigm. Multicast communication in the majority of these networks on chip is implemented by establishing a connection between the source and all multicast destinations before the message transmission commences. Establishing the connections incurs an overhead and is therefore undesirable, in particular for latency-sensitive services such as cache coherence. To address high performance multicast communication, this research presents Quarc, a novel network on chip architecture. The Quarc architecture targets an area-efficient, low power, high performance implementation. The thesis covers a detailed representation of the building blocks of the architecture, including topology, router and network interface. The cost and performance comparison of the Quarc architecture against other network on chip architectures reveals that the Quarc architecture is highly efficient. Moreover, the thesis introduces novel performance models of complex traffic patterns, including multicast and quality of service-aware communication.

Glasgow Theses Service

CiteSeerX

OpenGrey Repository

Problems Related to Classical and Universal List Broadcasting

Author: GholamiNajarkola MohammadSaber
Publication date: 09/11/2022

Broadcasting is a fundamental problem in the information dissemination area. In classical broadcasting, a message must be sent from one network member to all other members as rapidly as feasible. Although it has been demonstrated that this problem is NP-Hard for arbitrary graphs, it has several applications in various fields. As a result, the universal lists model, replicating real-world restrictions like the memory limits of nodes in large networks, is introduced as a branch of this problem in the literature. In the universal lists model, each node is equipped with a fixed list and has to follow the list regardless of the originator. In this study, we focus on both classical and universal lists broadcasting. Classical broadcasting is solvable for a few families of networks, such as trees, unicyclic graphs, trees of cycles, and trees of cliques. In this study, we begin by presenting an optimal algorithm that finds the broadcast time of any vertex in a Fully Connected Tree (FCT_n) in O(|V| log log n) time. An FCT_n is formed by attaching arbitrary trees to vertices of a complete graph of size n, where |V| is the total number of vertices in the graph. Then, we replace the complete graph with a hypercube H_k and propose a new heuristic for the Hypercube of Trees (HT_k). Not only does this heuristic have the same approximation ratio as the best-known algorithm, but our numerical results also show its superiority in most experiments. Our heuristic is able to outperform the current upper bound in up to 90% of the situations, resulting in an average speedup of 30%. Most importantly, our results illustrate that it can maintain its performance even if the network size grows, making the proposed heuristic practically useful.

Afterward, we focus on broadcasting with universal lists, in which once a vertex is informed, it must follow its corresponding list, regardless of the originator and the neighbor from which it received the message. The problem of broadcasting with universal lists can be categorized into two sub-models: non-adaptive and adaptive. In the adaptive model, a sender skips the vertices on its list from which it has received the message, while those vertices are not skipped in the non-adaptive model. In this study, we present another sub-model called fully adaptive. Not only does this model benefit from a significantly better space complexity compared to the classical model, but, as will be proved, it is faster than the two other sub-models. Since the suggested model fits real-world network architectures, we design optimal broadcast algorithms for well-known interconnection networks such as trees, grids, and cube-connected cycles. We also present an upper bound for tori under the same model. Then we focus on designing broadcast graphs (bg's) under this model. A bg is a graph with the minimum possible broadcast time from any originator. Additionally, a minimum broadcast graph (mbg) is a bg with the minimum possible number of edges. We propose mbg's on n vertices for n ≤ 10 and sparse bg's for 11 ≤ n ≤ 14 under the fully-adaptive model. Afterward, we introduce the first infinite families of bg's under this model, and we prove that hypercubes are mbg's under this model. Later, we establish the optimal broadcast time of k-ary trees and binomial trees under the non-adaptive model and provide an upper bound for complete bipartite graphs. We also improve a general upper bound for trees under the same model. We then suggest several general upper bounds for the universal lists model by comparing it with the messy broadcasting model.

Finally, we propose the first heuristic for this problem, namely HUB-GA: a Heuristic for Universal lists Broadcasting with Genetic Algorithm. We undertake various numerical experiments on interconnection networks frequently used in the literature, graphs with clique-like structures, and synthetic instances in order to cover many possibilities of industrial topologies. We also compare our results with state-of-the-art methods for classical broadcasting, which is proved to be the fastest model among all. Although the universal lists model utilizes less memory than the classical model, our algorithm finds the same broadcast time as the classical model in diverse situations.
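To make the difference between the non-adaptive and adaptive sub-models concrete, here is a minimal Python simulation. It is a sketch under stated assumptions and not the thesis's algorithm: the function name `broadcast_time` is hypothetical, time is assumed to advance in synchronous rounds, an informed vertex is assumed to make at most one call per round, and skipping a list entry is assumed to cost no time. The fully adaptive sub-model is not simulated because the abstract does not define its skipping rule.

```python
def broadcast_time(lists, originator, adaptive):
    """Rounds needed to inform every vertex when each informed vertex calls
    the entries of its fixed list in order, one call per round.

    lists: dict mapping each vertex to its ordered list of neighbours.
    adaptive: if True, a sender skips entries it has received the message from.
    Returns None if the lists run out before everyone is informed.
    """
    informed_at = {originator: 0}
    received_from = {v: set() for v in lists}
    position = {v: 0 for v in lists}  # next unread index in each list
    t = 0
    while len(informed_at) < len(lists):
        t += 1
        calls = []
        for v in lists:
            if v not in informed_at:
                continue  # uninformed vertices stay silent
            i = position[v]
            if adaptive:
                # Skipping is assumed to be free: it consumes no round.
                while i < len(lists[v]) and lists[v][i] in received_from[v]:
                    i += 1
            if i < len(lists[v]):
                calls.append((v, lists[v][i]))
                i += 1
            position[v] = i
        if not calls:
            return None
        # Deliver all calls of this round at once (synchronous rounds).
        for sender, target in calls:
            received_from[target].add(sender)
            informed_at.setdefault(target, t)
    return t


# Path a - b - c, with one fixed list per vertex used for every originator.
lists = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(broadcast_time(lists, "a", adaptive=False))  # 3: b wastes a round calling a
print(broadcast_time(lists, "a", adaptive=True))   # 2: b skips a and calls c
```

In this small example the same lists give a worst case of 3 rounds over all originators in the non-adaptive sub-model and 2 rounds in the adaptive one, since only originator a triggers the wasted call.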

Concordia University Research Repository

Architectures for a space-based information network with shared on-orbit processing

Author: Chan Serena, 1977-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2005

Thesis (Ph. D.)--Massachusetts Institute of Technology, Engineering Systems Division, 2005. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 335-343).

This dissertation provides a top level assessment of technology design choices for the architecture of a space-based information network with shared on-orbit processing. Networking is an efficient method of sharing communications and lowering the cost of communications, providing better interoperability and data integration for multiple satellites. The current space communications architecture sets a critical limitation on the collection of raw data sent to the ground. By introducing powerful space-borne processing, compression of raw data can alleviate the need for expensive and expansive downlinks. Moreover, distribution of processed data directly from space sensors to the end-users may be more easily realized. A space-based information network backbone can act as the transport network for mission satellites as well as enable the concept of decoupled, shared, and perhaps distributed space-borne processing for space-based assets. Optical crosslinks are the enabling technology for creating a cost-effective network capable of supporting high data rates. In this dissertation, the space-based network backbone is designed to meet a number of mission requirements by optimizing over constellation topologies under different traffic models. With high network capacity availability, space-borne processing can be accessible by any mission satellite attached to the network. Space-borne processing capabilities can be enhanced with commercial processors that are tolerant of radiation and replenished periodically (as frequently as every two years). Additionally, innovative ways of using a space-based information network can revolutionize satellite communications and space missions. Applications include distributed computing in space, interoperable space communications, multiplatform distributed satellite communications, coherent distributed space sensing, multisensor data fusion, and restoration of disconnected global terrestrial networks after a disaster. Lastly, the consolidation of all the different communications assets into a horizontally integrated space-based network infrastructure calls for a space-based network backbone to be designed with a generic nature. A coherent infrastructure can satisfy the goals of interoperability, flexibility, and scalability, and allows the system to be evolutionary. This transformational vision of a generic space-based information network allows for growth to accommodate civilian demands, lowers the price of entry for the commercial sector, and makes way for innovation to enhance and provide additional value to military systems.

by Serena Chan. Ph.D.

DSpace@MIT

Proceedings of the 3rd International Workshop on Optimal Networks Topologies IWONT 2010

Publication venue: Iniciativa Digital Politecnica
Publication date: 01/02/2011

Peer Reviewed

UPCommons. Portal del coneixement obert de la UPC