13 research outputs found

    Embedding a complete binary tree into a faulty supercube

    The supercube is a novel interconnection network derived from the hypercube. Unlike the hypercube, the supercube can be constructed for any number of nodes; that is, it is incrementally expandable. In addition, the supercube retains the connectivity and diameter properties of the corresponding hypercube. In this paper, we consider the problem of embedding and reconfiguring binary tree structures in a faulty supercube. Furthermore, to find a replacement node for each faulty node, we allow 2-expansion and show that up to (n-2) faults can be tolerated with congestion 1 and dilation 4, where (n-1) is the dimension of the supercube.
    Conference type: international. Conference dates: 1997-12-10 to 1997-12-12. Call for papers: yes. Conference location: Melbourne, Australi

    Simulation of Meshes in a Faulty Supercube with Unbounded Expansion

    Reconfiguring meshes in a faulty Supercube is investigated in this paper. The result can readily be used in the optimal embedding of a mesh (or a torus) of processors in a faulty Supercube with unbounded expansion. Embedding algorithms are proposed in this paper. They show that a mesh with any number of nodes can be embedded into a faulty Supercube with load 1, congestion 1, and dilation 3 such that O(n^2 - w^2) faults can be tolerated, where n is the dimension of the Supercube and 2^w is the number of nodes of the mesh. Meshes and hypercubes are widely used interconnection architectures in parallel computing, grid computing, sensor networks, and cloud computing. In addition, Supercubes are superior to hypercubes in terms of embedding a mesh or torus under faults. Therefore, parallel or distributed algorithms developed for mesh and torus structures can easily be ported to the Supercube.
    Journal type: international. Citation index: EI. Peer reviewed: yes. Format: print. Country code: KO
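The terms load and dilation used in the abstract above can be illustrated on a much simpler case. The following is a minimal sketch, not the paper's algorithm: it assumes a fault-free ordinary hypercube (not a Supercube) and uses the classical binary-reflected Gray code to embed a 2^a x 2^b mesh. The function names `gray`, `embed_mesh`, and `dilation` are hypothetical and chosen only for this illustration.

```python
def gray(i: int) -> int:
    """Binary-reflected Gray code of i (consecutive values differ in one bit)."""
    return i ^ (i >> 1)


def embed_mesh(a: int, b: int) -> dict:
    """Map each cell (r, c) of a 2^a x 2^b mesh to a node label of the (a+b)-cube.

    Assumption for this sketch: fault-free hypercube, one mesh cell per node.
    """
    return {
        (r, c): (gray(r) << b) | gray(c)
        for r in range(1 << a)
        for c in range(1 << b)
    }


def dilation(a: int, b: int) -> int:
    """Largest Hamming distance between the images of two adjacent mesh cells."""
    node = embed_mesh(a, b)
    worst = 0
    for (r, c), u in node.items():
        for nb in ((r + 1, c), (r, c + 1)):
            if nb in node:
                worst = max(worst, bin(u ^ node[nb]).count("1"))
    return worst
```

Here load 1 means that no two mesh cells share a hypercube node, and dilation 1 means that every mesh edge maps onto a single hypercube link. Tolerating faults, as the paper does, generally forces some mesh edges onto longer paths, which is why its bound is dilation 3 rather than 1.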

    Diamond-based models for scientific visualization

    Hierarchical spatial decompositions are a basic modeling tool in a variety of application domains including scientific visualization, finite element analysis and shape modeling and analysis. A popular class of such approaches is based on the regular simplex bisection operator, which bisects simplices (e.g. line segments, triangles, tetrahedra) along the midpoint of a predetermined edge. Regular simplex bisection produces adaptive simplicial meshes of high geometric quality, while simplifying the extraction of crack-free, or conforming, approximations to the original dataset. Efficient multiresolution representations for such models have been achieved in 2D and 3D by clustering sets of simplices sharing the same bisection edge into structures called diamonds. In this thesis, we introduce several diamond-based approaches for scientific visualization. We first formalize the notion of diamonds in arbitrary dimensions in terms of two related simplicial decompositions of hypercubes. This enables us to enumerate the vertices, simplices, parents and children of a diamond. In particular, we identify the number of simplices involved in conforming updates to be factorial in the dimension and group these into a linear number of subclusters of simplices that are generated simultaneously. The latter form the basis for a compact pointerless representation for conforming meshes generated by regular simplex bisection and for efficiently navigating the topological connectivity of these meshes. Secondly, we introduce the supercube as a high-level primitive on such nested meshes based on the atomic units within the underlying triangulation grid. We propose the use of supercubes to associate information with coherent subsets of the full hierarchy and demonstrate the effectiveness of such a representation for modeling multiresolution terrain and volumetric datasets. 
    Next, we introduce Isodiamond Hierarchies, a general framework for spatial access structures on a hierarchy of diamonds that exploits the implicit hierarchical and geometric relationships of the diamond model. We use an isodiamond hierarchy to encode irregular updates to a multiresolution isosurface or interval volume in terms of regular updates to diamonds. Finally, we consider nested hypercubic meshes, such as quadtrees, octrees and their higher dimensional analogues, through the lens of diamond hierarchies. This allows us to determine the relationships involved in generating balanced hypercubic meshes and to propose a compact pointerless representation of such meshes. We also provide a local diamond-based triangulation algorithm to generate high-quality conforming simplicial meshes.
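The regular simplex bisection operator mentioned in this abstract can be sketched in 2D. This is an illustrative example only, under the assumption that each triangle is stored as (edge start, edge end, apex) with the predetermined bisection edge listed first; the names `bisect`, `refine`, `area`, and `ROOT` are hypothetical and do not come from the thesis. It does not implement diamonds or conforming updates.

```python
from fractions import Fraction

# A triangle is (edge_start, edge_end, apex); the bisection edge is
# edge_start -> edge_end. This storage convention is an assumption of the sketch.
ROOT = (
    (Fraction(0), Fraction(0)),
    (Fraction(2), Fraction(2)),
    (Fraction(2), Fraction(0)),
)


def bisect(tri):
    """Split a triangle at the midpoint of its bisection edge into two children."""
    a, b, apex = tri
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    # Each child takes one half of the old edge; its own bisection edge joins
    # an old edge endpoint to the old apex, and the midpoint is its new apex.
    return (a, apex, mid), (apex, b, mid)


def refine(tri, depth):
    """Apply bisection uniformly `depth` times and return the leaf triangles."""
    if depth == 0:
        return [tri]
    left, right = bisect(tri)
    return refine(left, depth - 1) + refine(right, depth - 1)


def area(tri):
    """Unsigned area of a triangle given by three 2D points."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2
```

Starting from the right isosceles triangle `ROOT`, every child is again right isosceles with its hypotenuse as the next bisection edge, so the shapes never degrade. In an adaptive mesh, triangles sharing the same bisection edge would have to be split together to stay crack-free, which is the grouping that diamonds capture.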

    On the Effect of Quantum Interaction Distance on Quantum Addition Circuits

    We investigate the theoretical limits of the effect of the quantum interaction distance on the speed of exact quantum addition circuits. For this study, we exploit graph embedding for quantum circuit analysis. We study a logical mapping of qubits and gates of any $\Omega(\log n)$-depth quantum adder circuit for two $n$-qubit registers onto a practical architecture, which limits interaction distance to the nearest neighbors only and supports only one- and two-qubit logical gates. Unfortunately, on the chosen $k$-dimensional practical architecture, we prove that the depth lower bound of any exact quantum addition circuit is no longer $\Omega(\log n)$, but $\Omega(\sqrt[k]{n})$. This result, the first application of graph embedding to quantum circuits and devices, provides a new tool for compiler development, emphasizes the impact of quantum computer architecture on performance, and acts as a cautionary note when evaluating the time performance of quantum algorithms.
    Comment: accepted for ACM Journal on Emerging Technologies in Computing System
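To get a feel for the gap between the two depth bounds quoted above, the growth terms can simply be evaluated side by side. This is a rough numeric sketch, assuming base-2 logarithms and ignoring the constant factors hidden by the asymptotic notation; the function name `depth_growth_terms` is hypothetical.

```python
import math


def depth_growth_terms(n: int, k: int) -> tuple:
    """Return (log2 n, n ** (1/k)): growth terms only, constants ignored."""
    return math.log2(n), n ** (1.0 / k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        log_term, root_term = depth_growth_terms(4096, k)
        print(f"k={k}: log2(n)={log_term:.1f}, n^(1/k)={root_term:.1f}")
```

For two 4096-qubit registers, the logarithmic term is 12 while the k-th root term is about 64 on a 2D architecture and about 16 on a 3D one, so the penalty from nearest-neighbor interaction shrinks as the architecture dimension grows.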

    Supercubes: A High-Level Primitive for Diamond Hierarchies

    The Design of Cube Calculus Machine Using SRAM-Based FPGA Reconfigurable Hardware DEC’s Perle-1 Board

    Presented in this thesis are new approaches to column compatibility checking and column-based input/output encoding for Curtis decompositions of switching functions. These approaches can be used in Curtis-type functional decomposition programs for applications in several scientific disciplines. Examples of applications are: minimization of combinational and sequential logic, mapping of logic functions to programmable logic devices such as CPLDs, MPGAs, and FPGAs, data encryption, data compression, pattern recognition, and image refinement. Presently, Curtis-type functional decomposition programs are used primarily for experimental purposes due to performance, quality, and compatibility issues. However, in the past few years a renewal of interest in the area of functional decomposition has resulted in significant improvements in the performance and quality of multi-level decomposition programs. The goal of this thesis is to introduce algorithms that can significantly improve the performance and quality of Curtis-type decomposition programs. In doing so, it is hoped that a Curtis-type decomposition program, complete with efficient, high-quality algorithms for decomposition, will be a feasible tool for use in one or more practical applications. Various tests and analyses were performed in order to evaluate the potential of the algorithms presented in this thesis for use in a high-quality Curtis-type decomposition program. Testing was done using MULTIS/GUD, a binary-input, binary-output Curtis-type decomposition program. This program was implemented at Portland State University by the Portland Oregon Logic Optimization Group.

    Processor allocation strategies for modified hypercubes

    Parallel processing has been widely accepted to be the future of high-speed computing. Among the various parallel architectures proposed or implemented, the hypercube has shown a lot of promise because of its powerful properties, such as regular topology, fault tolerance, low diameter, simple routing, and the ability to efficiently emulate other architectures. The major drawback of the hypercube network is that it cannot be expanded in practice, because the number of communication ports for each processor grows as the logarithm of the total number of processors in the system. Therefore, once a hypercube supercomputer of a certain dimensionality has been built, any future expansion can be accomplished only by replacing the VLSI chips. This is an undesirable feature, and a lot of work has been in progress to remove this obstacle and so provide a platform for easier expansion. Modified hypercubes (MHs) have been proposed as the building blocks of hypercube-based systems supporting incremental growth techniques without introducing extra resources for individual hypercubes. However, processor allocation on MHs proves to be a challenge due to a slight deviation in their topology from that of the standard hypercube network. This thesis addresses the issue of processor allocation on MHs and proposes various strategies which are based, partially or entirely, on table look-up approaches. A study of the various task allocation strategies for standard hypercubes is conducted and their suitability for MHs is evaluated. It is shown that the proposed strategies have a perfect subcube recognition ability and superior performance. Existing processor allocation strategies for pure hypercube networks are demonstrated to be ineffective for MHs, in light of their inability to recognize all available subcubes. A comparative analysis that involves the buddy strategy and the new strategies is carried out using simulation results.
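The limited subcube recognition of the buddy strategy can be seen even on a standard hypercube. The sketch below is an illustration under stated assumptions, not the thesis's table look-up method: it uses the usual formulation of the buddy strategy (take the first aligned block of 2^k consecutive free node labels) and compares it with an exhaustive search. The names `buddy_allocate` and `free_subcubes` are hypothetical, and the topology of modified hypercubes is not modeled.

```python
from itertools import combinations
from typing import List, Optional, Set


def buddy_allocate(busy: Set[int], n: int, k: int) -> Optional[List[int]]:
    """Return the first aligned block of 2^k free nodes in an n-cube, or None."""
    size = 1 << k
    for start in range(0, 1 << n, size):
        block = list(range(start, start + size))
        if busy.isdisjoint(block):
            return block
    return None


def free_subcubes(busy: Set[int], n: int, k: int) -> List[List[int]]:
    """Exhaustively list every completely free k-dimensional subcube."""
    found = []
    for dims in combinations(range(n), k):
        free_mask = sum(1 << d for d in dims)
        for base in range(1 << n):
            if base & free_mask:
                continue  # the base label has zeros in all varying dimensions
            nodes = sorted(
                base | sum(((v >> i) & 1) << d for i, d in enumerate(dims))
                for v in range(1 << k)
            )
            if busy.isdisjoint(nodes):
                found.append(nodes)
    return found
```

For example, in a 2-cube with nodes 1 and 3 busy, `buddy_allocate({1, 3}, 2, 1)` finds nothing because both aligned blocks {0, 1} and {2, 3} are partly busy, while `free_subcubes({1, 3}, 2, 1)` reports the free 1-dimensional subcube {0, 2}.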

    Optimal Embeddings of Paths with Various Lengths in Twisted Cubes

    Interconnection networks for parallel and distributed computing

    Parallel computers are generally either shared-memory machines or distributed-memory machines. There are currently technological limitations on shared-memory architectures, and so parallel computers utilizing a large number of processors tend to be distributed-memory machines. We are concerned solely with distributed-memory multiprocessors. In such machines, the dominant factor inhibiting faster global computations is inter-processor communication. Communication is dependent upon the topology of the interconnection network, the routing mechanism, the flow control policy, and the method of switching. We are concerned with issues relating to the topology of the interconnection network. The choice of how we connect processors in a distributed-memory multiprocessor is a fundamental design decision. There are numerous, often conflicting, considerations to bear in mind. However, there does not exist an interconnection network that is optimal on all counts, and trade-offs have to be made. A multitude of interconnection networks have been proposed, with each of these networks having some good (topological) properties and some not so good. Existing noteworthy networks include trees, fat-trees, meshes, cube-connected cycles, butterflies, Möbius cubes, hypercubes, augmented cubes, k-ary n-cubes, twisted cubes, n-star graphs, (n, k)-star graphs, alternating group graphs, de Bruijn networks, and bubble-sort graphs, to name but a few. We will mainly focus on k-ary n-cubes and (n, k)-star graphs in this thesis. Meanwhile, we propose a new interconnection network called augmented k-ary n-cubes. The following results are given in the thesis.
    1. Let $k \geq 4$ be even and let $n \geq 2$. Consider a faulty k-ary n-cube $Q_n^k$ in which the number of node faults $f_n$ and the number of link faults $f_e$ are such that $f_n + f_e \leq 2n - 2$. We prove that given any two healthy nodes $s$ and $e$ of $Q_n^k$, there is a path from $s$ to $e$ of length at least $k^n - 2f_n - 1$ (resp. $k^n - 2f_n - 2$) if the nodes $s$ and $e$ have different (resp. the same) parities (the parity of a node in $Q_n^k$ is the sum modulo 2 of the elements in the n-tuple over $\{0, 1, \ldots, k - 1\}$ representing the node). Our result is optimal in the sense that there are pairs of nodes and fault configurations for which these bounds cannot be improved, and it answers questions recently posed by Yang, Tan and Hsu, and by Fu. Furthermore, we extend known results, obtained by Kim and Park, for the case when $n = 2$.
    2. We give precise solutions to problems posed by Wang, An, Pan, Wang and Qu and by Hsieh, Lin and Huang. In particular, we show that $Q_n^k$ is bi-panconnected and edge-bipancyclic, when $k \geq 3$ and $n \geq 2$, and we also show that when $k$ is odd, $Q_n^k$ is $m$-panconnected, for $m = (n(k - 1) + 2k - 6)/2$, and $(k - 1)$-pancyclic (these bounds are optimal). We introduce a path-shortening technique, called progressive shortening, and strengthen existing results, showing that when paths are formed using progressive shortening then these paths can be efficiently constructed and used to solve a problem relating to the distributed simulation of linear arrays and cycles in a parallel machine whose interconnection network is $Q_n^k$, even in the presence of a faulty processor.
    3. We define an interconnection network $AQ_n^k$, which we call the augmented k-ary n-cube, by extending a k-ary n-cube in a manner analogous to the existing extension of an n-dimensional hypercube to an n-dimensional augmented cube. We prove that the augmented k-ary n-cube $AQ_n^k$ has a number of attractive properties (in the context of parallel computing). For example, we show that the augmented k-ary n-cube $AQ_n^k$: is a Cayley graph (and so is vertex-symmetric); has connectivity $4n - 2$, and is such that we can build a set of $4n - 2$ mutually disjoint paths joining any two distinct vertices so that the path of maximal length has length at most $\max\{(n - 1)k - (n - 2), k + 7\}$; has diameter $[k/3] + [(k - 1)/3]$, when $n = 2$; and has diameter at most $\frac{k}{4}(n + 1)$, for $n \geq 3$ and $k$ even, and at most $\frac{k}{4}(n + 1) + \frac{n}{4}$, for $n \geq 3$ and $k$ odd.
    4. We present an algorithm which, given a source node and a set of $n - 1$ target nodes in the (n, k)-star graph $S_{n,k}$ where all nodes are distinct, builds a collection of $n - 1$ node-disjoint paths, one from each target node to the source. The collection of paths output from the algorithm is such that each path has length at most $6k - 7$, and the algorithm has time complexity $O(k^3 n^4)$.
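The node parity used in the first result of this abstract can be made concrete with a small sketch. It assumes the standard definition of the k-ary n-cube, which the abstract does not spell out: nodes are n-tuples over {0, ..., k-1}, and two nodes are adjacent when they differ in exactly one coordinate by 1 modulo k. The function names `parity`, `neighbours`, and `links_always_flip_parity` are hypothetical.

```python
from itertools import product


def parity(node: tuple) -> int:
    """Sum modulo 2 of the coordinates of a node, as defined in the abstract."""
    return sum(node) % 2


def neighbours(node: tuple, k: int):
    """Yield the nodes adjacent to `node` (assumed standard k-ary n-cube links)."""
    for i, x in enumerate(node):
        for step in (1, -1):
            yield node[:i] + ((x + step) % k,) + node[i + 1:]


def links_always_flip_parity(k: int, n: int) -> bool:
    """Check whether every link joins two nodes of different parity."""
    return all(
        parity(u) != parity(v)
        for u in product(range(k), repeat=n)
        for v in neighbours(u, k)
    )
```

When k is even, every link joins nodes of opposite parity, so the network is bipartite; this is consistent with the abstract giving different path-length bounds depending on whether the two endpoints have the same or different parities. When k is odd, the wrap-around link between coordinate values k-1 and 0 preserves parity, and the check returns False.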