1,099 research outputs found

    On the Area of Hypercube Layouts

    This paper precisely analyzes the wire density and required area in standard layout styles for the hypercube. The most natural, regular layout of a hypercube of N^2 nodes in the plane, in an N x N grid arrangement, uses floor(2N/3)+1 horizontal wiring tracks for each row of nodes. (The number of tracks per row can be reduced by 1 with a less regular design.) This paper also gives a simple formula for the wire density at any cut position and a full characterization of all places where the wire density is maximized (which does not occur at the bisection). Comment: 8 pages, 4 figures, LaTe
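As a rough illustration of the wire-density claim, the sketch below counts how many hypercube edges cross each gap between neighbouring nodes in a single row. It assumes the N = 2^k nodes of a row are placed left to right in binary order, which the abstract does not spell out, and the names `cut_densities` and `max_density` are hypothetical. It does not model track assignment, so it speaks only to the density, not to the floor(2N/3)+1 track count.

```python
def cut_densities(k):
    """Edges of the k-dimensional hypercube crossing each gap of one row.

    Assumption: nodes 0..N-1 (N = 2**k) sit left to right in binary order.
    Gap i lies between nodes i-1 and i, for i = 1..N-1.
    """
    n = 1 << k
    density = [0] * (n + 1)
    for x in range(n):
        for b in range(k):
            y = x ^ (1 << b)
            if x < y:  # count each edge once, from its left endpoint
                # The wire from x to y passes over gaps x+1 .. y.
                for gap in range(x + 1, y + 1):
                    density[gap] += 1
    return density[1:n]  # densities for gaps 1..N-1


def max_density(k):
    """Largest number of wires crossing any single gap."""
    return max(cut_densities(k))
```

For k = 3 (N = 8) the densities are 3, 4, 5, 4, 5, 4, 3: the maximum of 5 = floor(2N/3) falls beside the bisection, where the density is only 4.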

    Omniscopes: Large Area Telescope Arrays with only N log N Computational Cost

    We show that the class of antenna layouts for telescope arrays allowing cheap analysis hardware (with correlator cost scaling as N log N rather than N^2 with the number of antennas N) is encouragingly large, including not only previously discussed rectangular grids but also arbitrary hierarchies of such grids, with arbitrary rotations and shears at each level. We show that all correlations for such a 2D array with an n-level hierarchy can be efficiently computed via a Fast Fourier Transform in not 2 but 2n dimensions. This can allow major correlator cost reductions for science applications requiring exquisite sensitivity at widely separated angular scales, for example 21cm tomography (where short baselines are needed to probe the cosmological signal and long baselines are needed for point source removal), helping enable future 21cm experiments with thousands or millions of cheap dipole-like antennas. Such hierarchical grids combine the angular resolution advantage of traditional array layouts with the cost advantage of a rectangular Fast Fourier Transform Telescope. We also describe an algorithm by which a subclass of hierarchical arrays can efficiently use rotation synthesis to produce global sky maps with minimal noise and a well-characterized synthesized beam. Comment: Replaced to match accepted PRD version. 10 pages, 9 fig
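The N log N versus N^2 scaling can be illustrated in the simplest setting, a single 1D row of equally spaced antennas. This is a sketch under that assumption only; it is not the hierarchical 2n-dimensional scheme of the paper. The function names are hypothetical, and the per-baseline sum V[d] = sum_j s[j+d] * conj(s[j]) is one common convention, chosen here for illustration.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def correlate_direct(s):
    """V[d] = sum_j s[j+d] * conj(s[j]), pair by pair: O(N^2) products."""
    n = len(s)
    return [sum(s[j + d] * s[j].conjugate() for j in range(n - d))
            for d in range(n)]


def correlate_fft(s):
    """Same V[d] from a zero-padded FFT: O(N log N) operations."""
    n = len(s)
    m = 1
    while m < 2 * n:  # pad so that circular wrap-around cannot occur
        m *= 2
    spectrum = fft(list(s) + [0j] * (m - n))
    power = [abs(v) ** 2 for v in spectrum]
    corr = fft(power, inverse=True)
    return [corr[d] / m for d in range(n)]  # normalize the inverse transform
```

Both functions return one summed correlation per baseline length d, which is why a regular grid, where many antenna pairs share the same baseline, is what makes the FFT shortcut possible.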

    Fault-tolerant meshes and hypercubes with minimal numbers of spares

    Many parallel computers consist of processors connected in the form of a d-dimensional mesh or hypercube. Two- and three-dimensional meshes have been shown to be efficient in manipulating images and dense matrices, whereas hypercubes have been shown to be well suited to divide-and-conquer algorithms requiring global communication. However, even a single faulty processor or communication link can seriously affect the performance of these machines. This paper presents several techniques for tolerating faults in d-dimensional mesh and hypercube architectures. Our approach consists of adding spare processors and communication links so that the resulting architecture will contain a fault-free mesh or hypercube in the presence of faults. We optimize the cost of the fault-tolerant architecture by adding exactly k spare processors (while tolerating up to k processor and/or link faults) and minimizing the maximum number of links per processor. For example, when the desired architecture is a d-dimensional mesh and k = 1, we present a fault-tolerant architecture that has the same maximum degree as the desired architecture (namely, 2d) and has only one spare processor. We also present efficient layouts for fault-tolerant two- and three-dimensional meshes, and show how multiplexers and buses can be used to reduce the degree of fault-tolerant architectures. Finally, we give constructions for fault-tolerant tori, eight-connected meshes, and hexagonal meshes
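The k = 1 claim is easy to check by hand in the simplest case, d = 1, where the desired mesh is a linear array of n processors. A ring of n + 1 processors has maximum degree 2 = 2d, uses one spare, and still contains an n-node linear array after any single processor fails. The sketch below verifies this small case only; it is not presented as the paper's general construction, and the function names are hypothetical.

```python
def ring_with_spare(n):
    """Adjacency sets of a cycle on n + 1 nodes (n working nodes + 1 spare)."""
    size = n + 1
    return {v: {(v - 1) % size, (v + 1) % size} for v in range(size)}


def surviving_array(n, faulty):
    """List the n healthy nodes so that consecutive entries are linked.

    Walking around the ring from the node after `faulty` to the node
    before it visits every healthy node exactly once.
    """
    size = n + 1
    return [(faulty + step) % size for step in range(1, size)]
```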

    Tighter layouts of the cube-connected cycles

    Preparata and Vuillemin proposed the cube-connected cycles (CCC) and its compact layout in 1981 [17]. We give a new layout of the CCC which uses less than half the area of the Preparata-Vuillemin layout. We also give a lower bound on the layout area of the CCC. The area of the new layout deviates from this bound by a small constant factor. If we 'unfold' the cycles in the CCC, the resulting structure can be laid out in optimal area.
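For readers unfamiliar with the network being laid out, the following sketch builds the CCC graph from its standard definition: each corner of an n-dimensional hypercube is replaced by a cycle of n nodes, and the i-th node of a cycle carries that corner's hypercube edge along dimension i. The function name and the (i, x) node labelling are choices made here for illustration; the layouts themselves are not reproduced.

```python
def cube_connected_cycles(n):
    """Adjacency sets of CCC_n (intended for n >= 3).

    A node is a pair (i, x): cycle position 0 <= i < n at hypercube
    corner 0 <= x < 2**n.
    """
    graph = {}
    for x in range(1 << n):
        for i in range(n):
            graph[(i, x)] = {
                ((i + 1) % n, x),   # next node on the same cycle
                ((i - 1) % n, x),   # previous node on the same cycle
                (i, x ^ (1 << i)),  # hypercube edge along dimension i
            }
    return graph
```

The graph has n * 2^n nodes, each of degree 3, which is what makes the CCC attractive as a bounded-degree substitute for the hypercube.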

    A tight layout of the cube-connected cycles

    Preparata and Vuillemin proposed the cube-connected cycles (CCC) in 1981 [15], and in the same paper, gave an asymptotically optimal layout scheme for the CCC. We give a new layout scheme for the CCC which requires less than half of the area of the Preparata-Vuillemin layout. We also give a non-trivial lower bound on the layout area of the CCC. There is a constant factor of 2 between the new layout and the lower bound. We conjecture that the new layout is optimal (minimal).

    On crossing numbers of hypercubes and cube connected cycles

    Recently, hypercube-like networks have received considerable attention in the field of parallel computing due to their high potential for system availability and parallel execution of algorithms. The crossing number ${\rm cr}(G)$ of a graph $G$ is defined as the least number of crossings of its edges when $G$ is drawn in a plane. Crossing numbers naturally appear in the fabrication of VLSI circuits and provide a good area lower bound argument in VLSI complexity theory. According to the survey paper of Harary et al., all that is known on the exact values of the crossing number ${\rm cr}(Q_n)$ of an n-dimensional hypercube is ${\rm cr}(Q_3)=0$, ${\rm cr}(Q_4)=8$ and ${\rm cr}(Q_5)\le 56$. We prove the following tight bounds on ${\rm cr}(Q_n)$ and ${\rm cr}(CCC_n)$: $\frac{4^n}{20} - (n+1)2^{n-2} < {\rm cr}(Q_n) < \frac{4^n}{6} - n^2 2^{n-3}$ and $\frac{4^n}{20} - 3(n+1)2^{n-2} < {\rm cr}(CCC_n) < \frac{4^n}{6} + 3n^2 2^{n-3}$. Our lower bounds on ${\rm cr}(Q_n)$ and ${\rm cr}(CCC_n)$ immediately give alternative proofs that the area complexity of hypercube and CCC computers realized on VLSI circuits is $A=\Omega(4^n)$.
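To get a feel for the stated bounds, they can be evaluated exactly for small n. The sketch below only plugs numbers into the two inequalities quoted in the abstract; the function names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def qn_bounds(n):
    """(lower, upper) with lower < cr(Q_n) < upper, as stated in the abstract."""
    lower = Fraction(4**n, 20) - (n + 1) * Fraction(2) ** (n - 2)
    upper = Fraction(4**n, 6) - n**2 * Fraction(2) ** (n - 3)
    return lower, upper


def ccc_bounds(n):
    """(lower, upper) with lower < cr(CCC_n) < upper, as stated in the abstract."""
    lower = Fraction(4**n, 20) - 3 * (n + 1) * Fraction(2) ** (n - 2)
    upper = Fraction(4**n, 6) + 3 * n**2 * Fraction(2) ** (n - 3)
    return lower, upper
```

For n = 4 this gives -36/5 < cr(Q_4) < 32/3, consistent with the quoted value cr(Q_4) = 8. For n = 5 it gives 16/5 < cr(Q_5) < 212/3, so the lower bound only becomes informative as n grows, where the 4^n term dominates.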

    A Comprehensive Study of Module Layouts for Silicon Solar Cells Under Partial Shading

    Integrated applications for solar energy production are becoming increasingly important. The electrification of car bodies and building facades are only two prominent examples. In such applications, shading becomes a challenging problem, since the power output of the classic serial interconnection of solar cells is highly vulnerable to partial shading. In this article, we investigate the three most common module layouts in the market (conventional, butterfly, and shingle string) and add a fourth layout (shingle matrix) to be introduced to the market in the future. We discuss an approach to cluster shadings occurring in urban surroundings into basic shapes like “rectangular” and “random”. Choosing a Monte Carlo technique in combination with Latin hypercube sampling (LHS), we consider more than 3000 scenarios in total. For the evaluation of the scenarios, we conduct circuit simulations using LTspice. Furthermore, we define a normalization base, which considers only partial shading as a quantitative baseline for comparison. Our results show that the obtained output values already stabilize after 200–400 scenarios. Among the investigated module layouts, the shingle matrix interconnection achieves the highest score, followed by the shingle string, the half-cell butterfly, and the conventional full-cell layout.
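Latin hypercube sampling, the scenario-generation technique named above, can be sketched in a few lines. This is a generic version on the unit cube; which shading parameters the authors sample, and over what ranges, is not stated in the abstract, so the dimensions here are placeholders and the function name is hypothetical.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=0):
    """Return n_samples points in [0, 1)^n_dims.

    Along every dimension, each of the n_samples equal-width strata
    contains exactly one point.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One random point inside each stratum [i/n, (i+1)/n) ...
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # ... then shuffle so strata are paired randomly across dimensions.
        rng.shuffle(col)
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]
```

Compared with plain random sampling, the stratification spreads the points evenly along every parameter axis, which is one reason such studies can stabilize with a few hundred scenarios.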

    A parametrized sorting System for a large set of k-bit elements

    In this paper, we describe a parametrized sorting system for a large set of k-bit elements. The structure of the system is independent of the problem size (the number of elements to be sorted) and of the type of the sorting set (for example, a set of k-bit numbers or an alphabetical list of k-bit words), as well as of the ordering relation defined on the set of elements (such as ascending or descending order of k-bit numbers, or a specific order of alphabetical words). The general structure of the underlying parallel network is based on the n-dimensional hypercube. The node circuit construction defines the type of the sorting elements, thus defining the semantics of the system. The structure of the circuit implements the Columnsort algorithm introduced by Leighton in [Lei85]. By changing only one subcircuit of size O(k) in the node, we can define different ordering relations on the sorted elements. The system is based on specific VLSI chips that were developed in [Gam96] with the CAD system Cadic [Bur95], which was developed in the project B1 "VLSI design systems and parallelity" under the guidance of Prof. G. Hotz. The result is a fast system that sorts sets of up to 2^28 64-bit numbers. The maximal sorting time is less than 43.6 seconds, which is better than some of the fastest software realizations implemented on a 32-processor Paragon ([Hard96]), Cray Y-MP ([ZagBlel91]) and MasPar MP-1 ([BrockWan97]).
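The abstract names Columnsort without describing it. The sketch below is a sequential rendering of Leighton's eight steps on an r x s matrix, intended only to show the data movement; it is not the hardware design, and the function name is hypothetical. It assumes r is even, s divides r, and r >= 2*s*s, a conservative form of the size condition, and that the input contains no infinite values.

```python
def columnsort(items, r, s):
    """Sort r*s items with the eight Columnsort steps (sequential sketch).

    The r x s matrix is kept as one flat list in column-major order.
    """
    assert len(items) == r * s
    assert r % 2 == 0 and r % s == 0 and r >= 2 * s * s

    def sort_columns(flat):
        out = []
        for start in range(0, len(flat), r):
            out.extend(sorted(flat[start:start + r]))
        return out

    def transpose(flat):
        # Pick items up in column-major order, lay them down in row-major order.
        return [flat[row * s + col] for col in range(s) for row in range(r)]

    def untranspose(flat):
        # Inverse of transpose: pick up row-major, lay down column-major.
        return [flat[col * r + row] for row in range(r) for col in range(s)]

    flat = sort_columns(list(items))   # step 1
    flat = transpose(flat)             # step 2
    flat = sort_columns(flat)          # step 3
    flat = untranspose(flat)           # step 4
    flat = sort_columns(flat)          # step 5
    half = r // 2
    # Step 6: shift by half a column, padding with sentinels (s + 1 columns).
    padded = [float("-inf")] * half + flat + [float("inf")] * half
    padded = sort_columns(padded)      # step 7
    return padded[half:half + r * s]   # step 8: unshift, drop the sentinels
```

Only the column sorts compare elements; every other step is a fixed permutation, which is what makes the algorithm suited to a fixed network where each node swaps in a different comparison subcircuit.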