
    Parallel Computation on Hypercube-Like Machines.

    The hypercube interconnection network has long been recognized as well suited to parallel computing architectures because of its attractive topological properties. Recently, several modified hypercubes have been proposed to improve on the performance of the hypercube. This dissertation deals with two such modifications, the X-hypercube and the Z-cube. The X-hypercube is a variant of the hypercube with the same amount of hardware but a diameter of only ⌈(n + 1)/2⌉ in dimension n. The Z-cube has only 75 percent of the edges of a hypercube with the same number of vertices, and the same diameter as the hypercube. In this dissertation, we investigate topological properties and the effectiveness of the X-hypercube and the Z-cube in their combinatorial and computational aspects. We give optimal or nearly optimal data-communication algorithms, including routing, broadcasting, and the census function, for the X-hypercube and the Z-cube. We also give optimal embedding algorithms between the X-hypercube and the hypercube. It is shown that the average distance between vertices in an X-hypercube is roughly 13/16 of that in a hypercube, which implies that an X-hypercube achieves better average communication performance than a hypercube. In addition, a set of fundamental SIMD algorithms for an X-hypercube is given. Our results indicate that the X-hypercube improves on the hypercube's performance, though not by as much as the reduction in diameter, and that the Z-cube is a good alternative to the hypercube when VLSI implementation is of major concern.
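
    By way of illustration (not from the dissertation): in a standard hypercube, a message is routed by flipping, one at a time, the address bits in which source and destination differ, so the distance between two nodes is the Hamming distance of their labels. The Python sketch below verifies by brute force that the n-cube has diameter n and average inter-vertex distance close to n/2, the baseline against which the 13/16 figure for the X-hypercube is measured; the X-hypercube's modified adjacency itself is not reconstructed here.

        def hypercube_route(src, dst):
            """E-cube routing: flip differing address bits, lowest first."""
            path, node, diff, bit = [src], src, src ^ dst, 0
            while diff:
                if diff & 1:
                    node ^= 1 << bit        # correct this address bit
                    path.append(node)
                diff >>= 1
                bit += 1
            return path

        def average_distance(n):
            """Mean Hamming distance over all ordered pairs of n-cube vertices."""
            size = 1 << n
            total = sum(bin(u ^ v).count("1")
                        for u in range(size) for v in range(size))
            return total / (size * (size - 1))

        print(len(hypercube_route(0b000, 0b101)) - 1)  # distance 2 in the 3-cube
        print(average_distance(6))                     # close to n/2 = 3.0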

    Interconnection Networks Embeddings and Efficient Parallel Computations.

    To obtain greater performance, many processors are allowed to cooperate on a single problem. These processors communicate via an interconnection network or a bus. The most essential function of the underlying interconnection network is the efficient exchange of messages between processes on different processors. Parallel machines based on the hypercube topology have earned great respect in parallel computation because of the topology's many attractive properties. Many variants of the hypercube have been introduced, mainly to enhance communication. The twisted hypercube is one of the most attractive of these variants: it preserves the important features of the hypercube while reducing its diameter by a factor of two. This dissertation investigates relations and transformations between various interconnection networks and the twisted hypercube and explores its efficiency in parallel computation. The capability of the twisted hypercube to simulate complete binary trees, complete quad trees, and rings is demonstrated and compared with that of the hypercube. Finally, the fault tolerance of the twisted hypercube is investigated. We present optimal algorithms to simulate rings in a faulty twisted-hypercube environment and compare them with their hypercube counterparts.
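
    The ring simulations mentioned above build on a classical fact worth making concrete: an n-bit reflected Gray code visits every vertex of the n-cube while changing exactly one bit per step, i.e. it embeds a ring of 2^n nodes with dilation 1. A minimal Python sketch of this standard construction follows; it is illustrative only, not the dissertation's twisted-hypercube or fault-tolerant algorithm.

        def gray_code(n):
            """Reflected Gray code: 2**n labels, neighbours differ in one bit."""
            return [i ^ (i >> 1) for i in range(1 << n)]

        def is_ring_embedding(seq):
            """Check every step, including the wrap-around, flips exactly one bit."""
            return all(bin(a ^ b).count("1") == 1
                       for a, b in zip(seq, seq[1:] + seq[:1]))

        assert is_ring_embedding(gray_code(4))  # a 16-node ring in the 4-cube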

    Properties and algorithms of the hyper-star graph and its related graphs

    The hyper-star interconnection network was proposed in 2002 to overcome the drawbacks of the hypercube and its variations concerning network cost, defined as the product of the degree and the diameter. Properties of the graph such as connectivity, symmetry, and embeddings have been studied by other researchers, and routing and broadcasting algorithms have been designed. This thesis studies the hyper-star graph from both the topological and the algorithmic point of view. On the topological side, we establish relationships between hyper-star graphs and other known graphs, give a formal equation for the surface area of the graph, and examine its Hamiltonicity. On the algorithmic side, we design an all-port broadcasting algorithm and a single-port neighbourhood broadcasting algorithm for the regular form of the hyper-star graph; both algorithms are time-optimal. Furthermore, we prove that the folded hyper-star, a variation of the hyper-star, is maximally fault-tolerant.
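
    The network-cost figure of merit used here is simply degree times diameter, and both factors can be checked by brute force on small instances. The Python sketch below computes the cost of an n-cube for comparison; it is generic graph code offered as illustration and does not reconstruct the hyper-star adjacency itself.

        from collections import deque

        def eccentricity(adj, start):
            """Longest shortest-path distance from `start` (BFS)."""
            dist = {start: 0}
            queue = deque([start])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        queue.append(v)
            return max(dist.values())

        def network_cost(adj):
            """Degree * diameter, the cost measure quoted in the abstract."""
            degree = max(len(nbrs) for nbrs in adj.values())
            diameter = max(eccentricity(adj, u) for u in adj)
            return degree * diameter

        n = 4  # the n-cube has degree n and diameter n, hence cost n**2
        cube = {u: [u ^ (1 << b) for b in range(n)] for u in range(1 << n)}
        print(network_cost(cube))  # 16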

    The connection machine

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1988. Bibliography: leaves 134-157. By William Daniel Hillis.

    Small-world interconnection networks for large parallel computer systems

    The use of small-world graphs as interconnection networks for multicomputers is proposed and analysed in this work. Small-world interconnection networks are constructed by adding (or modifying) edges of an underlying local graph. Graphs with a rich local structure but a large diameter are shown to be the most suitable candidates for the underlying graph. Generation models based on random and deterministic wiring processes are proposed and analysed. For the random case, basic properties such as degree, diameter, average path length, and bisection width are analysed, and the results show a fast transition from a large diameter to a small diameter as the number of new edges increases. Random-traffic analysis on these networks shows that although the average latency falls correspondingly, networks with a small number of shortcuts tend to saturate because most of the traffic flows through a small number of links. An analysis of network congestion corroborates this result and provides a way of estimating the minimum number of shortcuts required to avoid saturation. To overcome these problems, deterministic wiring is proposed and analysed. A linear feedback shift register is used to introduce shortcuts in the LFSR graphs. A simple routing algorithm is constructed for the LFSR graphs and extended with a greedy local-optimisation technique; a small search depth gives good results and is less costly to implement than a full shortest-path algorithm. The Hilbert graph, on the other hand, provides additional characteristics such as support for incremental expansion, an efficient layout in two-dimensional space (using two layers), and a small fixed degree of four. Small-world hypergraphs have also been studied. In particular, incomplete hypermeshes have been introduced and analysed, and it is shown that they outperform the complete traditional implementations under a constant-pinout argument. Since complete hypermeshes have been shown to outperform the mesh, the torus, low-dimensional m-ary d-cubes (with and without bypass channels), and multi-stage interconnection networks (when realistic decision times are accounted for and with constant pinout), it follows that incomplete hypermeshes outperform them as well.
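
    The fast diameter transition described above is easy to reproduce in miniature: start from a ring, which has a rich local structure but diameter N/2, and add random shortcut edges, measuring the exact diameter after each batch. The Python sketch below uses random wiring only; the LFSR and Hilbert constructions from the thesis are not reproduced.

        import random
        from collections import deque

        def diameter(adj):
            """Exact diameter via BFS from every vertex (fine at this scale)."""
            def ecc(s):
                dist = {s: 0}
                q = deque([s])
                while q:
                    u = q.popleft()
                    for v in adj[u]:
                        if v not in dist:
                            dist[v] = dist[u] + 1
                            q.append(v)
                return max(dist.values())
            return max(ecc(u) for u in adj)

        N = 128
        ring = {u: {(u - 1) % N, (u + 1) % N} for u in range(N)}  # local graph
        for shortcuts in (0, 4, 16, 64):
            adj = {u: set(nbrs) for u, nbrs in ring.items()}
            for _ in range(shortcuts):
                u, v = random.sample(range(N), 2)  # one random shortcut
                adj[u].add(v)
                adj[v].add(u)
            print(shortcuts, diameter(adj))  # diameter collapses quickly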

    SNARKs for C: Verifying Program Executions Succinctly and in Zero Knowledge

    An argument system for NP is a proof system that allows efficient verification of NP statements, given proofs produced by an untrusted yet computationally-bounded prover. Such a system is non-interactive and publicly-verifiable if, after a trusted party publishes a proving key and a verification key, anyone can use the proving key to generate non-interactive proofs for adaptively-chosen NP statements, and proofs can be verified by anyone by using the verification key. We present an implementation of a publicly-verifiable non-interactive argument system for NP. The system, moreover, is a zero-knowledge proof-of-knowledge. It directly proves correct executions of programs on TinyRAM, a random-access machine tailored for efficient verification of nondeterministic computations. Given a program P and time bound T, the system allows for proving correct execution of P, on any input x, for up to T steps, after a one-time setup requiring Õ(|P| T) cryptographic operations. An honest prover requires Õ(|P| · T) cryptographic operations to generate such a proof, while proof verification can be performed with only O(|x|) cryptographic operations. This system can be used to prove the correct execution of C programs, using our TinyRAM port of the GCC compiler. This yields a zero-knowledge Succinct Non-interactive ARgument of Knowledge (zk-SNARK) for program executions in the preprocessing model -- a powerful solution for delegating NP computations, with several features not achieved by previously-implemented primitives. Our approach builds on recent theoretical progress in the area. We present efficiency improvements and implementations of two main ingredients:
    * Given a C program, we produce a circuit whose satisfiability encodes the correctness of execution of the program. Leveraging nondeterminism, the generated circuit's size is merely quasilinear in the size of the computation. In particular, we efficiently handle arbitrary and data-dependent loops, control flow, and memory accesses. This is in contrast with existing "circuit generators", which in the general case produce circuits of quadratic size.
    * Given a linear PCP for verifying satisfiability of circuits, we produce a corresponding SNARK. We construct such a linear PCP (which, moreover, is zero-knowledge and very efficient) by building on and improving on recent work on quadratic arithmetic programs.
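
    At the core of the linear-PCP route to SNARKs is the encoding of a computation as arithmetic constraints over a finite field, each of the rank-1 form (A_i · z)(B_i · z) = (C_i · z) for a witness vector z. The toy Python sketch below checks such a constraint system; it illustrates only the underlying idea and is in no way the paper's TinyRAM-to-circuit pipeline.

        P = 2**31 - 1  # a small prime field, chosen only for illustration

        def dot(row, z):
            return sum(r * x for r, x in zip(row, z)) % P

        def r1cs_satisfied(A, B, C, z):
            """Check (A_i . z) * (B_i . z) == (C_i . z) mod P for all i."""
            return all(dot(a, z) * dot(b, z) % P == dot(c, z)
                       for a, b, c in zip(A, B, C))

        # One constraint proving knowledge of x with x * x = 9,
        # over the witness vector z = (1, x, x*x).
        A = [[0, 1, 0]]
        B = [[0, 1, 0]]
        C = [[0, 0, 1]]
        print(r1cs_satisfied(A, B, C, [1, 3, 9]))  # True
        print(r1cs_satisfied(A, B, C, [1, 4, 9]))  # False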

    Real-time sound synthesis on a multi-processor platform

    Real-time sound synthesis means that the calculation and output of each sound sample for a channel of audio information must be completed within a sample period. At the broadcasting-standard sampling rate of 32,000 Hz, the maximum period available is 31.25 μs. Such requirements demand a large amount of data-processing power, and an effective solution is a multi-processor platform: a parallel and distributed processing system. The suitability of the MIDI (Musical Instrument Digital Interface) standard, published in 1983, as a controller for real-time applications is examined. Many musicians have expressed doubts about the decade-old standard's fitness for real-time performance; these doubts are investigated by measuring the timing of various musical gestures and comparing the measurements with the subjective characteristics of human perception. The implementation and optimisation of real-time additive-synthesis programs on a multi-transputer network are described. A prototype 81-polyphonic-note organ configuration was implemented. By devising and deploying monitoring processes, the network's performance was measured and enhanced, leading to more efficient usage: an 88-note configuration. Since 88 simultaneous notes are rarely necessary in most performances, a scheduling program for dynamic note allocation was introduced to achieve further efficiency gains. To reduce calculation redundancy still further, a multi-sampling-rate approach was applied as a final optimisation step. The theory underlying sound granulation, a means of constructing complex sounds from grains, and the real-time implementation of this technique are outlined. The idea of sound granulation is quite similar to the quantum-wave notion of "acoustic quanta". Despite its conceptual simplicity, the signal-processing requirements are demanding, providing a challenge for this audio synthesis engine. Three issues arising from the implementations above are discussed: the efficiency of the applications implemented, provisions for new processors, and an optimal network architecture for sound synthesis.
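
    The real-time budget is plain arithmetic: at 32,000 Hz a sample must be delivered every 1/32000 s = 31.25 μs, and additive synthesis must evaluate and sum all partials of every sounding note within it. The Python sketch below shows the per-sample computation; the figures and the partial list are illustrative, not the thesis's transputer implementation.

        import math

        RATE = 32_000            # broadcasting-standard sampling rate, Hz
        BUDGET_US = 1e6 / RATE   # 31.25 microseconds per sample

        def additive_sample(t, partials):
            """One output sample: a sum of (amplitude, freq_hz, phase) partials."""
            return sum(a * math.sin(2 * math.pi * f * t + ph)
                       for a, f, ph in partials)

        # e.g. an organ-like tone built from eight harmonics of 440 Hz
        partials = [(1.0 / k, 440.0 * k, 0.0) for k in range(1, 9)]
        print(additive_sample(1.0 / RATE, partials), BUDGET_US)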

    Electromagnetic corrections to pseudoscalar decay constants

    First-principles lattice quantum chromodynamics (LQCD) calculations enable the determination of low-energy hadronic amplitudes. Precision LQCD calculations with relative errors smaller than approximately 1% require the inclusion of electromagnetic effects. We demonstrate that including (quenched) quantum electrodynamics effects in the LQCD calculation affects the values obtained for pseudoscalar decay constants at the per-mille level. The importance of systematic effects, including finite-volume effects and the charge dependence of renormalization and improvement coefficients, is highlighted.

    Technology 2000, volume 1

    The purpose of the conference was to increase awareness of existing NASA-developed technologies that are available for immediate use in the development of new products and processes, and to lay the groundwork for the effective utilization of emerging technologies. There were sessions on the following: Computer technology and software engineering; Human factors engineering and life sciences; Information and data management; Material sciences; Manufacturing and fabrication technology; Power, energy, and control systems; Robotics; Sensors and measurement technology; Artificial intelligence; Environmental technology; Optics and communications; and Superconductivity.