3,659 research outputs found

    Modeling and synthesis of multicomputer interconnection networks

    Get PDF
    The type of interconnection network employed has a profound effect on the performance of a multicomputer and multiprocessor design. Adequate models are needed to aid in the design and development of interconnection networks. A novel modeling approach using statistical and optimization techniques is described. This method represents an attempt to compare diverse interconnection network designs in a way that allows not only the best of existing designs to be identified but to suggest other, perhaps hybrid, networks that may offer better performance. Stepwise linear regression is used to develop a polynomial surface representation of performance in a (k+1) space with a total of k quantitative and qualitative independent variables describing graph-theoretic characteristics such as size, average degree, diameter, radius, girth, node-connectivity, edge-connectivity, minimum dominating set size, and maximum number of prime node and edge cutsets. Dependent variables used to measure performance are average message delay and the ratio of message completion rate to network connection cost. Response Surface Methodology (RSM) optimizes a response variable from a polynomial function of several independent variables. Steepest ascent path may also be used to approach optimum points

    Expanded delta networks for very large parallel computers

    Get PDF
    In this paper we analyze a generalization of the traditional delta network, introduced by Patel [21], and dubbed Expanded Delta Network (EDN). These networks provide in general multiple paths that can be exploited to reduce contention in the network resulting in increased performance. The crossbar and traditional delta networks are limiting cases of this class of networks. However, the delta network does not provide the multiple paths that the more general expanded delta networks provide, and crossbars are to costly to use for large networks. The EDNs are analyzed with respect to their routing capabilities in the MIMD and SIMD models of computation.The concepts of capacity and clustering are also addressed. In massively parallel SIMD computers, it is the trend to put a larger number processors on a chip, but due to I/O constraints only a subset of the total number of processors may have access to the network. This is introduced as a Restricted Access Expanded Delta Network of which the MasPar MP-1 router network is an example

    A multiarchitecture parallel-processing development environment

    Get PDF
    A description is given of the hardware and software of a multiprocessor test bed - the second generation Hypercluster system. The Hypercluster architecture consists of a standard hypercube distributed-memory topology, with multiprocessor shared-memory nodes. By using standard, off-the-shelf hardware, the system can be upgraded to use rapidly improving computer technology. The Hypercluster's multiarchitecture nature makes it suitable for researching parallel algorithms in computational field simulation applications (e.g., computational fluid dynamics). The dedicated test-bed environment of the Hypercluster and its custom-built software allows experiments with various parallel-processing concepts such as message passing algorithms, debugging tools, and computational 'steering'. Such research would be difficult, if not impossible, to achieve on shared, commercial systems

    A Pure Java Parallel Flow Solver

    Get PDF
    In this paper an overview is given on the "Have Java" project to attain a pure Java parallel Navier-Stokes flow solver (JParNSS) based on the thread concept and remote method invocation (RMI). The goal of this project is to produce an industrial flow solver running on an arbitrary sequential or parallel architecture, utilizing the Internet, capable of handling the most complex 3D geometries as well as flow physics, and also linking to codes in other areas such as aeroelasticity etc. Since Java is completely object-oriented the code has been written in an object-oriented programming (OOP) style. The code also includes a graphics user interface (GUI) as well as an interactive steering package for the parallel architecture. The Java OOP approach provides profoundly improved software productivity, robustness, and security as well as reusability and maintainability. OOP allows code construction similar to the aerodynamic design process because objects can be software coded and integrated, reflecting actual design procedures. In addition, Java is the programming language of the Internet and thus Java is the programming language of the Internet and thus Java objects on disparate machines or even separate networks can be connected. We explain the motivation for the design of JParNSS along with its capabilities that set it apart from other solvers. In the first two sections we present a discussion of the Java language as the programming tool for aerospace applications. In section three the objectives of the Have Java project are presented. In the next section the layer structures of JParNSS are discussed with emphasis on the parallelization and client-server (RMI) layers. JParNSS, like its predecessor ParNSS (ANSI-C), is based on the multiblock idea, and allows for arbitrarily complex topologies. Grids are accepted in GridPro property settings, grids of any size or block number can be directly read by JParNSS without any further modifications, requiring no additional preparation time for the solver input. In the last section, computational results are presented, with emphasis on multiprocessor Pentium and Sun parallel systems run by the Solaris operating system (OS)

    A conflict sense routing protocol and its performance for hypercubes

    Get PDF
    Includes bibliographical references (p. 23-24).Supported by NSF. NSF-DDM-8903385 Supported by ARO. DAAL03-86-K-0171by Emmanouel A. Varvarigos and Dimitri P. Bertsekas

    GPUVerify: A Verifier for GPU Kernels

    Get PDF
    We present a technique for verifying race- and divergence-freedom of GPU kernels that are written in mainstream ker-nel programming languages such as OpenCL and CUDA. Our approach is founded on a novel formal operational se-mantics for GPU programming termed synchronous, delayed visibility (SDV) semantics. The SDV semantics provides a precise definition of barrier divergence in GPU kernels and allows kernel verification to be reduced to analysis of a sequential program, thereby completely avoiding the need to reason about thread interleavings, and allowing existing modular techniques for program verification to be leveraged. We describe an efficient encoding for data race detection and propose a method for automatically inferring loop invari-ants required for verification. We have implemented these techniques as a practical verification tool, GPUVerify, which can be applied directly to OpenCL and CUDA source code. We evaluate GPUVerify with respect to a set of 163 kernels drawn from public and commercial sources. Our evaluation demonstrates that GPUVerify is capable of efficient, auto-matic verification of a large number of real-world kernels

    Design of a fault tolerant airborne digital computer. Volume 1: Architecture

    Get PDF
    This volume is concerned with the architecture of a fault tolerant digital computer for an advanced commercial aircraft. All of the computations of the aircraft, including those presently carried out by analogue techniques, are to be carried out in this digital computer. Among the important qualities of the computer are the following: (1) The capacity is to be matched to the aircraft environment. (2) The reliability is to be selectively matched to the criticality and deadline requirements of each of the computations. (3) The system is to be readily expandable. contractible, and (4) The design is to appropriate to post 1975 technology. Three candidate architectures are discussed and assessed in terms of the above qualities. Of the three candidates, a newly conceived architecture, Software Implemented Fault Tolerance (SIFT), provides the best match to the above qualities. In addition SIFT is particularly simple and believable. The other candidates, Bus Checker System (BUCS), also newly conceived in this project, and the Hopkins multiprocessor are potentially more efficient than SIFT in the use of redundancy, but otherwise are not as attractive
    corecore