6 research outputs found

    Parallel Algorithms for Spatial Rainfall Distribution

    This paper proposes parallel algorithms for the precipitation input to flood modelling, applied specifically to spatial rainfall distribution. As an important input to flood modelling, the spatial distribution of rainfall is always needed as a pre-conditioned model. Two interpolation methods are discussed: inverse distance weighting (IDW) and ordinary kriging (OK). Both are implemented as parallel algorithms in order to reduce computation time. To measure computational efficiency, the performance of the parallel algorithms is compared with that of the serial algorithms for both methods. Findings indicate that: (1) the computation time of the OK algorithm is up to 23% longer than that of IDW; (2) the computation time of the OK and IDW algorithms increases linearly with the number of cells/points; (3) the computation time of the parallel algorithms for both methods decays exponentially with the number of processors, with a decay factor of 0.52 for IDW and 0.53 for OK; (4) the parallel algorithms achieve near-ideal speed-up.
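As an illustration of why IDW lends itself to parallel execution, here is a minimal serial sketch in Python. It is not the paper's implementation: the function names `idw_interpolate` and `idw_grid`, the `(x, y, value)` gauge layout, and the default exponent `power=2.0` are all assumptions made for this example. Each grid cell is estimated independently of every other cell, which is the property a parallel version could exploit by splitting the cell list across processors.

```python
import math


def idw_interpolate(gauges, cell, power=2.0):
    """Estimate rainfall at `cell` (x, y) from gauges given as (x, y, value).

    Hypothetical helper: weights are 1 / distance**power (assumed form).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for gx, gy, value in gauges:
        dist = math.hypot(cell[0] - gx, cell[1] - gy)
        if dist == 0.0:
            # The cell coincides with a gauge, so use the measured value.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def idw_grid(gauges, cells, power=2.0):
    """Interpolate every cell; each iteration is independent of the others."""
    return [idw_interpolate(gauges, cell, power) for cell in cells]


if __name__ == "__main__":
    gauges = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
    cells = [(1.0, 0.0), (0.0, 0.0)]
    print(idw_grid(gauges, cells))
```

Because the loop in `idw_grid` carries no state between cells, the cell list could be partitioned into chunks and handed to separate workers. How the paper actually distributes the work is not stated in the abstract.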

    Performance Analysis of Numerical Problems on a Loosely Coupled System

    Coordinated Science Laboratory was formerly known as Control Systems Laboratory. Semiconductor Research Corporation / 86-12-109. Texas Instruments, Inc.

    Fast Fourier Transform algorithm design and tradeoffs

    The Fast Fourier Transform (FFT) is a mainstay of certain numerical techniques for solving fluid dynamics problems. The Connection Machine CM-2 is the target for an investigation into the design of multidimensional Single Instruction Stream/Multiple Data (SIMD) parallel FFT algorithms for high performance. Critical algorithm design issues are discussed, the necessary machine performance measurements are identified and made, and the performance of the developed FFT programs is measured. The FFT programs are compared with the best current Cray-2 FFT program.

    Experimental Benchmarks and Initial Evaluation of the Performance of the PASM System Prototype

    The work reported here represents experiences with the PASM parallel processing system prototype during its first operational year. Most of the experiments were performed by students in the Fall semester of 1987. The first programming, and the first timing measurements, were made during the summer of 1987 by Sam Fineberg. The goal of the collection of experiments presented here was to undertake an Application-driven Architecture Study of the PASM system as a paradigm for parallel architecture evaluation in general. PASM was an excellent vehicle for experimenting with this evaluation technique due to its unique architectural features. Among these are: 1. A reconfigurable, partitionable multistage circuit-switched network. 2. Support for both SIMD and MIMD programs. 3. Ability to execute hybrid SIMD/MIMD programs. 4. An instruction queue which allows overlap of control-flow and data manipulation between micro-control (MC) units and processing elements (PE). It had been hypothesized that superlinear speed-up over the number of PEs could be attained with this feature, and experimental results verified this. 5. Support for barrier synchronization of MIMD tasks. This feature was exploited in some non-standard ways to show the ability to decouple variant length SIMD instructions into multiple MIMD streams for an overall performance benefit. This type of study is expected to continue in the future on PASM and other parallel machines at Purdue. This report should serve as a guide for this future work as well.

    Automatic visual recognition using parallel machines

    Invariant features and quick matching algorithms are two major concerns in the area of automatic visual recognition. The former reduces the size of an established model database, and the latter shortens the computation time. This dissertation discusses both line invariants under perspective projection and the parallel implementation of a dynamic programming technique for shape recognition. The feasibility of using parallel machines is demonstrated through the dramatically reduced time complexity. The algorithms are implemented on the AP1000 MIMD parallel machine. For processing an object with n features, the time complexity of the proposed parallel algorithm is O(n), while that of a uniprocessor is O(n^2). Two applications, one for shape matching and the other for chain-code extraction, are used to demonstrate the usefulness of the methods. Invariants from four general lines under perspective projection are also discussed. In contrast to the approach that uses epipolar geometry, we investigate the invariants under isotropy subgroups. Theoretically speaking, two independent invariants can be found for four general lines in 3D space. In practice, we show how to obtain these two invariants from the projective images of four general lines without the need for camera calibration. A projective invariant recognition system based on a hypothesis-generation-testing scheme is run on the hypercube parallel architecture. Object recognition is achieved by matching the scene projective invariants to the model projective invariants, called transfer. Then a hypothesis-generation-testing scheme is implemented on the hypercube parallel architecture.

    FFT algorithms for SIMD parallel processing systems

    SIMD (single instruction stream-multiple data stream) algorithms for one- and two-dimensional discrete Fourier transforms are presented. Parallel structurings of algorithms for efficient computation for a variety of machine size/problem size combinations are presented and analyzed. Through these algorithms, techniques for exploiting relationships between problem size and machine size are demonstrated. The algorithms are evaluated in terms of the number of arithmetic operations and interprocessor data transfers required. The ability of various interconnection networks presented in the literature to perform the needed transfers is examined. It is shown that the efficiency of a particular data distribution/algorithm decomposition approach is a function of the machine-size/problem-size relationship. © 1986 Academic Press, Inc.
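To make the data-parallel structure of an FFT concrete, here is a minimal sketch of a textbook iterative radix-2 transform in Python. It is not the algorithm from either FFT paper listed above: the function name `fft_radix2` and the power-of-two length restriction are assumptions of this example. Within each stage, every butterfly reads and writes its own pair of elements, so on a SIMD machine all butterflies of a stage could in principle execute in lockstep, with the pairing distance (`half`) determining which interprocessor transfers are needed.

```python
import cmath


def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [complex(v) for v in x]

    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    # log2(n) stages; butterflies inside one stage touch disjoint pairs.
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w
                a[start + k] = u + v
                a[start + k + half] = u - v
                w *= w_step
        size *= 2
    return a


if __name__ == "__main__":
    print(fft_radix2([1, 0, 0, 0]))
```

The two inner loops are written serially here. A parallel version would map the `n / 2` butterflies of each stage onto processing elements, which is where the machine-size/problem-size relationship mentioned in the abstract comes into play.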