7 research outputs found
On the average running time of odd-even merge sort
This paper is concerned with the average running time of Batcher's odd-even merge sort when implemented on a collection of processors. We consider the case where , the size of the input, is an arbitrary multiple of the number of processors used. We show that Batcher's odd-even merge (for two sorted lists of length each) can be implemented to run in time on the average, and that odd-even merge sort can be implemented to run in time on the average. In the case of merging (sorting), the average is taken over all possible outcomes of the merging (all possible permutations of elements). That means that odd-even merge and odd-even merge sort have an optimal average running time if . The constants involved are also quite small
A taxonomy of parallel sorting
TR 84-601In this paper, we propose a taxonomy of parallel sorting that includes a broad range of array
and file sorting algorithms. We analyze the evolution of research on parallel sorting, from the
earliest sorting networks to the shared memory algorithms and the VLSI sorters. In the context
of sorting networks, we describe two fundamental parallel merging schemes - the odd-even and
the bitonic merge. Sorting algorithms have been derived from these merging algorithms for parallel
computers where processors communicate through interconnection networks such as the perfect
shuffle, the mesh and a number of other sparse networks. After describing the network sorting
algorithms, we show that, with a shared memory model of parallel computation, faster algorithms
have been derived from parallel enumeration sorting schemes, where keys are first ranked and
then rearranged according to their rank
Parallel Computation on Hypercube-Like Machines.
The hypercube interconnection network has been recognized to be very suitable for a parallel computing architecture due to its attractive topological properties. Recently, several modified hypercubes have been propose to improve the performance of a hypercube. This dissertation deals with two modified hypercubes, the X-hypercube and the Z-cube. The X-hypercube is a variant of the hypercube, with the same amount of hardware but a diameter of only (n + 1)/2 in a hypercube of dimension n. The Z-cube has only 75 percent of the edges of a hypercube with the same number vertices and the same diameter as the hypercube. In this dissertation, we investigate some topological properties and the effectiveness of the X-hypercube and the Z-cube in their combinatorial and computational aspects. We give the optimal or nearly optimal data communication algorithms including routing, broadcasting, and census function for the X-hypercube and the Z-cube. We also give the optimal embedding algorithms between the X-hypercube and the hypercube. It is shown that the average distance between vertices in a X-hypercube is roughly 13/16 of that in a hypercube. This implies that a X-hypercube achieves the better average communication performance than a hypercube. In addition, a set of fundamental SIMD algorithms for a X-hypercube is given. Our results indicate that the X-hypercube makes an improvement in performance over the hypercube, but not as much as the reduction in a diameter, and the Z-cube is a good alternative for the hypercube as far as the VLSI implementation is of major concern
New Techniques in Scene Understanding and Parallel Image Processing.
There has been tremendous research interest in the areas of computer and robotic vision. Scene understanding and parallel image processing are important paradigms in computer vision. New techniques are presented to solve some of the problems in these paradigms. Automatic interpretation of features in a natural scene is the focus of the first part of the dissertation. The proposed interpretation technique consists of a context dependent feature labeling algorithm using non linear probabilistic relaxation, and an expert system. Traditionally, the output of the labeling is analyzed, and then recognized by a high level interpreter. In this new approach, the knowledge about the scene is utilized to resolve the inconsistencies introduced by the labeling algorithm. A feature labeling system based on this hybrid technique is designed and developed. The labeling system plays a vital role in the development of an automatic image interpretation system for oceanographic satellite images. An extensive study on the existing interpretation techniques has been made in the related areas such as remote sensing, medical diagnosis, astronomy, and oceanography and has shown that our hybrid approach is unique and powerful. The second part of the dissertation presents the results in the area of parallel image processing. A new approach for parallelizing vision tasks in the low and intermediate levels is introduced. The technique utilizes schemes to embed the inherent data or computational structure, used to solve the problem, into parallel architectures such as hypercubes. The important characteristic of the technique is that the adjacent pixels in the image are mapped to nodes that are at a constant distance in the hypercube. Using the technique, parallel algorithms for neighbor-finding and digital distances are developed. A parallel hypercube sorting algorithm is obtained as an illustration of the technique. The research in developing these embedding algorithms has paved the way for efficient reconfiguration algorithms for hypercube architectures
Parallel alogorithms for MIMD parallel computers
This thesis mainly covers the design and analysis of asynchronous
parallel algorithms that can be run on MIMD (Multiple Instruction
Multiple Data) parallel computers, in particular the NEPTUNE system at
Loughborough University. Initially the fundamentals of parallel computer
architectures are introduced with different parallel architectures being
described and compared. The principles of parallel programming and the
design of parallel algorithms are also outlined. Also the main
characteristics of the 4 processor MIMD NEPTUNE system are presented,
and performance indicators, i.e. the speed-up and the efficiency factors
are defined for the measurement of parallelism in a given system.
Both numerical and non-numerical algorithms are covered in the
thesis. In the numerical solution of partial differential equations,
a new parallel 9-point block iterative method is developed. Here, the
organization of the blocks is done in such a way that each process
contains its own group of 9 points on the network, therefore, they can
be run in parallel. The parallel implementation of both 9-point and 4-
point block iterative methods were programmed using natural and redblack
ordering with synchronous and asynchronous approaches. The
results obtained for these different implementations were compared and
analysed.
Next the parallel version of the A.G.E. (Alternating Group Explicit)
method is developed in which the explicit nature of the difference
equation is revealed and exploited when applied to derive the solution
of both linear and non-linear 2-point boundary value problems. Two
strategies have been used in the implementation of the parallel A.G.E.
method using the synchronous and asynchronous approaches. The results
from these implementations were compared. Also for comparison reasons
the results obtained from the parallel A.G.E. were compared with the ~
corresponding results obtained from the parallel versions of the Jacobi,
Gauss-Seidel and S.O.R. methods. Finally, a computational complexity
analysis of the parallel A.G.E. algorithms is included.
In the area of non-numeric algorithms, the problems of sorting and
searching were studied. The sorting methods which were investigated
was the shell and the digit sort methods. with each method different
parallel strategies and approaches were used and compared to find the
best results which can be obtained on the parallel machine.
In the searching methods, the sequential search algorithm in an
unordered table and the binary search algorithms were investigated and
implemented in parallel with a presentation of the results. Finally,
a complexity analysis of these methods is presented.
The thesis concludes with a chapter summarizing the main results
The connection machine
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1988.Bibliography: leaves 134-157.by William Daniel Hillis.Ph.D
Future Computer Requirements for Computational Aerodynamics
Recent advances in computational aerodynamics are discussed as well as motivations for and potential benefits of a National Aerodynamic Simulation Facility having the capability to solve fluid dynamic equations at speeds two to three orders of magnitude faster than presently possible with general computers. Two contracted efforts to define processor architectures for such a facility are summarized