9 research outputs found
Recommended from our members
The Connected Component Algorithm on The NON-VON Supercomputer
The NON-VON Supercomputer is a highly parallel tree-structured computer that is being Implemented at Columbia University. In this paper, we demonstrate that tree architectures with their favorable characteristics for VLSI Implementation, and fast global broadcast, lend themselves easily and naturally to the representation and manipulation of Images represented by hierarchical data structures A description of NON-VON architecture IS presented With an emphasis on the special architectural features that will be used m our Image understanding algorithms. We adopt a variation of the quad tree data structure, called the binary Image tree, to represent images in the NON-VON tree We show how Images are loaded in the NON-VON tree, and present the algorithm for budding the binary Image trees. An efficient Implementation of the connected component labeling algorithm on NON-VON is then presented Simulation results are discussed, and we show the fast execution time of the algorithm on NON-VON. Other algorithms are also developed, such as hlstogrammlng, Hough transform, Set operations and Image correlation, and we can conclude that NON-VON can be used to Implement efficiently several :important Image understanding task
Probabilistic structural mechanics research for parallel processing computers
Aerospace structures and spacecraft are a complex assemblage of structural components that are subjected to a variety of complex, cyclic, and transient loading conditions. Significant modeling uncertainties are present in these structures, in addition to the inherent randomness of material properties and loads. To properly account for these uncertainties in evaluating and assessing the reliability of these components and structures, probabilistic structural mechanics (PSM) procedures must be used. Much research has focused on basic theory development and the development of approximate analytic solution methods in random vibrations and structural reliability. Practical application of PSM methods was hampered by their computationally intense nature. Solution of PSM problems requires repeated analyses of structures that are often large, and exhibit nonlinear and/or dynamic response behavior. These methods are all inherently parallel and ideally suited to implementation on parallel processing computers. New hardware architectures and innovative control software and solution methodologies are needed to make solution of large scale PSM problems practical
The Honeycomb Architecture: Prototype Analysis and Design
Due to the inherent potential of parallel processing, a lot of attention has focused on massively parallel computer architecture. To a large extent, the performance of a massively parallel architecture is a function of the flexibility of its communication network. The ability to configure the topology of the machine determines the ease with which problems are mapped onto the architecture. If the machine is sufficiently flexible, the architecture can be configured to match the natural structure of a wide range of problems. There are essentially four unique types of massively parallel architectures: 1. Cellular Arrays 2. Lattice Architectures [21, 30] 3. Connection Architectures [19] 4. Honeycomb Architectures [24] All four architectures are classified as SIMD. Each, however, offers a slightly different solution to the mapping problem. The first three approaches are characterized by easily distinguishable processor, communication, and memory components. In contrast, the Honeycomb architecture contains multipurpose processing/communication/memory cells. Each cell can function as either a simple CPU, a memory cell, or an element of a communication bus. The conventional approach to massive parallelism is the cellular array. It typically consists of an array of processing elements arranged in a mesh pattern with hard wired connections between neighboring processors. Due to their fixed topology, cellular arrays impose severe limitations upon interprocessor communication. The lattice architecture is a somewhat more flexible approach to massive parallelism. It consists of a lattice of processing elements embedded in an array of simple switching elements. The switching elements form a programmable interconnection network. A lattice architecture can be configured in a number of different topologies, but it is still only a partial solution to the mapping problem. The connection architecture offers a comprehensive solution to the mapping problem. It consists of a cellular array integrated into a packet-switched communication network. The network provides transparent communication between all processing elements. Note that the communication network is physically abstracted from the processor array, allowing the processors to evolve independently of the network. The Honeycomb architecture offers a unique solution to the mapping problem. It consists of an array of identical processing/communication/memory cells. Each cell can function as either a processor cell, a communication cell, or a memory cell. Collections of Honeycomb cells can be grouped into multicell CPUs, multi-cell memories, or multi-cell CPU-memory systems. Multi-cell CPU-memory systems are hereafter referred to as processing clusters. The topology of the Honeycomb is determined at compilation time. During a preprocessing phase, the Honeycomb is adjusted to the desired topology. The Honeycomb cell is extremely simple, capable of only simple arithmetic and logic operations. The simplicity of the Honeycomb cell is the key to the Honeycomb concept. As indicated in [24], there are two main research avenues to pursue in furthering the Honeycomb concept: 1. Analyzing the design of a uniform Honeycomb cell 2. Mapping algorithms onto the Honeycomb architecture This technical report concentrates on the first issue. While alluded to throughout the report, the second issue is not addressed in any detail
Vision algorithms for hypercube machines
Several commercial hypercube parallel processors with the potential to deliver massive parallelism cost-effectively have been announced recently. They open the door to a wide variety of application areas that could benefit from parallelism. Computer vision is one of these application areas. This paper develops a general model for hypercube machines, and uses it to show how vision algorithms can be executed on hypercubes. In particular, the steps in the problem of thick-film inspection are used as a concrete example. The time needed to complete a typical inspection is used to demonstrate the performance of hypercube machines. Experimental results from a hypercube machine illustrate the potential use of such machines.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/26820/1/0000379.pd
Recommended from our members
On the Application of Massively Parallel SIMD Tree Machines to Certain Intermediate-Level Vision Tasks
In this paper, we examine the implementation of two middle-level image understanding tasks on fine-grained tree-structured SIMD machines, which have highly efficient VLSI implementations. We first present one such massively parallel machine called NON-VON, and summarize the cost/performance trade-offs of such machines for vision taks. We follow with a more detailed description of the NON-VON architecture (a prototype of which has been operational since January 1985), and of the high-level parallel language in which our algorithms have been written and simulated. The heart of the paper consists of the description and analysis of algorithms for a representative Hough transform, and of an algorithm for the interpretation of moving light displays. Novel algorithmic techniques are motivated and described, and simulation timings are presented and discussed. We conclude that it is possible to exploit the available massive parallelism while avoiding many of the communication bottlenecks common at this level of image understanding, by carefully and inexpensively duplicating data and/or control information, and by delaying or avoiding the reporting of intermediate results
Parallel Architectures and Parallel Algorithms for Integrated Vision Systems
Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform for a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems
On the synthesis and processing of high quality audio signals by parallel computers
This work concerns the application of new computer architectures to the creation and manipulation of high-quality audio bandwidth signals. The configuration of both the hardware and software in such systems falls under consideration in the three major sections which present increasing levels of algorithmic concurrency. In the first section, the programs which are described are distributed in identical copies across an array of processing elements; these programs run autonomously, generating data independently, but with control parameters peculiar to each copy: this type of concurrency is referred to as isonomic}The central section presents a structure which distributes tasks across an arbitrary network of processors; the flow of control in such a program is quasi- indeterminate, and controlled on a demand basis by the rate of completion of the slave tasks and their irregular interaction with the master. Whilst that interaction is, in principle, deterministic, it is also data-dependent; the dynamic nature of task allocation demands that no a priori knowledge of the rate of task completion be required. This type of concurrency is called dianomic? Finally, an architecture is described which will support a very high level of algorithmic concurrency. The programs which make efficient use of such a machine are designed not by considering flow of control, but by considering flow of data. Each atomic algorithmic unit is made as simple as possible, which results in the extensive distribution of a program over very many processing elements. Programs designed by considering only the optimum data exchange routes are said to exhibit systolic^ concurrency. Often neglected in the study of system design are those provisions necessary for practical implementations. It was intended to provide users with useful application programs in fulfilment of this study; the target group is electroacoustic composers, who use digital signal processing techniques in the context of musical composition. Some of the algorithms in use in this field are highly complex, often requiring a quantity of processing for each sample which exceeds that currently available even from very powerful computers. Consequently, applications tend to operate not in 'real-time' (where the output of a system responds to its input apparently instantaneously), but by the manipulation of sounds recorded digitally on a mass storage device. The first two sections adopt existing, public-domain software, and seek to increase its speed of execution significantly by parallel techniques, with the minimum compromise of functionality and ease of use. Those chosen are the general- purpose direct synthesis program CSOUND, from M.I.T., and a stand-alone phase vocoder system from the C.D.P..(^4) In each case, the desired aim is achieved: to increase speed of execution by two orders of magnitude over the systems currently in use by composers. This requires substantial restructuring of the programs, and careful consideration of the best computer architectures on which they are to run concurrently. The third section examines the rationale behind the use of computers in music, and begins with the implementation of a sophisticated electronic musical instrument capable of a degree of expression at least equal to its acoustic counterparts. It seems that the flexible control of such an instrument demands a greater computing resource than the sound synthesis part. A machine has been constructed with the intention of enabling the 'gestural capture' of performance information in real-time; the structure of this computer, which has one hundred and sixty high-performance microprocessors running in parallel, is expounded; and the systolic programming techniques required to take advantage of such an array are illustrated in the Occam programming language
Superquadric Description on Large Arrays of Bit-serial Processors
This study describes the parallel implementation of a new computer vision technique, superquadric description. The use of superquadric primitives to extend the power of Constructive Solid Geometry for Computer Aided Design purposes was first proposed in [BARR 84]. The application of this technique for machine vision purposes was first published by Alex Pentland in [PENTL 86b]. This study developed a parallel least-squares solution technique to solve a slightly modified form of the regression equations originally derived in [PENTL 86b]. This technique is intended for execution on large arrays of bit-serial processors. Several ways have been suggested to interconnect the processing elements in such arrays, therefore the performance of this technique was estimated for three interconnection networks.Electrical and Computer Engineerin
An integrated associative processing system
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994.Includes bibliographical references (p. 97-105).by Frederick Paul Herrmann.Ph.D