
    A bibliography on parallel and vector numerical algorithms

    This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest in scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.

    A sweep algorithm for massively parallel simulation of circuit-switched networks

    A new massively parallel algorithm is presented for simulating large asymmetric circuit-switched networks controlled by a randomized-routing policy that includes trunk reservation. A single instruction multiple data (SIMD) implementation is described, and corresponding experiments on a 16,384-processor MasPar parallel computer are reported. A multiple instruction multiple data (MIMD) implementation is also described, and corresponding experiments on an Intel iPSC/860 parallel computer, using 16 processors, are reported. By exploiting parallelism, our algorithm increases the possible execution rate of such complex simulations by as much as an order of magnitude.
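    To make the routing policy concrete, the sketch below illustrates a generic trunk-reservation admission rule for randomized alternate routing. It is an illustration under stated assumptions, not the paper's sweep algorithm; the function names and the `reserve` threshold are hypothetical.

```python
# Minimal sketch of trunk-reservation admission in randomized alternate
# routing (illustrative only; not the paper's sweep algorithm).
import random

def admit_direct(free_trunks: int) -> bool:
    # A directly routed call needs only one free trunk on its link.
    return free_trunks >= 1

def admit_alternate(free_trunks: int, reserve: int) -> bool:
    # An overflow call using a two-link alternate route is admitted on a
    # link only if `reserve` trunks would still remain free for future
    # direct traffic -- the essence of trunk reservation.
    return free_trunks > reserve

def route_call(direct_free: int, alt_free: dict, reserve: int = 2) -> str:
    # Try the direct link first; otherwise pick a random tandem node and
    # apply the reservation test on both legs of the alternate route.
    if admit_direct(direct_free):
        return "direct"
    tandem = random.choice(list(alt_free))
    leg_a, leg_b = alt_free[tandem]
    if admit_alternate(leg_a, reserve) and admit_alternate(leg_b, reserve):
        return f"alternate via {tandem}"
    return "blocked"

print(route_call(0, {"T1": (5, 1), "T2": (4, 6)}))
```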

    Scalable parallel communications

    Coarse-grain parallelism in networking (that is, the use of multiple protocol processors running replicated software sending over several physical channels) can be used to provide gigabit communications for a single application. Since parallel network performance is highly dependent on practical issues such as hardware properties (e.g., memory speeds and cache hit rates), operating system overhead (e.g., interrupt handling), and protocol performance (e.g., the effect of timeouts), we have performed detailed simulation studies of both a bus-based multiprocessor workstation node (based on the Sun Galaxy MP multiprocessor) and a distributed-memory parallel computer node (based on the Touchstone DELTA) to evaluate the behavior of coarse-grain parallelism. Our results indicate: (1) coarse-grain parallelism can deliver multiples of 100 Mbps with currently available hardware platforms and existing networking protocols (such as Transmission Control Protocol/Internet Protocol (TCP/IP) and parallel Fiber Distributed Data Interface (FDDI) rings); (2) scale-up is nearly linear in n, the number of protocol processors and channels (for small n and up to a few hundred Mbps); and (3) since these results are based on existing hardware without specialized devices (except perhaps for some simple modifications of the FDDI boards), this is a low-cost solution to providing multiples of 100 Mbps on current machines.

    In addition, from both the performance analysis and the properties of these architectures, we conclude: (1) multiple processors providing identical services, together with space-division multiplexing of the physical channels, can provide better reliability than monolithic approaches (as well as graceful degradation and low-cost load balancing); (2) coarse-grain parallelism supports running several transport protocols in parallel to provide different types of service (for example, one TCP handles small messages for many users while other TCPs running in parallel provide high-bandwidth service to a single application); and (3) coarse-grain parallelism will be able to incorporate many future improvements from related work (e.g., reduced data movement, fast TCP, fine-grain parallelism), also with nearly linear speed-ups.
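    As a back-of-the-envelope illustration of the near-linear scale-up claim, the sketch below stripes a message round-robin across n logical channels and prints the ideal aggregate rate. The chunk size, names, and rates are illustrative assumptions, not details of the simulated architectures.

```python
# Minimal sketch of round-robin striping across n parallel channels,
# the space-division-multiplexing idea behind coarse-grain parallelism.
# Names, chunk size, and rates are illustrative assumptions.
def stripe(message: bytes, n_channels: int, chunk: int = 1500):
    """Split the message into MTU-sized chunks and deal them round-robin
    to n logical channels (e.g., one TCP connection per FDDI ring)."""
    lanes = [[] for _ in range(n_channels)]
    for i in range(0, len(message), chunk):
        lanes[(i // chunk) % n_channels].append(message[i:i + chunk])
    return lanes

# Ideal near-linear scale-up for small n: aggregate ~ n * per-channel rate.
per_channel_mbps = 100
for n in (1, 2, 4, 8):
    print(f"{n} channels -> {n * per_channel_mbps} Mbps (ideal)")
```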

    Probabilistic structural mechanics research for parallel processing computers

    Aerospace structures and spacecraft are a complex assemblage of structural components that are subjected to a variety of complex, cyclic, and transient loading conditions. Significant modeling uncertainties are present in these structures, in addition to the inherent randomness of material properties and loads. To properly account for these uncertainties in evaluating and assessing the reliability of these components and structures, probabilistic structural mechanics (PSM) procedures must be used. Much research has focused on basic theory development and the development of approximate analytic solution methods in random vibrations and structural reliability. Practical application of PSM methods has been hampered by their computationally intensive nature: solving PSM problems requires repeated analyses of structures that are often large and that exhibit nonlinear and/or dynamic response behavior. These methods are inherently parallel and ideally suited to implementation on parallel processing computers. New hardware architectures, innovative control software, and new solution methodologies are needed to make the solution of large-scale PSM problems practical.
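    The "inherently parallel" character of PSM can be seen in Monte Carlo reliability estimation, where each sample is an independent structural analysis. The sketch below uses a toy limit state g = R - S as a stand-in for a real structural model; the distributions and names are illustrative assumptions, not the report's methods.

```python
# Minimal sketch of embarrassingly parallel Monte Carlo reliability
# estimation: each sample is an independent analysis, so samples can be
# farmed out across processors. The limit state here is a toy stand-in.
import random
from multiprocessing import Pool

def limit_state(sample_id: int) -> bool:
    """One independent analysis: draw a random resistance R and load S,
    and report whether the component fails (g = R - S < 0)."""
    rng = random.Random(sample_id)        # reproducible per-sample seed
    resistance = rng.gauss(10.0, 1.0)     # random material strength
    load = rng.gauss(7.0, 2.0)            # random applied load
    return resistance - load < 0.0

if __name__ == "__main__":
    n = 100_000
    with Pool() as pool:                  # each sample runs independently
        failures = sum(pool.map(limit_state, range(n), chunksize=1000))
    print("estimated failure probability:", failures / n)
```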

    Medical image tomography: A statistically tailored neural network approach

    In medical computed tomography (CT), the tomographic images are reconstructed from planar information collected 180° to 360° around the patient. In clinical applications, the reconstructions are typically produced using a filtered backprojection algorithm. Filtered backprojection methods have limitations that introduce a high degree of statistical uncertainty into the reconstructed images. Many techniques have been developed that produce better reconstructions, but they tend to be computationally expensive and thus impractical for clinical use.

    Artificial neural networks (ANNs) have been shown to be adept at learning and then simulating complex functional relationships. For medical tomography, a neural network can be trained to produce a reconstructed medical image given the planar data as input. Once trained, an ANN can produce an accurate reconstruction very quickly.

    A backpropagation ANN with statistically derived activation functions has been developed to improve the trainability and generalization ability of a network so that it produces accurate reconstructions. The tailored activation functions are derived from the estimated probability density functions (p.d.f.s) of the ANN training data set. A set of sigmoid derivative functions is fitted to the p.d.f.s and then integrated to produce the ANN activation functions, which are also estimates of the cumulative distribution functions (c.d.f.s) of the training data. The statistically tailored activation functions and their derivatives are substituted for the logistic function and its derivative that are typically used in backpropagation ANNs.

    A set of geometric images was derived for training an ANN for cardiac SPECT image reconstruction. The planar projections for the geometric images were simulated using the Monte Carlo method to produce sixty-four 64-quadrant planar views taken 180° about each image. A 4096 x 629 x 4096 architecture ANN was simulated on the MasPar MP-2, a massively parallel single-instruction multiple-data (SIMD) computer, and trained on the set of geometric tomographic images. Trained on the geometric images, the ANN was able to generalize the planar-data-to-tomogram input-to-output function and accurately reconstruct actual cardiac SPECT images.
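    The core idea, deriving activation functions from the estimated c.d.f. of the training data, can be sketched as below. This is an illustration of the concept only: the dissertation fits sigmoid-derivative functions to estimated p.d.f.s and integrates them, whereas this toy version interpolates an empirical c.d.f. directly.

```python
# Minimal sketch of a statistically tailored activation: use an estimate
# of the training data's c.d.f. in place of the logistic function.
# Illustrative only; not the dissertation's exact fitting procedure.
import numpy as np

def empirical_cdf_activation(train_values: np.ndarray):
    """Build a monotone activation from sorted training values: it maps
    an input value to its estimated cumulative probability."""
    xs = np.sort(train_values)
    ps = np.linspace(0.0, 1.0, len(xs))
    def act(x):
        # Interpolated empirical c.d.f.; its slope plays the role the
        # logistic derivative plays in standard backpropagation.
        return np.interp(x, xs, ps)
    return act

# Example: data concentrated near zero yields a steep activation there.
data = np.random.default_rng(0).exponential(scale=1.0, size=10_000)
act = empirical_cdf_activation(data)
print(act(np.array([0.1, 1.0, 3.0])))
```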

    Report from the MPP Working Group to the NASA Associate Administrator for Space Science and Applications

    NASA's Office of Space Science and Applications (OSSA) gave a select group of scientists the opportunity to test and implement their computational algorithms on the Massively Parallel Processor (MPP) located at Goddard Space Flight Center, beginning in late 1985. One year later, the Working Group presented its report, which addressed the following: algorithms, programming languages, architecture, programming environments, the relationship to theory, and measured performance. The findings point to a number of demonstrated computational techniques for which the MPP architecture is ideally suited. For example, besides executing much faster on the MPP than on conventional computers, systolic VLSI simulation (where distances are short), lattice simulation, neural network simulation, and image problems were found to be easier to program on the MPP's architecture than on a CYBER 205 or even a VAX. The report also makes technical recommendations covering all aspects of MPP use, as well as recommendations concerning the future of the MPP and machines based on similar architectures, expansion of the Working Group, and study of the role of future parallel processors for the space station, EOS, and the Great Observatories era.

    Modeling of Topologies of Interconnection Networks based on Multidimensional Multiplicity

    Modern systems-on-chip (SoCs) are becoming more complex with the integration of heterogeneous components (IPs). A high-performance interconnection medium is therefore required to handle this complexity. Hence, networks-on-chip (NoCs) come into play, enabling the integration of more IPs into the SoC with increased performance. These NoCs are based on the concept of interconnection networks used to connect parallel machines. In response to the MARTE RFP of the OMG, a notation of multidimensional multiplicity has been proposed which permits the modeling of repetitive structures and topologies. This report presents a modeling methodology based on this notation that can be used to model a family of interconnection networks called Delta networks, which in turn can be used for the construction of NoCs.
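    Delta networks are highly regular, which is exactly the kind of repetitive structure a multidimensional-multiplicity notation is meant to capture: a 2^n-input network is n identical stages of 2x2 switches with shuffle interstage links, and routing is self-routing on the destination address bits. The sketch below traces that textbook construction; it is a generic Delta/omega routing illustration, not the report's MARTE models.

```python
# Minimal sketch of self-routing in a 2^n-input Delta (omega-style)
# network of 2x2 switches: stage k consumes bit k of the destination.
def delta_route(src: int, dst: int, n_bits: int):
    """Trace the link label after each of the n shuffle-exchange stages
    on the way from input `src` to output `dst`."""
    node = src
    path = [node]
    for stage in range(n_bits):
        bit = (dst >> (n_bits - 1 - stage)) & 1   # next destination bit
        # Perfect shuffle (rotate left), then set the low-order bit
        # according to the switch output chosen by the address bit.
        node = ((node << 1) | bit) & ((1 << n_bits) - 1)
        path.append(node)
    return path

# 8-input (n=3) example: route from input 5 to output 2.
print(delta_route(5, 2, 3))   # ends at 2, the destination
```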

    Four-quadrant one-transistor-synapse for high-density CNN implementations

    Presents a linear, four-quadrant, electrically programmable, one-transistor synapse strategy applicable to the implementation of general massively parallel analog processors in CMOS technology. It is especially suited to translationally invariant processing arrays with local connectivity, and it results in a significant reduction in the area occupation and power dissipation of the basic processing units. This allows higher integration densities and therefore permits the integration of larger arrays on a single chip.

    Comisión Interministerial de Ciencia y Tecnología TIC96-1392-C02-0
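    Behaviorally, a four-quadrant synapse in a cellular neural network (CNN) multiplies a signed weight by a signed signal and sums the products over a local neighborhood. The sketch below is a software model of the standard CNN state equation with translationally invariant 3x3 templates, offered only to make the computation concrete; it does not model the transistor-level circuit, and the template values are illustrative.

```python
# Minimal behavioral sketch (not the circuit) of what four-quadrant
# synapses compute in a cellular neural network: signed weight * signed
# signal, summed over a local 3x3 neighborhood for every cell at once.
import numpy as np

def cnn_step(state, inp, A, B, bias, dt=0.1):
    """One Euler step of the CNN state equation with translationally
    invariant 3x3 feedback (A) and control (B) templates."""
    y = np.clip(state, -1.0, 1.0)         # piecewise-linear cell output
    pad = lambda z: np.pad(z, 1)          # zero boundary conditions
    fb = sum(A[i, j] * pad(y)[i:i + y.shape[0], j:j + y.shape[1]]
             for i in range(3) for j in range(3))
    ct = sum(B[i, j] * pad(inp)[i:i + inp.shape[0], j:j + inp.shape[1]]
             for i in range(3) for j in range(3))
    return state + dt * (-state + fb + ct + bias)

# Example with illustrative templates on an 8x8 array.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, (8, 8))            # initial state
u = rng.uniform(-1, 1, (8, 8))            # input image
A = np.zeros((3, 3)); A[1, 1] = 2.0       # self-feedback template
B = np.full((3, 3), 0.1)                  # averaging control template
x = cnn_step(x, u, A, B, bias=0.0)
```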