475 research outputs found

    Highly parallel computation

    Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines, and current research focuses on which architectures work best. Those designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed.

    Architecture independent environment for developing engineering software on MIMD computers

    Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than serial computers. Developing large scale software for a variety of MIMD computers is difficult and expensive. There is a need to provide tools that facilitate programming these machines. First, the issues that must be considered to develop those tools are examined. The two main areas of concern were architecture independence and data management. Architecture independent software facilitates software portability and improves the longevity and utility of the software product. It provides some form of insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems. It must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture independent software; identifying and exploiting concurrency within the application program; data coherence; engineering data base and memory management

    Uintah parallelism infrastructure: a performance evaluation on the SGI origin 2000

    Uintah is a component-based visual problem solving environment (PSE) designed specifically to address the unique problems inherent in running massively parallel scientific computations on terascale computing platforms. In particular, development of the Uintah system is part of the C-SAFE [2] effort to study the interactions between hydrocarbon fires, structures and high-energy materials (explosives and propellants). In this paper we describe methods for generating meaningful performance measurements for the Uintah PSE running on the SGI Origin 2000 multiprocessor architecture (these methods are applicable to many other applications). These techniques include using the non-intrusive performance counters built into the R10k and R12k processors, controlling process placement, controlling memory layout, and using a task graph approach to specify and solve the problem.

    Computational experiments with an asynchronous parallel branch and bound algorithm

    In this paper we present an asynchronous branch and bound algorithm for execution on an MIMD system, state sufficient conditions to prevent the parallelism from degrading the performance of this algorithm, and investigate the consequences of having the algorithm executed by nonhomogeneous processing elements. We introduce the notions of perfect parallel time and achieved efficiency to empirically measure the effects of parallelism, because the traditional notions of speedup and processor utilization are not adequate for fully characterizing the actual execution of an asynchronous parallel branch and bound algorithm. Finally we present some computational results obtained for the symmetric traveling salesman problem

    Parallel branch and bound on an MIMD system

    In this paper we give a classification of parallel branch and bound algorithms and develop a class of asynchronous branch and bound algorithms for execution on an MIMD system. We develop sufficient conditions to prevent the anomalies that can occur due to the parallelism, the asynchronicity or the nondeterminism, from degrading the performance of the algorithm. Such conditions were known already for the synchronous case. It turns out that these conditions are sufficient for asynchronous algorithms as well. We also investigate the consequences of nonhomogeneous processing elements in a parallel computer system. We introduce the notions of perfect parallel time and achieved efficiency to empirically measure the effects of parallelism, because the traditional notions of speedup and efficiency are not capable of fully characterizing the actual execution of an asynchronous parallel algorithm. Finally we present some computational results obtained for the symmetric traveling salesman problem.

    Detection of dependence patterns with delay

    The Unitary Events (UE) method is a popular and efficient method used over the last decade to detect dependence patterns of joint spike activity among simultaneously recorded neurons. The first method introduced is based on a binned coincidence count \citep{Grun1996} and can be applied to two or more simultaneously recorded neurons. Among the improvements of the method, a transposition to the continuous framework has recently been proposed in \citep{muino2014frequent} and fully investigated in \citep{MTGAUE} for two neurons. The goal of the present paper is to extend this study to more than two neurons. The main result is the determination of the limit distribution of the coincidence count. This leads to the construction of an independence test between L ≥ 2 neurons. Finally we propose a multiple test procedure via a Benjamini and Hochberg approach \citep{Benjamini1995}. All the theoretical results are illustrated by a simulation study, and compared to the UE method proposed in \citep{Grun2002}. Furthermore, our method is applied to real data.
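    The multiple test step mentioned in this abstract can be illustrated with the standard Benjamini-Hochberg step-up procedure. The sketch below is a generic version, not the paper's own procedure; the function name `benjamini_hochberg` and the default level `alpha = 0.05` are assumptions made for the example.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return one reject flag per hypothesis (hypothetical helper, standard BH step-up)."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject every hypothesis whose p-value is among the k_max smallest.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject
```

    Because the procedure is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value passes: with p-values 0.02, 0.024 and 0.9 at level 0.05, the first two are both rejected.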

    A TIME-AND-SPACE PARALLELIZED ALGORITHM FOR THE CABLE EQUATION

    Electrical propagation in excitable tissue, such as nerve fibers and heart muscle, is described by a nonlinear diffusion-reaction parabolic partial differential equation for the transmembrane voltage V(x,t), known as the cable equation. This equation involves a highly nonlinear source term, representing the total ionic current across the membrane, governed by a Hodgkin-Huxley type ionic model, and requires the solution of a system of ordinary differential equations. Thus, the model consists of a PDE (in 1, 2 or 3 dimensions) coupled to a system of ODEs, and it is very expensive to solve, especially in 2 and 3 dimensions. In order to solve this equation numerically, we develop an algorithm, extended from the Parareal algorithm, to efficiently incorporate space-parallelized solvers into the framework of the Parareal algorithm, to achieve time-and-space parallelization. Numerical results and a comparison of the performance of several serial, space-parallelized and time-and-space-parallelized time-stepping numerical schemes in one dimension and in two dimensions are also presented.
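    For orientation, the basic Parareal iteration that this abstract builds on can be sketched for a scalar ODE. This is a minimal, time-only illustration and not the paper's time-and-space algorithm; the names `euler` and `parareal`, the explicit Euler propagators, and the parameter `fine_steps` are all assumptions made for the example. The fine solves inside each iteration are independent across time slices, which is where a real implementation would run in parallel.

```python
from typing import Callable, List

Rhs = Callable[[float, float], float]  # right-hand side f(t, y) of y' = f(t, y)


def euler(f: Rhs, t0: float, t1: float, y: float, steps: int) -> float:
    """Advance y from t0 to t1 with `steps` explicit Euler steps."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f: Rhs, t0: float, t_end: float, y0: float,
             slices: int, iterations: int, fine_steps: int = 100) -> List[float]:
    """Return the Parareal approximation at the slice boundaries (illustrative sketch)."""
    ts = [t0 + n * (t_end - t0) / slices for n in range(slices + 1)]

    def coarse(n: int, y: float) -> float:
        return euler(f, ts[n], ts[n + 1], y, 1)           # cheap propagator G

    def fine(n: int, y: float) -> float:
        return euler(f, ts[n], ts[n + 1], y, fine_steps)  # accurate propagator F

    # Initial guess: one serial sweep with the coarse propagator.
    u = [y0]
    for n in range(slices):
        u.append(coarse(n, u[n]))

    for _ in range(iterations):
        # These solves depend only on the previous iterate, so they could run in parallel.
        f_old = [fine(n, u[n]) for n in range(slices)]
        g_old = [coarse(n, u[n]) for n in range(slices)]
        # Serial correction sweep: U_{n+1} = G(U_n^new) + F(U_n^old) - G(U_n^old).
        new = [y0]
        for n in range(slices):
            new.append(coarse(n, new[n]) + f_old[n] - g_old[n])
        u = new
    return u
```

    After as many iterations as there are time slices, the result agrees with a serial fine solve up to rounding; the method is only worthwhile when far fewer iterations give acceptable accuracy.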

    Efficient GPU implementation of a two waves TVD-WAF method for the two-dimensional one layer Shallow Water system on structured meshes.

    The numerical solutions of shallow water equations are useful for applications related to geophysical flows that usually take place in large computational domains and could require real time calculation. Therefore, parallel versions of accurate and efficient numerical solvers for high performance platforms are needed to be able to deal with these simulation scenarios in reasonable times. In this paper we present an efficient CUDA implementation of first- and second-order HLL methods and a two-waves TVD-WAF one. We propose to write all these methods under a common framework, so that their CUDA implementations share the same structure. In particular, the reformulation of the TVD-WAF numerical flux and the improved definition of the flux limiter allow us to obtain a more robust solver in situations like wet/dry fronts. Finally, some numerical tests are presented showing that the TVD-WAF method is slightly slower than the first-order HLL method and two times faster than the second-order HLL method, but it provides numerical results almost as accurate as the second-order HLL scheme.
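    As a point of reference for the HLL methods named in this abstract, the textbook first-order HLL numerical flux for the one-dimensional shallow water equations can be sketched as follows. This is not the paper's CUDA code or its TVD-WAF flux; the names `physical_flux` and `hll_flux`, the simple wave-speed estimates, and the assumption of strictly positive depth (no wet/dry treatment) are all choices made for the example.

```python
import math
from typing import Tuple

State = Tuple[float, float]  # (h, hu): water depth and discharge

G = 9.81  # gravitational acceleration in m/s^2


def physical_flux(state: State) -> State:
    """Exact flux of the 1D shallow water equations; assumes h > 0."""
    h, hu = state
    u = hu / h
    return (hu, hu * u + 0.5 * G * h * h)


def hll_flux(left: State, right: State) -> State:
    """HLL flux at an interface between two wet cells (illustrative sketch)."""
    h_l, hu_l = left
    h_r, hu_r = right
    u_l, u_r = hu_l / h_l, hu_r / h_r
    c_l, c_r = math.sqrt(G * h_l), math.sqrt(G * h_r)
    # Simple estimates of the slowest and fastest wave speeds.
    s_l = min(u_l - c_l, u_r - c_r)
    s_r = max(u_l + c_l, u_r + c_r)
    f_l, f_r = physical_flux(left), physical_flux(right)
    if s_l >= 0.0:   # all waves travel to the right
        return f_l
    if s_r <= 0.0:   # all waves travel to the left
        return f_r
    # Subcritical case: average over the region between the two waves.
    return tuple(
        (s_r * fl - s_l * fr + s_l * s_r * (qr - ql)) / (s_r - s_l)
        for fl, fr, ql, qr in zip(f_l, f_r, left, right)
    )
```

    For identical left and right states the formula reduces to the physical flux, which is a quick consistency check for any implementation.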