2,190 research outputs found

    A bibliography on parallel and vector numerical algorithms

    Get PDF
    This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are listed also

    Solution of partial differential equations on vector and parallel computers

    Get PDF
    The present status of numerical methods for partial differential equations on vector and parallel computers was reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed

    Density Functional Theory calculation on many-cores hybrid CPU-GPU architectures

    Get PDF
    The implementation of a full electronic structure calculation code on a hybrid parallel architecture with Graphic Processing Units (GPU) is presented. The code which is on the basis of our implementation is a GNU-GPL code based on Daubechies wavelets. It shows very good performances, systematic convergence properties and an excellent efficiency on parallel computers. Our GPU-based acceleration fully preserves all these properties. In particular, the code is able to run on many cores which may or may not have a GPU associated. It is thus able to run on parallel and massive parallel hybrid environment, also with a non-homogeneous ratio CPU/GPU. With double precision calculations, we may achieve considerable speedup, between a factor of 20 for some operations and a factor of 6 for the whole DFT code.Comment: 14 pages, 8 figure

    Partitioning strategy for efficient nonlinear finite element dynamic analysis on multiprocessor computers

    Get PDF
    A computational procedure is presented for the nonlinear dynamic analysis of unsymmetric structures on vector multiprocessor systems. The procedure is based on a novel hierarchical partitioning strategy in which the response of the unsymmetric and antisymmetric response vectors (modes), each obtained by using only a fraction of the degrees of freedom of the original finite element model. The three key elements of the procedure which result in high degree of concurrency throughout the solution process are: (1) mixed (or primitive variable) formulation with independent shape functions for the different fields; (2) operator splitting or restructuring of the discrete equations at each time step to delineate the symmetric and antisymmetric vectors constituting the response; and (3) two level iterative process for generating the response of the structure. An assessment is made of the effectiveness of the procedure on the CRAY X-MP/4 computers

    General method of synthesis by PLIC/FPGA digital devices to perform discrete orthogonal transformations

    Get PDF
    A general method is proposed to synthesize digital devices in order to perform discrete orthogonal transformations (DOT) on programmable logic integrated circuits (PLIC) of FPGA class. The basic and the most "slow" operation during DOT performance is the operation of multiplying by a constant factor (constant) - OMC. To perform DOT digital devices are implemented at the use of the same type of IP-cores, which allow to realize OMC. According to the proposed method, OMC is determined on the basis of picturing set over the elements of the Galois field. Due to the distributed computing of nonlinear polynomial function systems defined over the Galois field in PLIC/FPGA architecture, the reduction in the estimates of time complexity concerning OMC performance is achieved. Each non-linear polynomial function, like OMC, is realized on the basis of the same type of IP-cores according to one of the structural schemes in accordance with the requirements for the device to perform DOT. The use of IP cores significantly reduces the cost of designing a device that implements DOT in the PLIC/FPGA architecture.Keywords: digital signal processing, discrete orthogonal transformations, distributed computing, nonlinear polynomial functions, Galois fields, FPGAs, digital device

    Solving large sparse eigenvalue problems on supercomputers

    Get PDF
    An important problem in scientific computing consists in finding a few eigenvalues and corresponding eigenvectors of a very large and sparse matrix. The most popular methods to solve these problems are based on projection techniques on appropriate subspaces. The main attraction of these methods is that they only require the use of the matrix in the form of matrix by vector multiplications. The implementations on supercomputers of two such methods for symmetric matrices, namely Lanczos' method and Davidson's method are compared. Since one of the most important operations in these two methods is the multiplication of vectors by the sparse matrix, methods of performing this operation efficiently are discussed. The advantages and the disadvantages of each method are compared and implementation aspects are discussed. Numerical experiments on a one processor CRAY 2 and CRAY X-MP are reported. Possible parallel implementations are also discussed
    corecore