657 research outputs found

A bibliography on parallel and vector numerical algorithms

Author: Ortega J. M.
Voigt R. G.

This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.

NASA Technical Reports Server

A Jacobi-based algorithm for computing symmetric eigenvalues and eigenvectors in a two-dimensional mesh

Author: González Colás Antonio María
Royo Vallés María Dolores
Valero García Miguel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1998

The paper proposes an algorithm for computing symmetric eigenvalues and eigenvectors that uses a one-sided Jacobi approach and is targeted to a multicomputer in which nodes can be arranged as a two-dimensional mesh with an arbitrary number of rows and columns. The algorithm is analysed through simple analytical models of execution time, which show that an adequate choice of the mesh configuration (number of rows and columns) can improve performance significantly, with respect to a one-dimensional configuration, which is the most frequently considered scenario in current proposals. This improvement is especially noticeable in large systems. Peer reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC
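
As a rough illustration of the one-sided Jacobi idea named in the abstract above, here is a minimal serial sketch in pure Python. It is not the paper's two-dimensional mesh algorithm: it shows only the generic Hestenes-style column rotations, and it assumes a symmetric input whose eigenvalues have distinct absolute values. The function name `one_sided_jacobi_eigh` and its parameters are hypothetical.

```python
import math


def one_sided_jacobi_eigh(a, sweeps=50, tol=1e-12):
    """Eigen-decomposition of a symmetric matrix by one-sided Jacobi rotations.

    Assumption (not from the source): the eigenvalues of `a` have distinct
    absolute values, so orthogonalizing the columns of W = A V yields
    eigenvectors of A itself.
    """
    n = len(a)
    w = [list(map(float, row)) for row in a]  # W = A V, starts as A
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(w[i][p] ** 2 for i in range(n))
                beta = sum(w[i][q] ** 2 for i in range(n))
                gamma = sum(w[i][p] * w[i][q] for i in range(n))
                # Skip pairs of columns that are already orthogonal.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same plane rotation to columns p and q of W and V.
                for m in (w, v):
                    for i in range(n):
                        mp, mq = m[i][p], m[i][q]
                        m[i][p] = c * mp - s * mq
                        m[i][q] = s * mp + c * mq
        if not rotated:
            break
    # Column j of W equals A v_j, so v_j . w_j is the Rayleigh quotient.
    eigenvalues = [sum(v[i][j] * w[i][j] for i in range(n)) for j in range(n)]
    return eigenvalues, v
```

Each rotation touches only two columns, which is the property that makes the one-sided variant attractive for distributing column pairs across processors; how those pairs are mapped to a mesh is the subject of the paper and is not shown here.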

Activities of the Institute for Computer Applications in Science and Engineering


Research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, numerical analysis, and computer science during the period April 1, 1985 through October 2, 1985 is summarized.

NASA Technical Reports Server

Adaptive BDDC in Three Dimensions

Author: Bedřich Sousedík
Jakub Šístek
Jan Mandel
Publication venue: 'Elsevier BV'
Publication date: 28/02/2011

The adaptive BDDC method is extended to the selection of face constraints in three dimensions. A new implementation of the BDDC method is presented based on a global formulation without an explicit coarse problem, with massive parallelism provided by a multifrontal solver. Constraints are implemented by a projection, and sparsity of the projected operator is preserved by a generalized change of variables. The effectiveness of the method is illustrated on several engineering problems. Comment: 28 pages, 9 figures, 9 tables.

arXiv.org e-Print Archive

Crossref

Hypercube algorithms on mesh connected multicomputers

Author: Díaz de Cerio Ripalda Luis Manuel
González Colás Antonio María
Valero García Miguel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2002

A new methodology named CALMANT (CC-cube Algorithms on Meshes and Tori) is proposed for mapping a type of algorithm that we call a CC-cube algorithm onto multicomputers with hypercube, mesh, or torus interconnection topology. This methodology is suitable when the initial problem can be expressed as a set of processes that communicate through a hypercube topology (a CC-cube algorithm). Many important algorithms fit the CC-cube type. CALMANT is based on three techniques: (a) the standard embedding, which assigns the processes of the algorithm to the nodes of the mesh multicomputer; (b) the communication pipelining technique, which increases the level of communication parallelism inherent in CC-cube algorithms; and (c) optimal message-scheduling algorithms, proposed in this work, which avoid conflicts and thereby minimize communication time. Although CALMANT is proposed for multicomputers with different interconnection network topologies, the paper focuses only on the particular case of meshes. Peer reviewed. Postprint (published version).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Statistical Numerical Methods for Eigenvalue Problem. Parallel Implementation

Author: Dimov Ivan
Karaivanova Aneta
Publication venue: Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Publication date: 01/01/2000

MSC subject classification: 65C05, 65U05. The problem of evaluating the smallest eigenvalue of real symmetric matrices using statistical numerical methods is considered.

Bulgarian Digital Mathematics Library at IMI-BAS

Summary of research in applied mathematics, numerical analysis, and computer sciences


The major categories of current ICASE research programs addressed include: numerical methods, with particular emphasis on the development and analysis of basic numerical algorithms; control and parameter identification problems, with emphasis on effective numerical methods; computational problems in engineering and physical sciences, particularly fluid dynamics, acoustics, and structural analysis; and computer systems and software, especially vector and parallel computers.

NASA Technical Reports Server

A transient FETI methodology for large-scale parallel implicit computations in structural mechanics, part 2

Author: Crivelli Luis
Farhat Charbel

Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low-frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet and perhaps will never be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing, because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient than explicit codes when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.

NASA Technical Reports Server

On-chip implementation of multiprocessor networks and switch fabrics

Author: De Micheli Giovanni
Ye Terry Tao
Publication venue: 'Inderscience Publishers'
Publication date: 02/01/2009

On-chip implementation of multiprocessor systems requires planarising the interconnect networks onto the silicon floorplan. Compared with traditional ASIC/SoC architectures, Multiprocessor Systems on Chips (MPSoC) node processors are homogeneous, and MPSoC network topologies are regular. Therefore, traditional ASIC floorplanning methodologies that perform macro placement are not suitable for MPSoC designs. We propose an automated MPSoC physical planning methodology, REGULAY. REGULAY can generate an optimal floorplan for different topologies under different design constraints. Compared with traditional floorplanning approaches, REGULAY shows significant advantages in reducing the total interconnect wirelength while preserving the regularity and hierarchy of the network topology.

Infoscience - École polytechnique fédérale de Lausanne