2,269 research outputs found

Problems related to the integration of fault tolerant aircraft electronic systems

Author: Adlakha V.
Alspaugh T. A., Jr.
Bannister J. A.
Trivedi K.

Problems related to the design of the hardware for an integrated aircraft electronic system are considered. Taxonomies of concurrent systems are reviewed and a new taxonomy is proposed. An informal methodology intended to identify feasible regions of the taxonomic design space is described. Specific tools are recommended for use in the methodology. Based on the methodology, a preliminary strawman integrated fault tolerant aircraft electronic system is proposed. Next, problems related to the programming and control of integrated aircraft electronic systems are discussed. Issues of system resource management, including the scheduling and allocation of real time periodic tasks in a multiprocessor environment, are treated in detail. The role of software design in integrated fault tolerant aircraft electronic systems is discussed. Conclusions and recommendations for further work are included.

NASA Technical Reports Server

A bibliography on parallel and vector numerical algorithms

Author: Ortega J. M.
Voigt R. G.

This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.

NASA Technical Reports Server

Solution of partial differential equations on vector and parallel computers

Author: Ortega J. M.
Voigt R. G.

The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.

NASA Technical Reports Server

Automated problem scheduling and reduction of synchronization delay effects

Author: Saltz Joel H.

It is anticipated that in order to make effective use of many future high performance architectures, programs will have to exhibit at least a medium grained parallelism. A framework is presented for partitioning very sparse triangular systems of linear equations that is designed to produce favorable performance results in a wide variety of parallel architectures. Efficient methods for solving these systems are of interest because (1) they provide a useful model problem for use in exploring heuristics for the aggregation, mapping and scheduling of relatively fine grained computations whose data dependencies are specified by directed acyclic graphs, and (2) such efficient methods can find direct application in the development of parallel algorithms for scientific computation. Simple expressions are derived that describe how to schedule computational work with varying degrees of granularity. The Encore Multimax was used as a hardware simulator to investigate the performance effects of using the partitioning techniques presented in shared memory architectures with varying relative synchronization costs.

NASA Technical Reports Server
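
The abstract above describes scheduling a sparse triangular solve whose data dependencies form a directed acyclic graph. One common way to expose that parallelism is wavefront (level) scheduling, sketched below. This is an illustration only, not necessarily the partitioning framework of the report; the row-dictionary storage format and the function names `level_schedule` and `solve_lower` are assumptions made for the example.

```python
from collections import defaultdict


def level_schedule(rows):
    """Group the rows of a sparse lower triangular matrix into wavefronts.

    rows[i] maps column index j (j <= i) to the nonzero L[i][j]
    (hypothetical storage format chosen for this sketch).
    Row i depends on every row j < i with L[i][j] != 0, so its level
    is one more than the deepest level among those rows.
    """
    level = [0] * len(rows)
    for i, row in enumerate(rows):
        deps = [j for j in row if j < i]
        level[i] = 1 + max((level[j] for j in deps), default=-1)
    fronts = defaultdict(list)
    for i, lv in enumerate(level):
        fronts[lv].append(i)
    return [fronts[lv] for lv in sorted(fronts)]


def solve_lower(rows, b):
    """Forward substitution, processed one wavefront at a time."""
    x = [0.0] * len(rows)
    for front in level_schedule(rows):
        # Rows inside one wavefront do not depend on each other, so a
        # parallel implementation could hand them to different workers
        # and synchronize once per wavefront. Here they run in sequence.
        for i in front:
            s = sum(v * x[j] for j, v in rows[i].items() if j < i)
            x[i] = (b[i] - s) / rows[i][i]
    return x
```

Coarser granularity, in the sense of the abstract, would correspond to merging several rows or wavefronts into one unit of work so that fewer synchronization points are needed.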

Solving the Ghost-Gluon System of Yang-Mills Theory on GPUs

Author: Markus Hopfer
Reinhard Alkofer
Gundolf Haase
Publication venue: 'Elsevier BV'
Publication date: 18/12/2012

We solve the ghost-gluon system of Yang-Mills theory using Graphics Processing Units (GPUs). Working in Landau gauge, we use the Dyson-Schwinger formalism for the mathematical description as this approach is well-suited to directly benefit from the computing power of the GPUs. With the help of a Chebyshev expansion for the dressing functions and a subsequent application of a Newton-Raphson method, the non-linear system of coupled integral equations is linearized. The resulting Newton matrix is generated in parallel using OpenMPI and CUDA(TM). Our results show that it is possible to cut down the run time by two orders of magnitude as compared to a sequential version of the code. This makes the proposed techniques well-suited for Dyson-Schwinger calculations on more complicated systems where the Yang-Mills sector of QCD serves as a starting point. In addition, the computation of Schwinger functions using GPU devices is studied.

Comment: 19 pages, 7 figures, additional figure added, dependence on block-size is investigated in more detail, version accepted by CP

arXiv.org e-Print Archive

Crossref
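
The combination named in the abstract above, a Chebyshev expansion of the unknown function followed by Newton-Raphson iteration on the coefficients, can be tried on a much smaller problem. The sketch below is not the ghost-gluon system: the toy equation f(x) = 1 + LAM * x * ∫ f(y)² / sqrt(1 - y²) dy on [-1, 1], the constant `LAM`, and all function names are assumptions chosen so that the exact answer is known (f(x) = 1 + (4 - sqrt(14)) x). Each entry of the Newton matrix `J` can be computed independently of the others, which is what makes this kind of assembly a natural fit for parallel hardware.

```python
import math

LAM = 1.0 / (4.0 * math.pi)  # coupling of the toy equation (assumption)


def cheb_row(x, n):
    """Values T_0(x), ..., T_{n-1}(x) from the three-term recurrence."""
    t = [1.0, x]
    for _ in range(2, n):
        t.append(2.0 * x * t[-1] - t[-2])
    return t[:n]


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting (inputs are not modified)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def newton_chebyshev(n=6, iters=30, tol=1e-13):
    """Solve the toy equation for the Chebyshev coefficients of f."""
    # Collocation points and quadrature points are Chebyshev-Gauss nodes.
    nodes = [math.cos((2 * i + 1) * math.pi / (2 * n)) for i in range(n)]
    m = 2 * n
    qnodes = [math.cos((2 * q + 1) * math.pi / (2 * m)) for q in range(m)]
    w = math.pi / m  # equal Chebyshev-Gauss weights
    T = [cheb_row(x, n) for x in nodes]
    Tq = [cheb_row(y, n) for y in qnodes]
    c = [1.0] + [0.0] * (n - 1)  # initial guess f = 1
    for _ in range(iters):
        fq = [sum(ck * t for ck, t in zip(c, row)) for row in Tq]
        integral = w * sum(f * f for f in fq)
        # Derivative of the integral with respect to each coefficient c_k.
        grad = [2.0 * w * sum(f * row[k] for f, row in zip(fq, Tq))
                for k in range(n)]
        # Residual of the equation at each collocation point.
        F = [sum(ck * t for ck, t in zip(c, T[i])) - 1.0
             - LAM * nodes[i] * integral for i in range(n)]
        # Newton matrix: every entry is independent of the others.
        J = [[T[i][k] - LAM * nodes[i] * grad[k] for k in range(n)]
             for i in range(n)]
        dc = solve_linear(J, [-v for v in F])
        c = [ck + d for ck, d in zip(c, dc)]
        if max(abs(d) for d in dc) < tol:
            break
    return c
```

Calling `newton_chebyshev()` returns six coefficients; for this toy problem only the first two should be noticeably different from zero.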

A Many-Core Overlay for High-Performance Embedded Computing on FPGAs

Author: Neto Horácio
Véstias Mário
Publication date: 21/08/2014

In this work, we propose a configurable many-core overlay for high-performance embedded computing. The size of internal memory, supported operations and number of ports can be configured independently for each core of the overlay. The overlay was evaluated with matrix multiplication, LU decomposition and Fast-Fourier Transform (FFT) on a ZYNQ-7020 FPGA platform. The results show that using a system-level many-core overlay avoids complex hardware design and still provides good performance results.

Comment: Presented at First International Workshop on FPGAs for Software Programmers (FSP 2014) (arXiv:1408.4423)

arXiv.org e-Print Archive

Repositório Científico do Instituto Politécnico de Lisboa

Accelerating Industrial Applications: The Development of Basic GPU Kernels for the New Block AMG Algorithms for Solving SLE with Explicitly Calculated Sparse Basis

Author: Afanasyev Ilya
Kharchenko Sergey
Potapov Yury
Sobolev Sergey
Publication venue: The Authors. Published by Elsevier B.V.
Publication date: 31/12/2015

Nowadays, GPU computations play a significant role in supercomputing technologies. This work is part of a project on modeling hydro- and aerodynamics, where linear algebra operations are frequently used and take up most of the execution time. Although GPUs are traditionally used for large problems, our project needs to solve many small ones. Because of this, existing library solutions such as cuBLAS (1) and cuSPARSE (2) are not well suited, so our task is to implement more efficient functions for specific linear algebra operations, taking their particular features into account.

Elsevier - Publisher Connector

Design and analysis of numerical algorithms for the solution of linear systems on parallel and distributed architectures

Author: Rosni Abdullah
Publication date: 01/01/1997

The increasing availability of parallel computers is having a very significant impact on all aspects of scientific computation, including algorithm research and software development in numerical linear algebra. In particular, the solution of linear systems, which lies at the heart of most calculations in scientific computing, is an important computation found in many engineering and scientific applications. In this thesis, well-known parallel algorithms for the solution of linear systems are compared with implicit parallel algorithms or the Quadrant Interlocking (QI) class of algorithms to solve linear systems. These implicit algorithms are (2x2) block algorithms expressed in explicit point form notation. [Continues.]

Loughborough University Institutional Repository

Parallel pivoting combined with parallel reduction

Author: Alaghband Gita

Parallel algorithms for triangularization of large, sparse, and unsymmetric matrices are presented. The method combines parallel reduction with a new parallel pivoting technique, control over the generation of fill-in, and a check for numerical stability, all done in parallel with the work being distributed over the active processes. The parallel technique uses the compatibility relation between pivots to identify parallel pivot candidates and uses the Markowitz number of pivots to minimize fill-in. This technique is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds.

NASA Technical Reports Server
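
The two ingredients named in the abstract above, the Markowitz number and a compatibility relation between pivots, can be illustrated on a sparsity pattern alone. The sketch below makes several assumptions that are not taken from the report: the Markowitz number of a nonzero at (i, j) is computed as (r_i - 1)(c_j - 1), where r_i and c_j are the nonzero counts of row i and column j; two pivots (i, j) and (k, l) are treated as compatible when positions (i, l) and (k, j) are both zero; and the greedy selection in `greedy_parallel_pivots` is a simplification that omits the numerical stability check.

```python
def markowitz_counts(pattern):
    """Markowitz number (r_i - 1) * (c_j - 1) for every nonzero position.

    pattern is a set of (row, col) pairs marking the nonzeros
    (hypothetical input format for this sketch).
    """
    row_nnz, col_nnz = {}, {}
    for i, j in pattern:
        row_nnz[i] = row_nnz.get(i, 0) + 1
        col_nnz[j] = col_nnz.get(j, 0) + 1
    return {(i, j): (row_nnz[i] - 1) * (col_nnz[j] - 1) for i, j in pattern}


def compatible(p, q, pattern):
    """Assumed rule: pivots in distinct rows and columns whose cross
    positions are both zero do not touch each other's row or column."""
    (i, j), (k, l) = p, q
    return (i != k and j != l
            and (i, l) not in pattern and (k, j) not in pattern)


def greedy_parallel_pivots(pattern):
    """Pick mutually compatible pivots, lowest Markowitz number first."""
    counts = markowitz_counts(pattern)
    chosen = []
    for cand in sorted(counts, key=lambda p: (counts[p], p)):
        if all(compatible(cand, c, pattern) for c in chosen):
            chosen.append(cand)
    return chosen
```

In a dynamic scheme of the kind the abstract describes, a selection like this would be repeated on the remaining submatrix after each elimination step, since fill-in changes both the counts and the compatibility relation.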