Time-parallel iterative solvers for parabolic evolution equations
We present original time-parallel algorithms for the solution of the implicit
Euler discretization of general linear parabolic evolution equations with
time-dependent self-adjoint spatial operators. Motivated by the inf-sup theory
of parabolic problems, we show that the standard nonsymmetric time-global
system can be equivalently reformulated as an original symmetric saddle-point
system that remains inf-sup stable with respect to the same natural parabolic
norms. We then propose and analyse an efficient and readily implementable
parallel-in-time preconditioner to be used with an inexact Uzawa method. The
proposed preconditioner is non-intrusive and easy to implement in practice, and
also features the key theoretical advantages of robust spectral bounds, leading
to convergence rates that are independent of the number of time-steps, final
time, or spatial mesh sizes, and also a theoretical parallel complexity that
grows only logarithmically with respect to the number of time-steps. Numerical
experiments with large-scale parallel computations show the effectiveness of
the method, along with its good weak and strong scaling properties.
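To make the "standard nonsymmetric time-global system" concrete, the sketch below writes out implicit Euler for an abstract problem u'(t) + A(t)u(t) = f(t) with uniform step size. The symbols u_n, A_n, f_n, \tau, N and the identity I are notation assumed here, not taken from the paper; in a finite element setting a mass matrix would take the place of I.

```latex
% One implicit Euler step, n = 1, ..., N:
\frac{u_n - u_{n-1}}{\tau} + A_n u_n = f_n .

% Stacking all N steps gives the time-global (all-at-once) system:
\begin{pmatrix}
I + \tau A_1 &              &        &              \\
-I           & I + \tau A_2 &        &              \\
             & \ddots       & \ddots &              \\
             &              & -I     & I + \tau A_N
\end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{pmatrix}
=
\begin{pmatrix} \tau f_1 + u_0 \\ \tau f_2 \\ \vdots \\ \tau f_N \end{pmatrix}.
```

Each diagonal block is symmetric when A_n is self-adjoint, but the sub-diagonal coupling to the previous time step makes the stacked matrix block lower bidiagonal and therefore nonsymmetric.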
Parallel algorithms for MIMD parallel computers
This thesis mainly covers the design and analysis of asynchronous
parallel algorithms that can be run on MIMD (Multiple Instruction
Multiple Data) parallel computers, in particular the NEPTUNE system at
Loughborough University. Initially the fundamentals of parallel computer
architectures are introduced with different parallel architectures being
described and compared. The principles of parallel programming and the
design of parallel algorithms are also outlined. The main
characteristics of the 4-processor MIMD NEPTUNE system are then presented,
and performance indicators, i.e. the speed-up and the efficiency factors,
are defined for the measurement of parallelism in a given system.
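As a minimal illustration of the two performance indicators, the sketch below uses the usual textbook definitions, speed-up S_p = T_1 / T_p and efficiency E_p = S_p / p. The thesis may define them with slightly different conventions, and the function names and timings are hypothetical.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speed-up S_p = T_1 / T_p (assumed textbook definition)."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, processors: int) -> float:
    """Efficiency E_p = S_p / p, the speed-up achieved per processor."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return speedup(t_serial, t_parallel) / processors


# Hypothetical timings for a 4-processor run (not measurements from the thesis).
print(speedup(8.0, 2.5), efficiency(8.0, 2.5, 4))
```

An efficiency close to 1 means the processors are almost fully used; values well below 1 point to synchronization or communication overhead.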
Both numerical and non-numerical algorithms are covered in the
thesis. In the numerical solution of partial differential equations,
a new parallel 9-point block iterative method is developed. Here, the
organization of the blocks is done in such a way that each process
contains its own group of 9 points on the network, so the processes can
be run in parallel. Parallel implementations of both the 9-point and
4-point block iterative methods were programmed using natural and
red-black ordering with synchronous and asynchronous approaches. The
results obtained for these different implementations were compared and
analysed.
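The idea behind red-black ordering can be sketched with a plain point Gauss-Seidel sweep for the 5-point Laplace stencil. This is a simplified stand-in, not the 9-point or 4-point block methods of the thesis, and the grid size, boundary values and function names are assumptions made for the example.

```python
def red_black_sweep(u, f, h):
    """One red-black Gauss-Seidel sweep for -Laplace(u) = f on a square grid.

    u and f are square lists of lists; boundary values of u stay fixed.
    """
    n = len(u)
    for colour in (0, 1):  # 0 = "red" points, 1 = "black" points
        # A point of one colour only reads neighbours of the other colour,
        # so all points of the same colour could be updated in parallel.
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if (i + j) % 2 == colour:
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1]
                                      + h * h * f[i][j])
    return u


def solve_laplace(n=9, sweeps=200):
    """Laplace problem with one boundary side held at 1 and the others at 0."""
    h = 1.0 / (n - 1)
    u = [[0.0] * n for _ in range(n)]
    f = [[0.0] * n for _ in range(n)]
    for j in range(n):
        u[0][j] = 1.0  # hypothetical boundary condition on one side
    for _ in range(sweeps):
        red_black_sweep(u, f, h)
    return u
```

With natural (row-by-row) ordering each update depends on the point just before it, which serializes the sweep; the two-colour split removes that dependency within each colour.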
Next the parallel version of the A.G.E. (Alternating Group Explicit)
method is developed in which the explicit nature of the difference
equation is revealed and exploited when applied to derive the solution
of both linear and non-linear 2-point boundary value problems. Two
strategies have been used in the implementation of the parallel A.G.E.
method using the synchronous and asynchronous approaches. The results
from these implementations were compared. For comparison, the results
obtained from the parallel A.G.E. method were also set against the
corresponding results obtained from the parallel versions of the Jacobi,
Gauss-Seidel and S.O.R. methods. Finally, a computational complexity
analysis of the parallel A.G.E. algorithms is included.
In the area of non-numeric algorithms, the problems of sorting and
searching were studied. The sorting methods investigated
were the shell and the digit sort methods. With each method, different
parallel strategies and approaches were used and compared to find the
best results which can be obtained on the parallel machine.
In the searching methods, the sequential search algorithm in an
unordered table and the binary search algorithms were investigated and
implemented in parallel with a presentation of the results. Finally,
a complexity analysis of these methods is presented.
The thesis concludes with a chapter summarizing the main results
Towards parallelizable sampling-based Nonlinear Model Predictive Control
This paper proposes a new sampling-based nonlinear model predictive control
(MPC) algorithm, with a bound on complexity quadratic in the prediction horizon
N and linear in the number of samples. The idea of the proposed algorithm is to
use the sequence of predicted inputs from the previous time step as a warm
start, and to iteratively update this sequence by changing its elements one by
one, starting from the last predicted input and ending with the first predicted
input. This strategy, which resembles the dynamic programming principle, allows
for parallelization up to a certain level and yields a suboptimal nonlinear MPC
algorithm with guaranteed recursive feasibility, stability and improved cost
function at every iteration, which is suitable for real-time implementation.
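The backward, one-element-at-a-time update described above can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' algorithm: the feasibility and stability machinery is omitted, the uniform sampling is an assumption, and the scalar model, the costs and all function names are hypothetical.

```python
import math
import random


def rollout_cost(x0, inputs, step, stage_cost, terminal_cost):
    """Simulate the model over the horizon and sum the cost."""
    x, total = x0, 0.0
    for u in inputs:
        total += stage_cost(x, u)
        x = step(x, u)
    return total + terminal_cost(x)


def improve_sequence(x0, warm_start, step, stage_cost, terminal_cost,
                     n_samples, u_max, rng):
    """Update the predicted inputs one by one, from the last to the first."""
    best = list(warm_start)
    best_cost = rollout_cost(x0, best, step, stage_cost, terminal_cost)
    for k in reversed(range(len(best))):
        # The candidates for position k differ only in element k, so their
        # rollouts are independent and could run on parallel threads.
        for _ in range(n_samples):
            candidate = list(best)
            candidate[k] = rng.uniform(-u_max, u_max)
            cost = rollout_cost(x0, candidate, step, stage_cost, terminal_cost)
            if cost < best_cost:  # keep a sample only if it lowers the cost
                best, best_cost = candidate, cost
    return best, best_cost


# Hypothetical scalar model and costs, used only to exercise the sketch.
def step(x, u):
    return x + 0.1 * (math.sin(x) + u)


def stage_cost(x, u):
    return x * x + 0.1 * u * u


def terminal_cost(x):
    return 10.0 * x * x
```

In this sketch each of the N positions tries `n_samples` candidates and every candidate costs one rollout of length N, which gives work quadratic in N and linear in the number of samples. Because a sample is accepted only when it lowers the cost, the returned sequence is never worse than the warm start.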
The complexity of the algorithm per time step in the prediction horizon
depends only on the horizon, the number of samples and parallel threads, and it
is independent of the measured system state. Comparisons with the fmincon
nonlinear optimization solver on benchmark examples indicate that as the
simulation time progresses, the proposed algorithm converges rapidly to the
"optimal" solution, even when using a small number of samples.
Comment: 9 pages, 9 pictures, submitted to IFAC World Congress 201
Design of Introspective Circuits for Analysis of Cell-Level Dis-orientation in Self-Assembled Cellular Systems
This paper discusses a novel approach to managing complexity in a large self-assembled system, by utilizing the self-assembling components themselves to address the complexity. A particular challenge is discussed, namely the question of how to deal with elements that are assembled in different orientations from each other, and a solution based on the idea of introspective circuitry is described. A methodology for using a set of cells to determine a nearby cell’s orientation is given, leading to a slow (O(n)) means of orienting a 2D region of cells. A modified algorithm is then described to allow parallel analysis of/adaption to dis-oriented cells, thus allowing re-orientation of an entire 2D region of cells with better-than-linear time performance (O(sqrt(n))). The significance of this work is discussed not only in terms of managing arrays of dis-oriented cells but also, more importantly, as an example of the usefulness of local, distributed self-configuration to create and use introspective circuitry.
Parallel computation of echelon forms
We propose efficient parallel algorithms and implementations on shared memory architectures of LU factorization over a finite field. Compared to the corresponding numerical routines, we have identified three main difficulties specific to linear algebra over finite fields. First, the arithmetic complexity could be dominated by modular reductions. Therefore, it is mandatory to delay these reductions as much as possible while mixing fine-grain parallelizations of tiled iterative and recursive algorithms. Second, fast linear algebra variants, e.g., using the Strassen-Winograd algorithm, never suffer from instability and can thus be widely used in cascade with the classical algorithms. There, trade-offs are to be made between block sizes well suited to those fast variants and block sizes well suited to load and communication balancing. Third, many applications over finite fields require the rank profile of the matrix (quite often rank deficient) rather than the solution to a linear system. It is thus important to design parallel algorithms that preserve and compute this rank profile. Moreover, as the rank profile is only discovered during the algorithm, the block size then has to be dynamic. We propose and compare several block decompositions: tile iterative with left-looking, right-looking and Crout variants, slab recursive and tile recursive. Experiments demonstrate that the tile recursive variant performs better and matches the performance of reference numerical software when no rank deficiency occurs. Furthermore, even in the most heterogeneous case, namely when all pivot blocks are rank deficient, we show that it is possible to maintain a high efficiency.
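The first difficulty, delaying modular reductions, can be illustrated on a single dot product over Z/pZ. The sketch below accumulates unreduced products and reduces only when the next product might exceed a fixed word size. The function name, the 63-bit default and the overflow model are assumptions for illustration (Python integers do not overflow), not the paper's implementation.

```python
def dot_mod_delayed(a, b, p, word_bits=63):
    """Dot product of a and b modulo p with delayed reductions.

    Entries of a and b are assumed to lie in [0, p - 1]. Returns the result
    and the number of modular reductions actually performed.
    """
    limit = 1 << word_bits          # capacity of the hypothetical machine word
    max_term = (p - 1) * (p - 1)    # largest product of two reduced entries
    if max_term + (p - 1) >= limit:
        raise ValueError("p is too large for the chosen word size")
    acc = 0
    reductions = 0
    for x, y in zip(a, b):
        if acc + max_term >= limit:  # next product might overflow: reduce now
            acc %= p
            reductions += 1
        acc += x * y
    return acc % p, reductions + 1   # count the final reduction as well


p = 65521
a = [(7 * i + 3) % p for i in range(1000)]
b = [(11 * i + 5) % p for i in range(1000)]
value, count = dot_mod_delayed(a, b, p)
```

Reducing after every multiplication would cost one reduction per term; here a reduction happens only when the accumulator gets close to the word limit, so a 1000-term product with a 16-bit prime needs just the final one.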