3 research outputs found
Parallel computation of echelon forms
International audienceWe propose efficient parallel algorithms and implementations on shared memory architectures of LU factorization over a finite field. Compared to the corresponding numerical routines, we have identified three main difficulties specific to linear algebra over finite fields. First, the arithmetic complexity could be dominated by modular reductions. Therefore, it is mandatory to delay as much as possible these reductions while mixing fine-grain parallelizations of tiled iterative and recursive algorithms. Second, fast linear algebra variants, e.g., using Strassen-Winograd algorithm, never suffer from instability and can thus be widely used in cascade with the classical algorithms. There, trade-offs are to be made between size of blocks well suited to those fast variants or to load and communication balancing. Third, many applications over finite fields require the rank profile of the matrix (quite often rank deficient) rather than the solution to a linear system. It is thus important to design parallel algorithms that preserve and compute this rank profile. Moreover, as the rank profile is only discovered during the algorithm, block size has then to be dynamic. We propose and compare several block decomposition: tile iterative with left-looking, right-looking and Crout variants, slab and tile recursive. Experiments demonstrate that the tile recursive variant performs better and matches the performance of reference numerical software when no rank deficiency occur. Furthermore, even in the most heterogeneous case, namely when all pivot blocks are rank deficient, we show that it is possbile to maintain a high efficiency
Elements of Design for Containers and Solutions in the LinBox Library
We describe in this paper new design techniques used in the \cpp exact linear
algebra library \linbox, intended to make the library safer and easier to use,
while keeping it generic and efficient. First, we review the new simplified
structure for containers, based on our \emph{founding scope allocation} model.
We explain design choices and their impact on coding: unification of our matrix
classes, clearer model for matrices and submatrices, \etc Then we present a
variation of the \emph{strategy} design pattern that is comprised of a
controller--plugin system: the controller (solution) chooses among plug-ins
(algorithms) that always call back the controllers for subtasks. We give
examples using the solution \mul. Finally we present a benchmark architecture
that serves two purposes: Providing the user with easier ways to produce
graphs; Creating a framework for automatically tuning the library and
supporting regression testing.Comment: 8 pages, 4th International Congress on Mathematical Software, Seoul :
Korea, Republic Of (2014
Computing the Rank Profile Matrix
The row (resp. column) rank profile of a matrix describes the staircase shape
of its row (resp. column) echelon form. In an ISSAC'13 paper, we proposed a
recursive Gaussian elimination that can compute simultaneously the row and
column rank profiles of a matrix as well as those of all of its leading
sub-matrices, in the same time as state of the art Gaussian elimination
algorithms. Here we first study the conditions making a Gaus-sian elimination
algorithm reveal this information. Therefore, we propose the definition of a
new matrix invariant, the rank profile matrix, summarizing all information on
the row and column rank profiles of all the leading sub-matrices. We also
explore the conditions for a Gaussian elimination algorithm to compute all or
part of this invariant, through the corresponding PLUQ decomposition. As a
consequence, we show that the classical iterative CUP decomposition algorithm
can actually be adapted to compute the rank profile matrix. Used, in a Crout
variant, as a base-case to our ISSAC'13 implementation, it delivers a
significant improvement in efficiency. Second, the row (resp. column) echelon
form of a matrix are usually computed via different dedicated triangular
decompositions. We show here that, from some PLUQ decompositions, it is
possible to recover the row and column echelon forms of a matrix and of any of
its leading sub-matrices thanks to an elementary post-processing algorithm