17 research outputs found

    Numerical reproducibility in HPC: issues in interval arithmetic

    The problem of numerical reproducibility is that of getting the same result when a numerical computation is run several times, whether on the same machine or on different machines. The accuracy of the result is a separate issue. As far as interval arithmetic is concerned, the relevant property is the inclusion property, that is, the guarantee that the exact result belongs to the computed interval.
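
    The inclusion property can be illustrated in a few lines. A minimal sketch, assuming directed rounding is emulated by widening each endpoint one ulp outward with Python's math.nextafter rather than by setting hardware rounding modes:

```python
import math

def iadd(x, y):
    """Interval addition [a, b] + [c, d] with outward rounding.
    math.nextafter widens each endpoint by one ulp: a portable,
    slightly pessimistic stand-in for hardware directed rounding."""
    lo = math.nextafter(x[0] + y[0], -math.inf)
    hi = math.nextafter(x[1] + y[1], math.inf)
    return lo, hi

lo, hi = iadd((0.1, 0.1), (0.2, 0.2))
# 0.1 + 0.2 != 0.3 in binary64, yet the enclosure contains 0.3:
assert lo <= 0.3 <= hi
```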

    Parallel Implementation of Interval Matrix Multiplication

    Two main, and not necessarily compatible, objectives when implementing the product of two dense matrices with interval coefficients are accuracy and efficiency. In this work, we focus on an implementation for multicore architectures. One direction successfully explored to gain execution-time performance is the representation of intervals by their midpoints and radii rather than the classical representation by endpoints. Computing with the midpoint-radius representation enables the use of optimized floating-point BLAS, so performance benefits directly from that of the BLAS routines. Several variants of interval matrix multiplication have been proposed, corresponding to various tradeoffs between accuracy and efficiency, including some efficient ones proposed by Rump in 2012. However, in order to guarantee that the computed result encloses the exact one, these efficient algorithms rely on an assumption about the order of execution of floating-point operations which is not satisfied by most implementations of the BLAS. In this paper, an algorithm for the interval matrix product is proposed that satisfies this assumption. Furthermore, several optimizations are proposed, and the implementation on a multicore architecture compares reasonably well with a non-guaranteed implementation based on MKL, Intel's optimized BLAS: the overhead is usually below a factor of 2 and never exceeds 3. The implementation also exhibits good scalability.
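
    The guaranteed midpoint-radius product described above can be sketched as follows. This is a didactic naive triple loop, not the BLAS-based algorithm of the paper: it computes the classical enclosure (midpoint product, plus a radius covering the input radii and a deliberately loose bound on the rounding error of the midpoint dot products). The names midrad_matmul and ulp_up are illustrative.

```python
import math

def ulp_up(x: float) -> float:
    """Round one ulp toward +infinity (stand-in for upward rounding)."""
    return math.nextafter(x, math.inf)

def midrad_matmul(Am, Ar, Bm, Br):
    """Naive mid-rad interval matrix product with a guaranteed radius:
       Cm  = Am * Bm                       (floating point),
       Cr >= |Am|*Br + Ar*(|Bm| + Br) + error bound of the Cm dot products.
    Every radius accumulation is widened upward with nextafter, and the
    dot-product error bound is loose on purpose."""
    n, p, q = len(Am), len(Bm), len(Bm[0])
    u = 2.0 ** -52                          # >= 2x the unit roundoff of binary64
    Cm = [[0.0] * q for _ in range(n)]
    Cr = [[0.0] * q for _ in range(n)]
    for i in range(n):
        for j in range(q):
            m, s, r = 0.0, 0.0, 0.0
            for k in range(p):
                prod = Am[i][k] * Bm[k][j]
                m += prod
                s = ulp_up(s + abs(prod))   # magnitude accumulator for the bound
                term = abs(Am[i][k]) * Br[k][j] + Ar[i][k] * (abs(Bm[k][j]) + Br[k][j])
                r = ulp_up(r + ulp_up(term))
            err = ulp_up((p + 1) * u * s)   # rounding slack of the midpoint dot product
            Cm[i][j] = m
            Cr[i][j] = ulp_up(r + err)
    return Cm, Cr

Z = [[0.0, 0.0], [0.0, 0.0]]
Cm, Cr = midrad_matmul([[1.0, 2.0], [3.0, 4.0]], Z, [[5.0, 6.0], [7.0, 8.0]], Z)
assert Cm == [[19.0, 22.0], [43.0, 50.0]]   # this product is exact in binary64
assert all(Cr[i][j] >= 0.0 for i in range(2) for j in range(2))
```

    The point of the reformulation is that, at the matrix level, the four real-matrix products in Cm and Cr can each be delegated to an optimized BLAS routine, which the papers above exploit.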

    Tradeoffs between Accuracy and Efficiency for Optimized and Parallel Interval Matrix Multiplication

    Interval arithmetic is mathematically defined as set arithmetic. For implementation purposes, it is necessary to specify the representation of intervals and the formulas for the arithmetic operations. Two main representations are considered here: inf-sup (by endpoints) and mid-rad (by midpoint and radius). Formulas for the arithmetic operations using these representations are studied, along with formulas that trade off accuracy for efficiency. This tradeoff is particularly blatant in the example of interval matrix multiplication implemented using floating-point arithmetic: depending on the chosen formulas, both the efficiency and the accuracy can vary greatly in practice, and not necessarily as predicted by the theory. Indeed, theoretical predictions are often based on exact operations, as opposed to floating-point operations, and on operation counts, as opposed to measured execution times. These observations, and the recommendations that follow from them, are further complicated by considerations of memory usage, multithreaded computation, and so on, when these algorithms are implemented on parallel architectures such as multicores.
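
    The accuracy/efficiency tradeoff is visible already on a single scalar multiplication: the inf-sup formula is exact in real arithmetic but needs a min/max over four products, while the fast mid-rad formula advocated by Rump uses only multiply-adds (hence maps onto BLAS at the matrix level) at the price of up to a factor 1.5 of radius overestimation. Directed rounding is deliberately ignored in this sketch.

```python
def mul_infsup(x, y):
    """Inf-sup product (exact in real arithmetic):
    min/max of the four endpoint products."""
    p = [x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1]]
    return min(p), max(p)

def mul_midrad(x, y):
    """Fast mid-rad product: no min/max, but the radius can be
    overestimated by up to a factor 1.5."""
    (ma, ra), (mb, rb) = x, y
    return ma * mb, abs(ma) * rb + ra * (abs(mb) + rb)

# [-1, 3] * [2, 4] in both representations:
exact = mul_infsup((-1.0, 3.0), (2.0, 4.0))      # (-4.0, 12.0)
m, r = mul_midrad((1.0, 2.0), (3.0, 1.0))        # same intervals as mid-rad pairs
assert m - r <= exact[0] and exact[1] <= m + r   # the enclosure holds...
assert (m + r) - (m - r) >= exact[1] - exact[0]  # ...but it is wider
```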

    LEMA: Towards a Language for Reliable Arithmetic

    Generating certified and efficient numerical codes requires information ranging from the mathematical level down to the representation of numbers. Even though the mathematical semantics can be expressed using the content part of MathML, this language does not encompass the implementation on computers. Indeed, various arithmetics may be involved, like floating-point or fixed-point, in fixed or arbitrary precision, and current tools cannot handle all of these. We therefore propose in this paper LEMA (Langage pour les Expressions Mathématiques Annotées), a descriptive language based on MathML with additional expressiveness. LEMA will be used during the automatic generation of certified numerical codes. Such a generation process typically involves several steps, and LEMA acts as glue to represent and store the information at every stage. First, we specify in the language the characteristics of the arithmetic as described in the IEEE 754 floating-point standard: formats, exceptions, rounding modes. This can be generalized to other arithmetics. Then, we use annotations to attach a specific arithmetic context to an expression tree. Finally, considering the evaluation of the expression in this context allows us to deduce several properties of the result, like being exact or being an exception. Other useful properties include numerical ranges and error bounds.
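
    One of the properties mentioned above, exactness of a floating-point operation, can be decided at run time with Knuth's 2Sum error-free transformation. The snippet below illustrates only that one deduction; it is not LEMA syntax.

```python
def add_error(a: float, b: float):
    """Knuth's 2Sum: returns s = fl(a + b) and the exact error
    a + b - s. The addition is exact iff the error is zero
    (valid under round-to-nearest)."""
    s = a + b
    bp = s - a
    err = (a - (s - bp)) + (b - bp)
    return s, err

_, e1 = add_error(0.1, 0.2)
_, e2 = add_error(1.5, 2.25)
assert e1 != 0.0    # 0.1 + 0.2 is inexact in binary64
assert e2 == 0.0    # 1.5 + 2.25 = 3.75 is exact
```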

    Divers algorithmes de produits de matrices intervalles

    The product of matrices with interval coefficients is significantly slower than the product of matrices with numerical coefficients, notably because of the required changes of rounding mode. By reordering the operations of the naive algorithm and representing intervals by their midpoint and radius, it is possible to limit the number of rounding-mode changes and to reduce the computation to calls to level-3 BLAS routines. Several interval matrix multiplication algorithms of this kind exist in the literature, some of which improve execution time at the expense of the accuracy of the result. We present a selection of such algorithms, together with a set of experimental measurements of their accuracy. Based on these numerical experiments, we analyze the measured error, which is often far below the best theoretical bound.
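
    The reordering idea, grouping all downward-rounded work and then all upward-rounded work so that only two rounding-mode switches are needed regardless of the problem size, can be mimicked with Python's decimal module, which (unlike binary floats in Python) has a settable rounding mode. The restriction to nonnegative endpoints and the function name are illustrative simplifications.

```python
from decimal import Decimal, getcontext, ROUND_FLOOR, ROUND_CEILING

def interval_dot(lo_x, hi_x, lo_y, hi_y):
    """Dot product of nonnegative interval vectors with only two
    rounding-mode switches in total (decimal stands in for hardware
    rounding modes). Nonnegative endpoints are assumed, so that
    lower*lower and upper*upper give the product endpoints directly."""
    ctx = getcontext()
    ctx.prec = 8                     # coarse precision so rounding is visible
    ctx.rounding = ROUND_FLOOR       # switch #1: all lower bounds, rounded down
    lo = Decimal(0)
    for a, c in zip(lo_x, lo_y):
        lo = lo + a * c
    ctx.rounding = ROUND_CEILING     # switch #2: all upper bounds, rounded up
    hi = Decimal(0)
    for b, d in zip(hi_x, hi_y):
        hi = hi + b * d
    return lo, hi

x = [Decimal("1.2345678")]           # degenerate interval [x, x]
lo, hi = interval_dot(x, x, x, x)
assert lo < hi                       # directed rounding yields a strict enclosure
```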

    Interval matrix multiplication on parallel architectures

    Achieving efficiency when implementing interval arithmetic computations is a difficult task. The work presented here deals with the efficient implementation of interval matrix multiplication on parallel architectures. A first issue is the choice of formulas. The main principle we adopted consists in resorting, as often as possible, to optimized routines such as the level-3 BLAS, as implemented in Intel's MKL for instance. To do so, the formulas chosen to implement the interval arithmetic operations are based on the representation of intervals by their midpoint and radius. This approach was advocated by S. Rump in 1999 and is used in particular in his implementation IntLab. A panel of formulas for operations using the midpoint-radius representation exists: exact formulas can be found in A. Neumaier's book "Interval Methods for Systems of Equations" (1990); in 1999, S. Rump gave approximate formulas with fewer operations; and in his Ph.D. thesis in 2011, H.D. Nguyen gave a choice of formulas reaching various tradeoffs between operation count and accuracy. These formulas for the addition and multiplication of two intervals are used in the classical formulas for matrix multiplication and can be expressed as operations (addition and multiplication) on matrices of real numbers (either midpoints or radii); S. Rump recapitulates some such matrix expressions in a recent paper (2012). In this presentation, the merits of each approach are discussed in terms of the number of elementary operations, the use of BLAS3 routines for the matrix multiplication, and accuracy. The comparison of relative accuracies is first based on the assumption that arithmetic operations are implemented in exact arithmetic; we also compare these accuracies assuming that arithmetic operations are implemented in floating-point arithmetic. A second issue concerns the adaptation to the architecture.
    Indeed, the architectures targeted in this study are parallel architectures such as multicores or GPUs. When algorithms are implemented on such architectures, some measures, such as the arithmetic operation count, are no longer relevant: the measured execution times do not relate directly to the operation count. This is explained by considerations of memory usage, multithreaded computation, and so on. We will show some experiments that take these architectural parameters into account and reach good performance. We will also give some tradeoffs between memory consumption and memory traffic: it can for instance be beneficial to copy (parts of) the involved matrices into the right caches to avoid cache misses and heavy traffic.
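
    The copy-tiles-to-stay-in-cache idea can be sketched structurally. In C this blocking keeps tiles resident in cache; in Python the snippet only illustrates the loop and copy structure, and the name tiled_matmul is illustrative.

```python
def tiled_matmul(A, B, T=2):
    """Blocked matrix product over T x T tiles: each pair of source
    tiles is copied into small contiguous lists before the inner
    multiply, mirroring the copy-to-avoid-cache-misses optimization."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, T):
        for j0 in range(0, n, T):
            for k0 in range(0, n, T):
                # copy the two source tiles (the "right caches" step)
                At = [row[k0:k0 + T] for row in A[i0:i0 + T]]
                Bt = [row[j0:j0 + T] for row in B[k0:k0 + T]]
                for i in range(len(At)):
                    for k in range(len(Bt)):
                        a = At[i][k]
                        for j in range(len(Bt[0])):
                            C[i0 + i][j0 + j] += a * Bt[k][j]
    return C

A = [[1.0, 2.0], [3.0, 4.0]]
assert tiled_matmul(A, A) == [[7.0, 10.0], [15.0, 22.0]]
```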

    Numerical reproducibility in HPC: the interval point of view

    What is called numerical reproducibility is the problem of getting the same result when a scientific computation is run several times, either on the same machine (this is called repeatability) or on different machines, with different numbers of processing units, types, execution environments, computational loads, etc. This problem is especially stringent for HPC results. For interval computations, numerical reproducibility is of course an issue for testing and debugging purposes. However, as long as the computed result encloses the exact and unknown result, the inclusion property, which is the main property of interval arithmetic, is satisfied, and getting bit-for-bit identical results may not be crucial. Implementation issues may nevertheless invalidate the inclusion property, in particular if the rounding modes set by the user are modified during the execution. We will present several ways to circumvent these issues, using the example of the product of matrices with interval coefficients.
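
    One rounding-mode-agnostic circumvention can be sketched: widen every partial result by one ulp with math.nextafter, which does not consult the FPU rounding mode. The enclosure then stays valid even if a library call silently changes the mode, and even though different evaluation orders yield different endpoints, i.e. no bit-for-bit reproducibility. A sketch under the assumption of at most one ulp of error per operation:

```python
import math
from fractions import Fraction

def enclose_sum(xs):
    """Enclosure of sum(xs) valid under any rounding mode: after each
    addition, widen the running bounds by one ulp on both sides."""
    lo = hi = 0.0
    for x in xs:
        lo = math.nextafter(lo + x, -math.inf)
        hi = math.nextafter(hi + x, math.inf)
    return lo, hi

xs = [0.1, 1e16, -1e16, 0.1]
l1 = enclose_sum(xs)
l2 = enclose_sum(sorted(xs))        # different order => different rounding
exact = sum(map(Fraction, xs))      # exact rational reference value
assert Fraction(l1[0]) <= exact <= Fraction(l1[1])
assert Fraction(l2[0]) <= exact <= Fraction(l2[1])
assert l1 != l2                     # not reproducible bit for bit, yet both enclose
```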
