12,119 research outputs found
A Mixed Real and Floating-Point Solver
Reasoning about mixed real and floating-point constraints is essential for developing accurate analysis tools for floating-point pro- grams. This paper presents FPRoCK, a prototype tool for solving mixed real and floating-point formulas. FPRoCK transforms a mixed formula into an equisatisfiable one over the reals. This formula is then solved using an off-the-shelf SMT solver. FPRoCK is also integrated with the PRECiSA static analyzer, which computes a sound estimation of the round-off error of a floating-point program. It is used to detect infeasible computational paths, thereby improving the accuracy of PRECiSA
Stochastic rounding and reduced-precision fixed-point arithmetic for solving neural ordinary differential equations
Although double-precision floating-point arithmetic currently dominates
high-performance computing, there is increasing interest in smaller and simpler
arithmetic types. The main reasons are potential improvements in energy
efficiency and memory footprint and bandwidth. However, simply switching to
lower-precision types typically results in increased numerical errors. We
investigate approaches to improving the accuracy of reduced-precision
fixed-point arithmetic types, using examples in an important domain for
numerical computation in neuroscience: the solution of Ordinary Differential
Equations (ODEs). The Izhikevich neuron model is used to demonstrate that
rounding has an important role in producing accurate spike timings from
explicit ODE solution algorithms. In particular, fixed-point arithmetic with
stochastic rounding consistently results in smaller errors compared to single
precision floating-point and fixed-point arithmetic with round-to-nearest
across a range of neuron behaviours and ODE solvers. A computationally much
cheaper alternative is also investigated, inspired by the concept of dither
that is a widely understood mechanism for providing resolution below the least
significant bit (LSB) in digital signal processing. These results will have
implications for the solution of ODEs in other subject areas, and should also
be directly relevant to the huge range of practical problems that are
represented by Partial Differential Equations (PDEs).Comment: Submitted to Philosophical Transactions of the Royal Society
Improved Accuracy and Parallelism for MRRR-based Eigensolvers -- A Mixed Precision Approach
The real symmetric tridiagonal eigenproblem is of outstanding importance in
numerical computations; it arises frequently as part of eigensolvers for
standard and generalized dense Hermitian eigenproblems that are based on a
reduction to tridiagonal form. For its solution, the algorithm of Multiple
Relatively Robust Representations (MRRR) is among the fastest methods. Although
fast, the solvers based on MRRR do not deliver the same accuracy as competing
methods like Divide & Conquer or the QR algorithm. In this paper, we
demonstrate that the use of mixed precisions leads to improved accuracy of
MRRR-based eigensolvers with limited or no performance penalty. As a result, we
obtain eigensolvers that are not only equally or more accurate than the best
available methods, but also -in most circumstances- faster and more scalable
than the competition
Solving Lattice QCD systems of equations using mixed precision solvers on GPUs
Modern graphics hardware is designed for highly parallel numerical tasks and
promises significant cost and performance benefits for many scientific
applications. One such application is lattice quantum chromodyamics (lattice
QCD), where the main computational challenge is to efficiently solve the
discretized Dirac equation in the presence of an SU(3) gauge field. Using
NVIDIA's CUDA platform we have implemented a Wilson-Dirac sparse matrix-vector
product that performs at up to 40 Gflops, 135 Gflops and 212 Gflops for double,
single and half precision respectively on NVIDIA's GeForce GTX 280 GPU. We have
developed a new mixed precision approach for Krylov solvers using reliable
updates which allows for full double precision accuracy while using only single
or half precision arithmetic for the bulk of the computation. The resulting
BiCGstab and CG solvers run in excess of 100 Gflops and, in terms of iterations
until convergence, perform better than the usual defect-correction approach for
mixed precision.Comment: 30 pages, 7 figure
Multi-mass solvers for lattice QCD on GPUs
Graphical Processing Units (GPUs) are more and more frequently used for
lattice QCD calculations. Lattice studies often require computing the quark
propagators for several masses. These systems can be solved using multi-shift
inverters but these algorithms are memory intensive which limits the size of
the problem that can be solved using GPUs. In this paper, we show how to
efficiently use a memory-lean single-mass inverter to solve multi-mass
problems. We focus on the BiCGstab algorithm for Wilson fermions and show that
the single-mass inverter not only requires less memory but also outperforms the
multi-shift variant by a factor of two.Comment: 27 pages, 6 figures, 3 Table
On Sound Relative Error Bounds for Floating-Point Arithmetic
State-of-the-art static analysis tools for verifying finite-precision code
compute worst-case absolute error bounds on numerical errors. These are,
however, often not a good estimate of accuracy as they do not take into account
the magnitude of the computed values. Relative errors, which compute errors
relative to the value's magnitude, are thus preferable. While today's tools do
report relative error bounds, these are merely computed via absolute errors and
thus not necessarily tight or more informative. Furthermore, whenever the
computed value is close to zero on part of the domain, the tools do not report
any relative error estimate at all. Surprisingly, the quality of relative error
bounds computed by today's tools has not been systematically studied or
reported to date. In this paper, we investigate how state-of-the-art static
techniques for computing sound absolute error bounds can be used, extended and
combined for the computation of relative errors. Our experiments on a standard
benchmark set show that computing relative errors directly, as opposed to via
absolute errors, is often beneficial and can provide error estimates up to six
orders of magnitude tighter, i.e. more accurate. We also show that interval
subdivision, another commonly used technique to reduce over-approximations, has
less benefit when computing relative errors directly, but it can help to
alleviate the effects of the inherent issue of relative error estimates close
to zero
A Survey of Satisfiability Modulo Theory
Satisfiability modulo theory (SMT) consists in testing the satisfiability of
first-order formulas over linear integer or real arithmetic, or other theories.
In this survey, we explain the combination of propositional satisfiability and
decision procedures for conjunctions known as DPLL(T), and the alternative
"natural domain" approaches. We also cover quantifiers, Craig interpolants,
polynomial arithmetic, and how SMT solvers are used in automated software
analysis.Comment: Computer Algebra in Scientific Computing, Sep 2016, Bucharest,
Romania. 201
Parallel Algorithm for Solving Kepler's Equation on Graphics Processing Units: Application to Analysis of Doppler Exoplanet Searches
[Abridged] We present the results of a highly parallel Kepler equation solver
using the Graphics Processing Unit (GPU) on a commercial nVidia GeForce 280GTX
and the "Compute Unified Device Architecture" programming environment. We apply
this to evaluate a goodness-of-fit statistic (e.g., chi^2) for Doppler
observations of stars potentially harboring multiple planetary companions
(assuming negligible planet-planet interactions). We tested multiple
implementations using single precision, double precision, pairs of single
precision, and mixed precision arithmetic. We find that the vast majority of
computations can be performed using single precision arithmetic, with selective
use of compensated summation for increased precision. However, standard single
precision is not adequate for calculating the mean anomaly from the time of
observation and orbital period when evaluating the goodness-of-fit for real
planetary systems and observational data sets. Using all double precision, our
GPU code outperforms a similar code using a modern CPU by a factor of over 60.
Using mixed-precision, our GPU code provides a speed-up factor of over 600,
when evaluating N_sys > 1024 models planetary systems each containing N_pl = 4
planets and assuming N_obs = 256 observations of each system. We conclude that
modern GPUs also offer a powerful tool for repeatedly evaluating Kepler's
equation and a goodness-of-fit statistic for orbital models when presented with
a large parameter space.Comment: 19 pages, to appear in New Astronom
- …