Resilience in Numerical Methods: A Position on Fault Models and Methodologies
Future extreme-scale computer systems may expose silent data corruption (SDC)
to applications, in order to save energy or increase performance. However,
resilience research struggles to come up with useful abstract programming
models for reasoning about SDC. Existing work randomly flips bits in running
applications, but this only shows average-case behavior for a low-level,
artificial hardware model. Algorithm developers need to understand worst-case
behavior with the higher-level data types they actually use, in order to make
their algorithms more resilient. Also, we know so little about how SDC may
manifest in future hardware that it seems premature to draw conclusions about
the average case. We argue instead that numerical algorithms can benefit from a
numerical unreliability fault model, where faults manifest as unbounded
perturbations to floating-point data. Algorithms can use inexpensive "sanity"
checks that bound or exclude error in the results of computations. Given a
selective reliability programming model that requires reliability only when and
where needed, such checks can make algorithms reliable despite unbounded
faults. Sanity checks, and in general a healthy skepticism about the
correctness of subroutines, are wise even if hardware is perfectly reliable.
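The "sanity check" idea can be made concrete in a few lines. The sketch below is an illustration under assumptions of our own, not the authors' framework: an unreliable solve may return a wildly perturbed result (standing in for an SDC-induced, unbounded perturbation to floating-point data), while a cheap residual check, assumed to run reliably, detects the corruption and triggers a retry.

```python
import numpy as np

def unreliable_solve(A, b, fault=False):
    """Solve A x = b; optionally inject an unbounded perturbation
    standing in for silent data corruption (illustrative fault model)."""
    x = np.linalg.solve(A, b)
    if fault:
        x[0] += 1e30  # fault: arbitrary, unbounded corruption of one entry
    return x

def checked_solve(A, b, tol=1e-8, max_retries=3):
    """Selective-reliability sketch: the solve may run unreliably,
    but the inexpensive residual 'sanity' check is assumed reliable."""
    for attempt in range(max_retries):
        x = unreliable_solve(A, b, fault=(attempt == 0))  # fault on 1st try
        residual = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
        if residual <= tol:          # cheap check bounds the error
            return x
    raise RuntimeError("no trustworthy solution within retry budget")

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8)) + 8 * np.eye(8)   # well-conditioned test matrix
b = rng.standard_normal(8)
x = checked_solve(A, b)
print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))  # small despite the fault
```

Note the check costs one matrix-vector product, far less than the solve it guards, which is what makes this style of skepticism affordable.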
Solving and Certifying the Solution of a Linear System
Using floating-point arithmetic to solve a linear system yields a computed result that is only an approximation of the exact solution, because of roundoff errors. In this paper, we present an approach to certify the computed solution. Here, "certify" means computing a guaranteed enclosure of the error. Our method is an iterative refinement method, so it also improves the computed result. It is inspired by the verifylss function of the INTLAB library: a first step uses floating-point arithmetic to solve the linear system, followed by interval computations to obtain and refine an enclosure of the error. The specificity of our method is to relax the tightness requirement on the error enclosure in order to gain performance; indeed, only the order of magnitude of the error is needed. Experiments show gains in both accuracy and performance for various condition numbers of the matrix of the linear system.
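The enclosure recursion behind verifylss-style methods can be sketched as follows. This is an illustration only, not the paper's algorithm: intervals are held in midpoint-radius form, rounding inside the interval products is ignored and crudely padded at the end, and the inflation and padding constants are our own assumptions. A rigorous implementation uses directed rounding throughout.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)     # well conditioned
b = rng.standard_normal(n)

# Step 1: approximate solve in low (single) precision.
x_hat = np.linalg.solve(A.astype(np.float32),
                        b.astype(np.float32)).astype(np.float64)

# Step 2: interval refinement of the error e = x* - x_hat via the
# fixed-point form  e = z + C e,  with  z = R r  and  C = I - R A.
R = np.linalg.inv(A)                 # approximate inverse
r = b - A @ x_hat                    # residual (its rounding is ignored here)
z = R @ r
C = np.eye(n) - R @ A

mid, rad = z.copy(), np.zeros(n)
verified = False
for _ in range(15):
    infl = 1.1 * rad + 1e-13         # epsilon-inflation of the candidate
    new_mid = z + C @ mid            # interval image z + C * [mid +/- infl]
    new_rad = np.abs(C) @ infl
    if np.all(np.abs(new_mid - mid) + new_rad <= infl):
        verified = True              # contraction => enclosure certified
        mid, rad = new_mid, new_rad
        break
    mid, rad = new_mid, new_rad

rad = rad + 1e-12 * (1 + np.abs(mid))   # crude pad for the ignored roundoff
# If verified, the exact solution lies (up to the caveats above) in
# [x_hat + mid - rad, x_hat + mid + rad].
print(verified, rad.max())
```

The enclosure need not be tight to be useful, which is exactly the relaxation the abstract exploits: an order-of-magnitude radius already certifies how many digits of x_hat can be trusted.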
Mixed precision GMRES-based iterative refinement with recycling
With the emergence of mixed precision hardware, mixed precision GMRES-based iterative refinement schemes for solving linear systems have recently been developed. However, in certain settings, GMRES may require too many iterations per refinement step, making it potentially more expensive than the alternative of recomputing the LU factors in a higher precision. In this work, we incorporate Krylov subspace recycling, a well-known technique for reusing information across sequential invocations of a Krylov subspace method, into a mixed precision GMRES-based iterative refinement solver. The insight is that each refinement step calls preconditioned GMRES on a linear system with the same coefficient matrix. In this way, the GMRES solves in subsequent refinement steps can be accelerated by recycling information obtained from previous steps. We perform numerical experiments on various random dense problems, Toeplitz problems, and problems from real applications, which confirm the benefits of the recycling approach.
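The refinement loop that the recycled GMRES plugs into can be sketched as below. This is only the standard mixed precision iterative refinement skeleton, with a plain low-precision solve standing in for the preconditioned (and recycled) GMRES inner solver; the recycling machinery itself is omitted.

```python
import numpy as np

def mixed_precision_refine(A, b, steps=5):
    """Iterative refinement: residual in double precision, correction in
    single precision (a stand-in for the paper's GMRES inner solver)."""
    A32 = A.astype(np.float32)           # low-precision copy of A
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    history = []
    for _ in range(steps):
        r = b - A @ x                    # residual at high precision
        # Inner correction solve at low precision.  The paper instead
        # runs preconditioned GMRES here and recycles its Krylov
        # subspace, since the coefficient matrix is the same each step.
        d = np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
        x = x + d
        history.append(np.linalg.norm(b - A @ x) / np.linalg.norm(b))
    return x, history

rng = np.random.default_rng(2)
n = 50
A = rng.standard_normal((n, n)) + n * np.eye(n)      # well conditioned
b = rng.standard_normal(n)
x, history = mixed_precision_refine(A, b)
print(history[0], history[-1])   # residual driven toward double precision
```

Because every inner solve sees the same matrix, any work invested in building a Krylov subspace in one step remains relevant in the next, which is what makes recycling pay off here.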
Superfast Refinement of Low Rank Approximation of a Matrix
Low rank approximation (LRA) of a matrix is a hot subject of modern computations. In applications to Big Data mining and analysis, the input matrices are usually so immense that one must apply superfast algorithms, which access only a tiny fraction of the input entries and involve far fewer memory cells and flops than an input matrix has entries. Recently we devised and analyzed some superfast LRA algorithms; in this paper we extend a classical algorithm for iterative refinement of the solution of linear systems of equations to superfast refinement of a crude but reasonably close LRA. We support our superfast refinement algorithm with some superfast heuristic recipes for a posteriori error estimation of LRA and with superfast back and forth transitions between any LRA of a matrix and its SVD. Our algorithm of iterative refinement of LRA is the first attempt of this kind and should motivate further effort in that direction, but already our initial tests are in good agreement with our formal study.
Comment: 12.5 pages, 1 table, and 1 figure
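The idea of refining a crude LRA can be illustrated generically. The sketch below is not the paper's algorithm: it refines a one-pass random-sketch LRA by randomized subspace (power) iteration, which touches the input only through products with a handful of vectors, in the same superfast spirit. The matrix sizes, spectrum, and sweep count are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 200, 100, 5

# Synthetic input with a rank-r dominant part and a flat tail.
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.concatenate([np.ones(r), np.full(n - r, 0.1)])   # singular values
A = U @ np.diag(s) @ V.T

def lra_error(A, Q):
    """Relative Frobenius error of the rank-r approximation Q (Q^T A)."""
    return np.linalg.norm(A - Q @ (Q.T @ A)) / np.linalg.norm(A)

# Crude LRA: orthonormal basis of a single random sketch of A.
Q, _ = np.linalg.qr(A @ rng.standard_normal((n, r)))
crude = lra_error(A, Q)

# Refinement sweeps: each sweep touches A only through products with
# r vectors, steering Q toward the dominant singular subspace.
for _ in range(3):
    Q, _ = np.linalg.qr(A @ (A.T @ Q))
refined = lra_error(A, Q)
print(crude, refined)
```

With a crude-but-close starting basis, a few such sweeps drive the error toward the best achievable rank-r error, mirroring the role that classical iterative refinement plays for linear systems.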
Final Report for UC Berkeley Terascale Optimal PDE Solvers TOPS DOE Award Number DE-FC02-01ER25478 9/15/2001 – 9/14/2006
In many areas of science, physical experimentation may be too dangerous, too expensive, or even impossible. Instead, large-scale simulations, validated by comparison with related experiments in well-understood laboratory contexts, are used by scientists to gain insight and confirmation of existing theories in such areas, without the benefit of full experimental verification. The goal of the TOPS ISIC was to develop and implement algorithms and support scientific investigations performed by DOE-sponsored researchers. A major component of this effort was to provide software for large-scale parallel computers capable of efficiently solving the enormous systems of equations arising from the nonlinear PDEs underlying these simulations. Several TOPS-supported packages were designed in part (ScaLAPACK) or in whole (SuperLU) at Berkeley, and are widely used beyond SciDAC and DOE. Beyond continuing to develop these codes, our main effort focused on automatic performance tuning of the sparse matrix kernels (e.g., sparse matrix-vector multiply, or SpMV) at the core of many TOPS iterative solvers. Based on the observation that the fastest implementation of SpMV (and other kernels) can depend dramatically on both the computer and the matrix (the latter of which is not known until run time), we developed and released a system called OSKI (Optimized Sparse Kernel Interface) that automatically produces optimized versions of SpMV (and other kernels), hiding complicated implementation details from the user. OSKI led to a 2x speedup in SpMV in a DOE accelerator design code and a 2x speedup in a commercial lithography simulation, and has been downloaded over 500 times. In addition to a stand-alone version, OSKI was also integrated into the TOPS-supported PETSc system.
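For readers unfamiliar with the kernel being tuned, here is the textbook compressed sparse row (CSR) SpMV. This is an illustration of the kernel, not OSKI's API or its tuned code; OSKI's contribution is selecting, at run time, an optimized variant (e.g., register-blocked) for the given machine and sparsity pattern.

```python
import numpy as np

def csr_spmv(indptr, indices, data, x):
    """y = A @ x for A stored in compressed sparse row (CSR) form.
    Performance depends heavily on the nonzero pattern, which is why
    the best implementation cannot be chosen until run time."""
    n_rows = len(indptr) - 1
    y = np.zeros(n_rows)
    for i in range(n_rows):                  # one sparse dot product per row
        start, end = indptr[i], indptr[i + 1]
        y[i] = data[start:end] @ x[indices[start:end]]
    return y

# Tiny example: the 3x3 matrix [[2, 0, 1], [0, 3, 0], [4, 0, 5]] in CSR.
indptr  = np.array([0, 2, 3, 5])     # row i owns data[indptr[i]:indptr[i+1]]
indices = np.array([0, 2, 1, 0, 2])  # column index of each stored nonzero
data    = np.array([2.0, 1.0, 3.0, 4.0, 5.0])
x = np.array([1.0, 1.0, 1.0])
print(csr_spmv(indptr, indices, data, x))   # [3. 3. 9.]
```

The irregular, indirect access to x in the inner loop is exactly what autotuners reorganize (blocking, reordering) to exploit registers and caches.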
Factors Affecting Customer Satisfaction in Selecting Transport Network Vehicle Service (TNVS) in the Philippines
Transport Network Vehicle Services (TNVS) such as Grab and Uber have revolutionized public transportation in the Philippines, particularly in densely populated areas like Metro Manila, Cebu, and Davao. These platforms offer a convenient alternative to traditional public transport, allowing users to travel easily. The researchers employed Structural Equation Modeling (SEM) to uncover the factors influencing customer satisfaction when choosing TNVS, analyzing variables such as service quality, physical aspects, variability, responsiveness, and empathy, each measured with relevant indicators. This research provides valuable insights for TNVS providers and policymakers seeking to enhance their services by addressing areas needing improvement. Ultimately, these findings contribute to the advancement of the TNVS industry, benefiting both providers and consumers.
AIR: Adaptive Dynamic Precision Iterative Refinement
In high performance computing, applications often require very accurate solutions while minimizing runtimes and power consumption. Improving the ratio of the number of logic gates implementing floating-point arithmetic operations to the total number of logic gates enables greater efficiency, potentially with higher performance and lower power consumption. Software executing on the fixed hardware of von Neumann architectures faces limitations in improving this ratio, since processors require extensive supporting logic to fetch and decode instructions while employing arithmetic units with statically defined precision. This dissertation explores novel approaches to improving computing architectures for linear system applications, not only by designing application-specific hardware but also by optimizing precision through adaptive dynamic precision iterative refinement (AIR). The dissertation shows that AIR is numerically stable and well behaved. Theoretically, AIR can produce up to a 3x speedup over mixed precision iterative refinement on FPGAs. Implementing an AIR prototype for the refinement procedure on a Xilinx XC6VSX475T FPGA results in estimated improvements of roughly 0.5x, 8x, and 2x in time-, clock-, and energy-based performance per iteration, compared to mixed precision iterative refinement on the Nvidia Tesla C2075 GPU, when a user requires a prescribed accuracy between single and double precision. AIR on FPGAs can efficiently produce beyond-double-precision accuracy, whereas CPUs and GPUs need software support, which causes substantial overhead.
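The core idea of adapting the arithmetic precision as refinement progresses can be emulated in software. The sketch below is a crude stand-in for the dissertation's FPGA design, under assumptions of our own: a hypothetical chop() helper rounds the correction to a chosen number of mantissa bits, mimicking an arithmetic unit of adjustable precision, and the precision schedule (start at 12 bits, double each step) is illustrative only.

```python
import numpy as np

def chop(x, bits):
    """Round x to `bits` mantissa bits -- a crude software emulation of
    an adjustable-precision arithmetic unit (not the FPGA hardware)."""
    m, e = np.frexp(x)
    return np.ldexp(np.round(m * 2.0 ** bits) / 2.0 ** bits, e)

def adaptive_refine(A, b, steps=6):
    """Iterative refinement whose correction is computed at a precision
    that grows as the error shrinks (the core idea behind AIR)."""
    x = np.zeros_like(b)
    bits = 12                                  # start cheap: low precision
    for _ in range(steps):
        r = b - A @ x                          # residual at full precision
        d = chop(np.linalg.solve(A, r), bits)  # 'low precision' correction
        x = x + d
        bits = min(52, bits * 2)               # adapt: raise the precision
    return x

rng = np.random.default_rng(4)
n = 20
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well conditioned
b = rng.standard_normal(n)
x = adaptive_refine(A, b)
print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```

Each step gains roughly `bits` bits of accuracy, so early steps can run on cheap low-precision units while only the final steps need wide arithmetic, which is the efficiency argument made above.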