    Exploiting Data Representation for Fault Tolerance

    We explore the link between data representation and soft errors in dot products. We present an analytic model for the absolute error introduced when a soft error corrupts a bit in an IEEE-754 floating-point number. We show how this finding relates to the fundamental linear-algebra concepts of normalization and matrix equilibration. We present a case study illustrating that the probability of experiencing a large error in a dot product is minimized when both vectors are normalized. Furthermore, we show that when the data are normalized, the absolute error is either less than one or very large, which allows large errors to be detected. We demonstrate how this finding can be used by instrumenting the GMRES iterative solver: we count all possible errors that can be introduced through faults in the arithmetic of the computationally intensive orthogonalization phase, and show that when scaling is used the absolute error can be bounded above by one.
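
    As a rough illustration of the magnitude argument above (a minimal Python sketch of ours, not code from the paper), the snippet below flips each bit of a double whose magnitude is at most one and prints the resulting absolute error; mantissa flips perturb the value by well under one, while flips that raise the exponent produce errors many orders of magnitude larger, which is what makes the large errors detectable.

        import struct

        def flip_bit(x: float, bit: int) -> float:
            """Return x with one bit of its IEEE-754 double representation flipped
            (bit 0 = least significant mantissa bit, bit 63 = sign bit)."""
            (u,) = struct.unpack("<Q", struct.pack("<d", x))
            (y,) = struct.unpack("<d", struct.pack("<Q", u ^ (1 << bit)))
            return y

        x = 0.75  # an entry of a normalized vector, so |x| <= 1
        for bit in range(64):
            print(f"bit {bit:2d}: absolute error = {abs(flip_bit(x, bit) - x):.3e}")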

    Evaluating the Impact of SDC on the GMRES Iterative Solver

    Increasing parallelism and transistor density, along with increasingly tight energy and peak-power constraints, may force occasionally incorrect computation or storage to be exposed to application codes. Silent data corruption (SDC) will likely be infrequent, yet a single SDC suffices to stop numerical algorithms such as iterative linear solvers from making progress towards the correct answer. We therefore focus on the resilience of the iterative linear solver GMRES to a single transient SDC. We derive inexpensive checks that detect the effects of an SDC in GMRES and that work for a more general SDC model than an assumed bit flip. Our experiments show that when GMRES is used as the inner solver of an inner-outer iteration, it can "run through" SDC of almost any magnitude in the computationally intensive orthogonalization phase; that is, it gets the right answer using faulty data without any roll-back. The SDCs it cannot run through are caught by our detection scheme.
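
    The abstract does not spell the checks out, but one inexpensive invariant in the spirit it describes is the Cauchy-Schwarz bound on Gram-Schmidt projection coefficients: if the Krylov basis is orthonormal, every coefficient q_j . w is bounded in magnitude by ||w||, so a coefficient far beyond that bound signals a large corruption. The numpy sketch below is our own illustration of such a check, not the authors' detector.

        import numpy as np

        def checked_orthogonalize(Q, w, slack=1.1):
            """Classical Gram-Schmidt step with a cheap sanity check on its coefficients.

            Q : (n, k) array whose columns are (approximately) orthonormal basis vectors
            w : new vector to orthogonalize against the columns of Q
            """
            w_norm = np.linalg.norm(w)
            h = Q.T @ w                       # projection coefficients
            # Cauchy-Schwarz: |q_j . w| <= ||q_j|| * ||w|| = ||w||; a violation
            # (beyond a small slack for rounding) flags possible silent corruption.
            if np.any(np.abs(h) > slack * w_norm):
                raise RuntimeError("orthogonalization coefficients exceed their bound")
            return w - Q @ h, h

    The check adds only k comparisons on top of the O(nk) work of the orthogonalization step itself.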

    Resilience in Numerical Methods: A Position on Fault Models and Methodologies

    Future extreme-scale computer systems may expose silent data corruption (SDC) to applications, in order to save energy or increase performance. However, resilience research struggles to come up with useful abstract programming models for reasoning about SDC. Existing work randomly flips bits in running applications, but this only shows average-case behavior for a low-level, artificial hardware model. Algorithm developers need to understand worst-case behavior with the higher-level data types they actually use, in order to make their algorithms more resilient. Also, we know so little about how SDC may manifest in future hardware that it seems premature to draw conclusions about the average case. We argue instead that numerical algorithms can benefit from a numerical unreliability fault model, where faults manifest as unbounded perturbations to floating-point data. Algorithms can use inexpensive "sanity" checks that bound or exclude error in the results of computations. Given a selective reliability programming model that requires reliability only when and where needed, such checks can make algorithms reliable despite unbounded faults. Sanity checks, and in general a healthy skepticism about the correctness of subroutines, are wise even if hardware is perfectly reliable. Comment: Position Paper
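
    To make the proposed fault model concrete, here is a minimal Python sketch (our own illustration under simplifying assumptions, not code from the paper) that injects an unbounded perturbation into floating-point data and guards a dot product with an a-priori sanity bound; in the spirit of selective reliability, the check itself is assumed to execute reliably.

        import numpy as np

        rng = np.random.default_rng(0)

        def inject_unbounded_fault(v):
            """Numerical-unreliability fault model: perturb one entry by an arbitrary,
            unbounded amount rather than flipping a specific hardware bit."""
            out = v.copy()
            i = rng.integers(v.size)
            out[i] += rng.choice([-1.0, 1.0]) * 10.0 ** rng.uniform(-8, 12)
            return out

        def checked_dot(x, y, bound=1.0, slack=1.01):
            """Dot product guarded by an a-priori bound: if x and y are known to be
            normalized (||x||, ||y|| <= 1), then |x . y| <= 1 must hold."""
            d = float(np.dot(x, y))
            if abs(d) > slack * bound:
                raise RuntimeError("dot product violates its a-priori bound")
            return d

        x = np.ones(100) / 10.0                          # ||x|| = 1
        print(checked_dot(x, x))                         # passes: result is 1.0
        print(checked_dot(inject_unbounded_fault(x), x)) # raises when the perturbation is large

    Small perturbations slip past the check, but the error they introduce into the result is then itself small; this is what it means for a check to bound, rather than exclude, error.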

    Robust stabilization by linear output delay feedback

    The main result establishes that if a controller C (comprising a linear feedback of the output and its derivatives) globally stabilizes a (nonlinear) plant P, then global stabilization of P can also be achieved by an output feedback controller C[h] in which the output derivatives in C are replaced by an Euler approximation with sufficiently small delay h > 0. This is proved within the conceptual framework of the nonlinear gap-metric approach to robust stability. The main result is then applied to finite-dimensional linear minimum-phase systems with unknown coefficients but known relative degree and known sign of the high-frequency gain. Results are also given for systems with non-zero initial conditions.
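
    As a rough numerical illustration of the Euler-approximation idea (our own sketch under simplifying assumptions, not an example from the paper), the following Python simulation stabilizes a double integrator with the derivative term of a PD controller replaced by the backward difference (y(t) - y(t-h))/h for a small delay h.

        # Double integrator y'' = u, nominally stabilized by u = -k1*y - k2*y'.
        # Replace y' with the Euler approximation (y(t) - y(t-h)) / h, i.e. the
        # delayed output feedback C[h] described in the abstract.
        k1, k2 = 1.0, 2.0
        h = 0.05                      # feedback delay used in the difference quotient
        dt = 0.001                    # simulation step (h is an integer multiple of dt)
        lag = round(h / dt)

        y, v = 1.0, 0.0               # initial output and velocity
        hist = [y] * (lag + 1)        # buffer of past outputs for the delayed term

        for _ in range(20000):        # simulate 20 seconds
            y_delayed = hist[-(lag + 1)]
            u = -k1 * y - k2 * (y - y_delayed) / h
            v += dt * u               # semi-implicit Euler integration of the plant
            y += dt * v
            hist.append(y)

        print(f"after 20 s: y = {y:.2e}, v = {v:.2e}")   # both decay towards zero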

    Cost-benefit analysis of Australian Federal Police counter-terrorism operations at Australian airports

    The terrorist attacks of 11 September 2001 highlighted the vulnerabilities of airports and aircraft. Further attacks in 2002, 2007 and 2009 have led to major government reforms in passenger processing and airport access. The security of Australian airports has also followed this trend, with an increased police presence. However, limited consideration has been given to the costs of these measures relative to their benefits. This Working Paper identifies the factors to be considered in such cost-benefit analyses, and the authors outline their preliminary findings. The scope for further research is highlighted, particularly in relation to risk analysis and costs.