4 research outputs found

    Failure Mitigation in Linear, Sesquilinear and Bijective Operations On Integer Data Streams Via Numerical Entanglement

    A new roll-forward technique is proposed that recovers from any single fail-stop failure in M integer data streams (M ≥ 3) when undergoing linear, sesquilinear or bijective (LSB) operations, such as scaling, additions/subtractions, inner or outer vector products and permutations. In the proposed approach, the M input integer data streams are linearly superimposed to form M numerically entangled integer data streams that are stored in-place of the original inputs. A series of LSB operations can then be performed directly using these entangled data streams. The output results can be extracted from any M-1 entangled output streams by additions and arithmetic shifts, thereby guaranteeing robustness to a fail-stop failure in any single stream computation. Importantly, unlike other methods, the number of operations required for the entanglement, extraction and recovery of the results is linearly related to the number of the inputs and does not depend on the complexity of the performed LSB operations. We have validated our proposal in an Intel processor (Haswell architecture with AVX2 support) via convolution operations. Our analysis and experiments reveal that the proposed approach incurs only 1.8% to 2.8% reduction in processing throughput in comparison to the failure-intolerant approach. This overhead is 9 to 14 times smaller than that of the equivalent checksum-based method. Thus, our proposal can be used in distributed systems and unreliable processor hardware, or safety-critical applications, where robustness against fail-stop failures becomes a necessity. Comment: Proc. 21st IEEE International On-Line Testing Symposium (IOLTS 2015), July 2015, Halkidiki, Greece
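
    For context, the checksum-based baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the general checksum idea for a linear operation (convolution), not the authors' entanglement scheme; the function names `convolve`, `add_checksum` and `recover` and the sample data are assumptions made for this sketch.

```python
def convolve(stream, kernel):
    """Full linear convolution of two integer sequences."""
    out = [0] * (len(stream) + len(kernel) - 1)
    for i, s in enumerate(stream):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def add_checksum(streams):
    """Append one extra stream holding the element-wise sum of the inputs."""
    return streams + [[sum(col) for col in zip(*streams)]]


def recover(outputs, failed):
    """Rebuild the output of the failed data stream from the survivors.

    Relies on linearity: conv(sum of inputs) == sum of conv(each input).
    The last entry of `outputs` is the processed checksum stream.
    """
    checksum = outputs[-1]
    survivors = [o for i, o in enumerate(outputs[:-1]) if i != failed]
    return [c - sum(col) for c, col in zip(checksum, zip(*survivors))]


# Hypothetical data: M = 3 input streams and one shared kernel.
streams = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [1, -1, 2]

# M + 1 convolutions are needed: one per input plus one for the checksum.
outputs = [convolve(s, kernel) for s in add_checksum(streams)]

# Simulate a fail-stop failure in the computation of stream 1.
lost = 1
outputs_with_failure = [None if i == lost else o for i, o in enumerate(outputs)]
recovered = recover(outputs_with_failure, lost)
```

    In this sketch the extra checksum stream must itself pass through the full operation, so the protection cost scales with the cost of the operation. That is the dependence the abstract says the entangled approach avoids.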

    Binomial American Option Pricing on CPU-GPU Heterogeneous System

    We present a novel parallel binomial algorithm to compute prices of American options. The algorithm partitions a binomial tree into blocks of multiple levels of nodes, and assigns each such block to multiple processors. Each processor, in parallel with the others, computes the option's values at the nodes assigned to it. The computation consists of two phases, where the second phase cannot start until the valuation in the first phase has been completed. The algorithm is implemented and tested on a heterogeneous system consisting of an Intel multicore processor and an NVIDIA GPU. The whole task is divided between the CPU and the GPU so that the computations are performed on the two processors simultaneously. In the hybrid processing, the GPU is always assigned the last part of a block, and makes use of a couple of buffers in the on-chip shared memory to reduce the number of accesses to the off-chip device memory. The performance of the hybrid processing is compared with an optimised CPU serial code, a CPU parallel implementation and a GPU standalone program. We learned from the experiments that the lack of an explicit mechanism in CUDA for synchronising CPU and GPU executions is a major obstacle for the hybrid processing to achieve high performance.
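
    As a point of reference, the serial computation that such a parallel scheme distributes can be sketched with a standard Cox-Ross-Rubinstein binomial tree for an American put. This is a textbook serial version, not the block-partitioned CPU-GPU algorithm of the paper; the function name, the parameter names and the choice of a put option are assumptions made for illustration.

```python
import math


def american_put_binomial(spot, strike, rate, sigma, maturity, steps):
    """Price an American put on a Cox-Ross-Rubinstein binomial tree."""
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))   # up-move factor
    down = 1.0 / up                        # down-move factor
    disc = math.exp(-rate * dt)            # one-step discount factor
    p = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability

    # Option payoffs at the leaves (node j has j up-moves).
    values = [
        max(strike - spot * up**j * down**(steps - j), 0.0)
        for j in range(steps + 1)
    ]

    # Backward induction: each level depends only on the level after it.
    for level in range(steps - 1, -1, -1):
        for j in range(level + 1):
            continuation = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            exercise = strike - spot * up**j * down**(level - j)
            values[j] = max(continuation, exercise)

    return values[0]


price = american_put_binomial(
    spot=100.0, strike=100.0, rate=0.05, sigma=0.2, maturity=1.0, steps=200
)
```

    Nodes within one level are independent of each other, while each level needs the results of the level after it. That dependency is what any partitioning of the tree across processors has to respect.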

    Generalized Numerical Entanglement For Reliable Linear, Sesquilinear And Bijective Operations On Integer Data Streams

    We propose a new technique for the mitigation of fail-stop failures and/or silent data corruptions (SDCs) within linear, sesquilinear or bijective (LSB) operations on M integer data streams (M ⩾ 3). In the proposed approach, the M input streams are linearly superimposed to form M numerically entangled integer data streams that are stored in-place of the original inputs, i.e., no additional (a.k.a. “checksum”) streams are used. An arbitrary number of LSB operations can then be performed in M processing cores using these entangled data streams. The output results can be extracted from any (M-K) entangled output streams by additions and arithmetic shifts, thereby mitigating K fail-stop failures (K ≤ ⌊(M-1)/2⌋), or detecting up to K SDCs per M-tuple of outputs at corresponding in-stream locations. Therefore, unlike other methods, the number of operations required for the entanglement, extraction and recovery of the results is linearly related to the number of the inputs and does not depend on the complexity of the performed LSB operations. Our proposal is validated within an Amazon EC2 instance (Haswell architecture with AVX2 support) via integer matrix product operations. Our analysis and experiments for fail-stop failure mitigation and SDC detection reveal that the proposed approach incurs 0.75% to 37.23% reduction in processing throughput in comparison to the equivalent error-intolerant processing. This overhead is found to be up to two orders of magnitude smaller than that of the equivalent checksum-based method, with increased gains offered as the complexity of the performed LSB operations increases. Therefore, our proposal can be used in distributed systems, unreliable multicore clusters and safety-critical applications, where robustness against failures and SDCs is a necessity.