23 research outputs found

    Engineering holistic fault tolerance

    Get PDF
    PhD ThesisFault-tolerant software should be engineered to be maintainable as well as efficient with regards to performance and resources. These characteristics should be evaluated before deployment of the software. However, the main focus is very often made on the functional features of the application, whereas fault tolerance mechanisms are neglected. As a result, they are often neither maintainable nor efficient. The concept of Holistic Fault Tolerance was introduced to deal with these issues. It is a novel crosscutting approach to the design and implementation of fault tolerance mechanisms for developing reliable software applications that meet non-functional requirements, such as performance and resource utilisation. The thesis starts with the description of problems that were motivating for the idea of Holistic Fault Tolerance. These problems are related to resource utilisation requirements of modern computer-based systems, since more resources like hardware components and energy are required to process modern computational tasks and ensure performance and reliability of the computation. Moreover, the complexity of these systems grows, leading to maintainability deterioration, especially of those system parts, which are responsible for satisfying non-functional requirements, such as reliability, performance and resource usage. After analysis of the problems and motivations, the engineering approach to Holistic Fault Tolerance is introduced and main engineering steps are defined. Next, an architectural pattern for Holistic Fault Tolerance is presented. The method to refine the proposed architecture and ensure efficiency of a particular system under development is demonstrated during the modelling step. Then the implementation of Holistic Fault Tolerance based on the proposed architecture and modelling is described in detail. Finally, the Holistic Fault Tolerance architecture is evaluated with regards to efficiency and maintainability. The evaluation demonstrates that Holistic Fault Tolerance assists in meeting the non-functional requirements, makes fault tolerance mechanisms easier to maintain and ensures higher modularity of the source cod

    Energy balance between voltage-frequency scaling and resilience for linear algebra routines on low-power multicore architectures

    Get PDF
    [EN] Near Threshold Voltage (NTV) computing has been recently proposed as a technique to save energy, at the cost of incurring higher error rates including, among others, Silent Data Corruption (SDC). In this paper, we evaluate the energy efficiency of dense linear algebra routines using several low-power multicore processors and we analyze whether the potential energy reduction achieved when scaling the processor to operate at a low voltage compensates the cost of integrating a fault tolerance mechanism that tackles SDC. Our study targets algorithmic-based fault-tolerant versions of the dense matrix-vector and matrix(matrix) multiplication kernels (GEMV and GEMM, respectively), using the BLIS framework, as well as an implementation of the LU factorization with partial pivoting built on top of GEMM, Furthermore, we tailor the study for a number of representative 32-bit and 64-bit multicore processors from ARM that were specifically designed for energy efficiency. (C) 2017 Elsevier B.V. All rights reserved.The researchers from Universidad Jaume I were supported by project CICYT TIN2014-53495-R of MINECO and FEDER, and the FPU program of MECD. The researcher from Universitat Politecnica de Catalunya was supported by projects TIN2015-65316-P from the Spanish Ministry of Education and 2014 SGR 1051 from the Generalitat de Catalunya, Dep. d'Innovacio, Universitats i Empresa.Catalán, S.; Herrero, JR.; Quintana Ortí, ES.; Rodríguez-Sánchez, R. (2018). Energy balance between voltage-frequency scaling and resilience for linear algebra routines on low-power multicore architectures. Parallel Computing. 73:28-39. https://doi.org/10.1016/j.parco.2017.05.004S28397

    Engineering Cross-Layer Fault Tolerance in Many-Core Systems.

    No full text

    Experience Report: Evaluation of Holistic Fault Tolerance

    No full text

    On Structuring Holistic Fault Tolerance

    No full text

    Modelling for systems with holistic fault tolerance

    Get PDF

    Architecting Holistic Fault Tolerance

    No full text
    corecore