Search CORE

2 research outputs found

Portable Application-level Checkpointing for Hybrid MPI-OpenMP Applications

Author: González Patricia
Losada Nuria
Martín María J.
Rodríguez Gabriel
Publication venue: The Author(s). Published by Elsevier B.V.
Publication date: 31/12/2016
Field of study

AbstractAs parallel machines increase their number of processors, so does the failure rate of the global system, thus, long-running applications will need to make use of fault tolerance techniques to ensure the successful execution completion. Most of current HPC systems are built as clusters of multicores. The hybrid MPI-OpenMP paradigm provides numerous benefits on these systems. This paper presents a checkpointing solution for hybrid MPI-OpenMP applications, in which checkpoint consistency is guaranteed by using a coordination protocol intra-node, while no inter-node coordination is needed. The proposal reduces network utilization and storage resources in order to optimize the I/O cost of fault tolerance, while minimizing the checkpointing overhead. Besides, the portability of the solution and the dynamic parallelism provided by OpenMP enable the restart of the applications using machines with different architectures, operating systems and/or number of cores, adapting the number of running OpenMP threads for the best exploitation of the available resources. Extensive evaluation using hybrid MPI-OpenMP applications from the ASC Sequoia Benchmark Codes and NERSC-8/Trinity benchmarks is presented, showing the effectiveness and efficiency of the approach

Elsevier - Publisher Connector

Static analysis for facilitating secure and reliable software

Author: Siavvas Miltiadis
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/06/2019
Field of study

Software security and reliability are aspects of major concern for software development enterprises that wish to deliver dependable software to their customers. Several static analysis-based approaches for facilitating the development of secure and reliable software have been proposed over the years. The purpose of the present thesis is to investigate these approaches and to extend their state of the art by addressing existing open issues that have not been sufficiently addressed yet. To this end, an empirical study was initially conducted with the purpose to investigate the ability of software metrics (e.g., complexity metrics) to discriminate between different types of vulnerabilities, and to examine whether potential interdependencies exist between different vulnerability types. The results of the analysis revealed that software metrics can be used only as weak indicators of specific security issues, while important interdependencies may exist between different types of vulnerabilities. The study also verified the capacity of software metrics (including previously uninvestigated metrics) to indicate the existence of vulnerabilities in general. Subsequently, a hierarchical security assessment model able to quantify the internal security level of software products, based on static analysis alerts and software metrics is proposed. The model is practical, since it is fully-automated and operationalized in the form of individual tools, while it is also sufficiently reliable since it was built based on data and well-accepted sources of information. An extensive evaluation of the model on a large volume of empirical data revealed that it is able to reliably assess software security both at product- and at class-level of granularity, with sufficient discretion power, while it may be also used for vulnerability prediction. The experimental results also provide further support regarding the ability of static analysis alerts and software metrics to indicate the existence of software vulnerabilities. Finally, a mathematical model for calculating the optimum checkpoint interval, i.e., the checkpoint interval that minimizes the execution time of software programs that adopt the application-level checkpoint and restart (ALCR) mechanism was proposed. The optimum checkpoint interval was found to depend on the failure rate of the application, the execution cost for establishing a checkpoint, and the execution cost for restarting a program after failure. Emphasis was given on programs with loops, while the results were illustrated through several numerical examples.Open Acces

Spiral - Imperial College Digital Repository