36 research outputs found

Evaluating the Performance Impact of Xen on MPI and Process Execution For HPC Systems

Publication venue: Office of Scientific and Technical Information (OSTI)

Department of Computer Science Activity 1998-2004

Author: Kotz David
Publication venue: Dartmouth Digital Commons
Publication date: 20/03/2005

This report summarizes much of the research and teaching activity of the Department of Computer Science at Dartmouth College between late 1998 and late 2004. The material for this report was collected as part of the final report for NSF Institutional Infrastructure award EIA-9802068, which funded equipment and technical staff during that six-year period. This equipment and staff supported essentially all of the department's research activity during that period.

Dartmouth Digital Commons (Dartmouth College)

ESG-CET Final Progress Title

Publication venue: Office of Scientific and Technical Information (OSTI)

Crossref

Convergence of Intelligent Data Acquisition and Advanced Computing Systems

Publication venue: MDPI AG
Publication date: 11/01/2022

This book is a collection of published articles from the Sensors Special Issue on "Convergence of Intelligent Data Acquisition and Advanced Computing Systems". It includes extended versions of the conference contributions from the 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS’2019), Metz, France, as well as external contributions.

Directory of Open Access Books (DOAB)

Fault-tolerance and malleability in parallel message-passing applications

Author: Cores González Iván
Publication date: 01/01/2015

This thesis explores fault-tolerance and malleability solutions, based on checkpoint and restart techniques, for parallel message-passing applications. In the fault-tolerance field, it contributes to reducing the most important overhead factor in checkpointing performance, namely the I/O cost of dumping the state files, by proposing different techniques to reduce the checkpoint file size. In addition, a process migration mechanism based on checkpointing is proposed, which allows processes to be proactively migrated from nodes that are about to fail, avoiding a complete restart of the execution and thus improving application resilience. Finally, the thesis includes a proposal to transparently transform MPI applications into malleable jobs, that is, parallel programs that are able to adapt their execution to the number of available processors at runtime, which provides important benefits for end users and the whole system, such as higher productivity, a better response time, or greater resilience to node failures. All the solutions proposed in this thesis have been implemented at the application level, and they are independent of the hardware architecture, the operating system, the MPI implementation used, and any higher-level frameworks, such as job submission frameworks.

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Applications of Power Electronics: Volume 2

Publication venue: MDPI AG
Publication date: 01/01/2019

VBN