
5 research outputs found

Fast and transparent recovery for continuous availability of cluster-based servers

Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2006

A Recoverable Distributed Shared Memory Integrating Coherence and Recoverability

Author: Alain Gefflaut
Anne-Marie Kermarrec
Christine Morin
Gilbert Cabillic
Isabelle Puaut
Publication date: 01/01/1995

... this paper, we address this problem and propose a checkpointing mechanism relying on a recoverable distributed shared memory (DSM) in order to tolerate single-node failures. Although most recoverable DSMs require specific hardware to store recovery data, our scheme uses standard memories to store both current and recovery data. Moreover, the management of recovery data is merged with the management of current data by extending the DSM's coherence protocol. This approach takes advantage of the data replication provided by a DSM in order to limit the number of pages transferred during checkpointing. The paper also presents an implementation and a preliminary performance evaluation of our recoverable DSM on a 56-node Intel Paragon

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1

A recoverable distributed shared memory integrating coherence and recoverability

Author: Cabillic Gilbert
Gefflaut Alain
Kermarrec Anne-Marie
Morin Christine
Puaut Isabelle
Affiliation: Centre National de la Recherche Scientifique (CNRS) 35 - Rennes (France). Inst. de Recherche en Informatique et Systemes Aleatoires (IRISA)
Institut National de Recherche en Informatique et en Automatique (INRIA) 35 - Rennes (France). Inst. de Recherche en Informatique et Systemes Aleatoires (IRISA)
Institut National des Sciences Appliquees de Rennes (INSA) 35 (France). Inst. de Recherche en Informatique et Systemes Aleatoires (IRISA)
Rennes-1 Univ. 35 (France). Inst. de Recherche en Informatique et Systemes Aleatoires (IRISA)
Publication date: 01/01/1995

Programme 1 - Parallel architectures, databases, networks and distributed systems. Projet SOLIDOR. SIGLE. Available at INIST (FR), Document Supply Service, under shelf-number: 22588, issue: a.1995 n.897 / INIST-CNRS - Institut de l'Information Scientifique et Technique. FR, France

OpenGrey Repository

A Recoverable Distributed Shared Memory Integrating Coherence and Recoverability

Author: Alain Gefflaut
Anne-Marie Kermarrec
Christine Morin
Gilbert Cabillic
Isabelle Puaut
Projet Solidor

Large-scale distributed systems are very attractive for the execution of parallel applications requiring huge computing power. However, their high probability of site failure is unacceptable, especially for long-running applications. In this paper, we address this problem and propose a checkpointing mechanism relying on a recoverable distributed shared memory (DSM). Although most recoverable DSMs require specific hardware to store recovery data, our scheme uses standard memories to store both current and recovery data. Moreover, the management of recovery data is merged with the management of current data by extending the DSM's coherence protocol. This approach limits hardware development and takes advantage of the data replication provided by a DSM in order to limit the number of pages transferred during checkpointing. The paper also presents an implementation and a preliminary performance evaluation of our recoverable DSM on an Intel Paragon with 56 nodes. In particula..

CiteSeerX

A Recoverable Distributed Shared Memory Integrating Coherence and Recoverability

Author: Anne-Marie Kermarrec
Christine Morin
Isabelle Puaut
Projet Solidor

CiteSeerX