Search CORE

10 research outputs found

PROMON: a profile monitor of software applications

Author: Benso Alfredo
Di Carlo Stefano
Di Natale Giorgio
Prinetto Paolo Ernesto
Tagliaferri Luca
Tibaldi Clara
Publication venue: IEEE Computer Society
Publication date: 01/01/2005
Field of study

Software techniques can be efficiently used to increase the dependability of safety-critical applications. Many approaches are based on information redundancy to prevent data and code corruption during the software execution. This paper presents PROMON, a C++ library that exploits a new methodology based on the concept of "Programming by Contract" to detect system malfunctions. Resorting to assertions, pre- and post-conditions, and marginal programmer interventions, PROMON-based applications can reach high level of dependabilit

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Validation of a software dependability tool via fault injection experiments

Author: Benso Alfredo
Di Carlo Stefano
Di Natale Giorgio
Prinetto Paolo Ernesto
Tagliaferri Luca
Publication venue: IEEE Computer Society
Publication date: 01/01/2001
Field of study

Presents the validation of the strategies employed in the RECCO tool to analyze a C/C++ software; the RECCO compiler scans C/C++ source code to extract information about the significance of the variables that populate the program and the code structure itself. Experimental results gathered on an Open Source Router are used to compare and correlate two sets of critical variables, one obtained by fault injection experiments, and the other applying the RECCO tool, respectively. Then the two sets are analyzed, compared, and correlated to prove the effectiveness of RECCO's methodology

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Software dependability techniques validated via fault injection experiments

Author: Benso Alfredo
Di Carlo Stefano
Di Natale Giorgio
Prinetto Paolo Ernesto
Tagliaferri Luca
Publication venue: IEEE
Publication date: 01/01/2001
Field of study

The present paper proposes a C/C++ source-to-source compiler able to increase the dependability properties of a given application. The adopted strategy is based on two main techniques: variable duplication/triplication and control flow checking. The validation of these techniques is based on the emulation of fault appearance by software fault injection. The chosen test case is a client-server application in charge of calculating and drawing a Mandelbrot fracta

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Automated Synthesis of SEU Tolerant Architectures from OO Descriptions

Author: Chiusano Silvia Anna
Di Carlo Stefano
Prinetto Paolo Ernesto
Publication venue: IEEE Computer Society
Publication date: 01/01/2002
Field of study

SEU faults are a well-known problem in aerospace environment but recently their relevance grew up also at ground level in commodity applications coupled, in this frame, with strong economic constraints in terms of costs reduction. On the other hand, latest hardware description languages and synthesis tools allow reducing the boundary between software and hardware domains making the high-level descriptions of hardware components very similar to software programs. Moving from these considerations, the present paper analyses the possibility of reusing Software Implemented Hardware Fault Tolerance (SIHFT) techniques, typically exploited in micro-processor based systems, to design SEU tolerant architectures. The main characteristics of SIHFT techniques have been examined as well as how they have to be modified to be compatible with the synthesis flow. A complete environment is provided to automate the design instrumentation using the proposed techniques, and to perform fault injection experiments both at behavioural and gate level. Preliminary results presented in this paper show the effectiveness of the approach in terms of reliability improvement and reduced design effort

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

PROMON: a profile monitor of software applications

Author: Benso A.
Di Carlo S.
Di Natale G.
Prinetto P.
Tagliaferri L.
Tibaldi C.
Publication venue: IEEE Computer Society
Publication date
Field of study

Software techniques can be efficiently used to increase the dependability of safety-critical applications. Many approaches are based on information redundancy to prevent data and code corruption during the software execution. This paper presents PROMON, a C++ library that exploits a new methodology based on the concept of “Programming by Contract” to detect system malfunctions. Resorting to assertions, pre- and post-conditions, and marginal programmer interventions, PROMON-based applications can reach high level of dependability

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

CPPC: a compiler‐assisted tool for portable checkpointing of message‐passing applications

Author: Doallo Ramón
González Patricia
Martín María J.
Rodríguez Gabriel
Touriño Juan
Publication venue: 'Wiley'
Publication date: 01/01/2010
Field of study

This is the peer reviewed version of the following article: Rodríguez, G. , Martín, M. J., González, P. , Touriño, J. and Doallo, R. (2010), CPPC: a compiler‐assisted tool for portable checkpointing of message‐passing applications. Concurrency Computat.: Pract. Exper., 22: 749-766. doi:10.1002/cpe.1541, which has been published in final form at https://doi.org/10.1002/cpe.1541. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.[Abstract] With the evolution of high‐performance computing toward heterogeneous, massively parallel systems, parallel applications have developed new checkpoint and restart necessities. Whether due to a failure in the execution or to a migration of the application processes to different machines, checkpointing tools must be able to operate in heterogeneous environments. However, some of the data manipulated by a parallel application are not truly portable. Examples of these include opaque state (e.g. data structures for communications support) or diversity of interfaces for a single feature (e.g. communications, I/O). Directly manipulating the underlying ad hoc representations renders checkpointing tools unable to work on different environments. Portable checkpointers usually work around portability issues at the cost of transparency: the user must provide information such as what data need to be stored, where to store them, or where to checkpoint. CPPC (ComPiler for Portable Checkpointing) is a checkpointing tool designed to feature both portability and transparency. It is made up of a library and a compiler. The CPPC library contains routines for variable level checkpointing, using portable code and protocols. The CPPC compiler helps to achieve transparency by relieving the user from time‐consuming tasks, such as data flow and communications analyses and adding instrumentation code. This paper covers both the operation of the CPPC library and its compiler support. Experimental results using benchmarks and large‐scale real applications are included, demonstrating usability, efficiency, and portability.Miniesterio de Educación y Ciencia; TIN2007‐67537‐C03Xunta de Galicia; 2006/

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Flexible Rollback Recovery in Dynamic Heterogeneous Grid Computing

Author: Axel Krings
Samir Jafar
Senior Member
Thierry Gautier
Publication venue
Publication date
Field of study

Abstract—Large applications executing on Grid or cluster architectures consisting of hundreds or thousands of computational nodes create problems with respect to reliability. The source of the problems are node failures and the need for dynamic configuration over extensive runtime. This paper presents two fault-tolerance mechanisms called Theft-Induced Checkpointing and Systematic Event Logging. These are transparent protocols capable of overcoming problems associated with both benign faults, i.e., crash faults, and node or subnet volatility. Specifically, the protocols base the state of the execution on a dataflow graph, allowing for efficient recovery in dynamic heterogeneous systems as well as multithreaded applications. By allowing recovery even under different numbers of processors, the approaches are especially suitable for applications with a need for adaptive or reactionary configuration control. The low-cost protocols offer the capability of controlling or bounding the overhead. A formal cost model is presented, followed by an experimental evaluation. It is shown that the overhead of the protocol is very small, and the maximum work lost by a crashed process is small and bounded. Index Terms—Grid computing, rollback recovery, checkpointing, event logging. Ç

CiteSeerX

Recommended from our members

Transiently Powered Computers

Author: Ransford Benjamin
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2013
Field of study

Demand for compact, easily deployable, energy-efficient computers has driven the development of general-purpose transiently powered computers (TPCs) that lack both batteries and wired power, operating exclusively on energy harvested from their surroundings. TPCs\u27 dependence solely on transient, harvested power offers several important design-time benefits. For example, omitting batteries saves board space and weight while obviating the need to make devices physically accessible for maintenance. However, transient power may provide an unpredictable supply of energy that makes operation difficult. A predictable energy supply is a key abstraction underlying most electronic designs. TPCs discard this abstraction in favor of opportunistic computation that takes advantage of available resources. A crucial question is how should a software-controlled computing device operate if it depends completely on external entities for power and other resources? The question poses challenges for computation, communication, storage, and other aspects of TPC design. The main idea of this work is that software techniques can make energy harvesting a practicable form of power supply for electronic devices. Its overarching goal is to facilitate the design and operation of usable TPCs. This thesis poses a set of challenges that are fundamental to TPCs, then pairs these challenges with approaches that use software techniques to address them. To address the challenge of computing steadily on harvested power, it describes Mementos, an energy-aware state-checkpointing system for TPCs. To address the dependence of opportunistic RF-harvesting TPCs on potentially untrustworthy RFID readers, it describes CCCP, a protocol and system for safely outsourcing data storage to RFID readers that may attempt to tamper with data. Additionally, it describes a simulator that facilitates experimentation with the TPC model, and a prototype computational RFID that implements the TPC model. To show that TPCs can improve existing electronic devices, this thesis describes applications of TPCs to implantable medical devices (IMDs), a challenging design space in which some battery-constrained devices completely lack protection against radio-based attacks. TPCs can provide security and privacy benefits to IMDs by, for instance, cryptographically authenticating other devices that want to communicate with the IMD before allowing the IMD to use any of its battery power. This thesis describes a simplified IMD that lacks its own radio, saving precious battery energy and therefore size. The simplified IMD instead depends on an RFID-scale TPC for all of its communication functions. TPCs are a natural area of exploration for future electronic design, given the parallel trends of energy harvesting and miniaturization. This work aims to establish and evaluate basic principles by which TPCs can operate

ScholarWorks@UMass Amherst

Técnicas de ponto de controlo e adaptação em grelhas computacionais

Author: Medeiros Bruno Silvestre
Publication venue
Publication date: 26/09/2011
Field of study

Dissertação de mestrado em Engenharia de InformáticaA recente popularidade dos ambientes de grelhas introduziu a necessidade de suportar a execução robusta de aplicações numa gama alargada de recursos computacionais. Em contextos de grelhas computacionais, onde a fiabilidade e disponibilidade dos recursos não é garantida, as aplicações deverão ser capazes de suportar dois requisitos fundamentais: 1) tolerância a faltas; 2) adaptação aos recursos disponíveis. As técnicas tradicionais utilizam uma abordagem "caixa-negra", onde a camada intermédia de software (mediador) é a única responsável por assegurar estes dois requisitos. Estes tipos de abordagens possibilitam o suporte a estes serviços com uma intervenção mínima do programador, mas limitam a utilização de conhecimento sobre as características da aplicação, visando a otimização destes serviços. Nesta tese são apresentadas abordagens orientadas aos aspetos para suportar tolerância a faltas e adaptação dinâmica aos recursos em grelhas computacionais. Nas abordagens propostas, as aplicações são aprimoradas com capacidades de tolerância a faltas e de adaptação dinâmica através da ativação de módulos adicionais. A abordagem de tolerância a faltas utiliza a estratégia de ponto de controlo e restauro, enquanto a adaptação dinâmica utiliza uma variação da técnica de sobre-decomposição. Ambas são portáveis entre sistemas operativos e restringem a quantidade de alterações necessárias no código base das aplicações. Além disso, as aplicações poderão adaptar de uma execução sequencial para uma configuração multi-cluster. A adaptação pode ser realizada efetuando o ponto de controlo da aplicação e restaurando-a em diferentes máquinas, ou então, realizada em plena execução da aplicação.Grids’ recent popularity introduced the necessity of supporting robust execution of applications on a wide range of computing resources. In computational grids’ context, where reliability and availability are not granted, applications must support two fundamental requirements, namely, fault tolerance and adaptation to available resources. Traditional techniques use a "black-box"approach, where middleware is the only sponsor for those requirements. These kind of approaches enable this services’ support with a minimum programmer’s intervention, but limits knowledge utilization of application’s features in order to optimize services. This thesis presents aspect-oriented approaches to support fault tolerance and dynamic adaptation to resources in computational grids. In the proposed approaches, applications are enhanced with the ability of fault tolerance and dynamic adaptation through additional modules activation. Fault tolerance approach uses a check point and restore strategy while dynamic adaptation uses a variation of the over-decomposition technique. Both are portable between operating systems and minimize alterations to base code of applications. Moreover, applications can adapt from a sequential execution to a multi-cluster configuration. Adaption can be performed by checkpointing the application and restarting on a different mode or can be performed during run-time

Universidade do Minho: RepositoriUM

Compiler assisted chekpointing of message-passing applications in heterogeneous environments

Author: Rodríguez Gabriel
Publication venue
Publication date: 01/01/2008
Field of study

[Resumen] With the evolution of high performance computing towards heterogeneous, massively parallel systems, parallel applications have developed new checkpoint and restart necessities, Whether due to a failure in the execution or to a migration of the processes to different machines, checkpointing tools must be able to operate in heterogeneous environments. However, some of the data manipulated by a parallel application are not truly portable. Examples of these include opaque state (e.g. data structures for communications support) or diversity of interfaces for a single feature (e.g. communications, I/O). Directly manipulating the underlying ad-hoc representations renders checkpointing tools incapable of working on different environments. Portable checkpointers usually work around portability issues at the cost of transparency: the user must provide information such as what data needs to be stored, where to store it, or where to checkpoint. CPPC (ComPiler for Portable Checkpointing) is a checkpointing tool designed to feature both portability and transparency, while preserving the scalability of the executed applications. It is made up of a library and a compiler. The CPPC library contains routines for variable level checkpointing, using portable code and protocols. The CPPC compiler achieves transparency by relieving the user from time-consuming tasks, such as performing code analyses and adding instrumentation code

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas