386 research outputs found

Solving multiprocessor drawbacks with kilo-instruction processors

Author: Beivide Palacio Ramon
Cristal Kestelman Adrián
Galluzzi Marco
Smith James E.
Stenström Per
Valero Cortés Mateo
Vallejo Enrique
Vallejo Fernando
Publication date: 01/01/2005

Nowadays, a good multiprocessor system design has to deal with many drawbacks in order to achieve a good tradeoff between complexity and performance. For example, while solving problems like coherence and consistency is essential for correctness, solving processor stalls due to critical sections and synchronization points is desirable for performance. None of these drawbacks has a straightforward solution. We show in our paper how the multi-checkpointing mechanism of Kilo-Instruction Processors can be leveraged to achieve a complexity-effective multiprocessor design. Specifically, we describe a Kilo-Instruction Multiprocessor that transparently, i.e. without any software support, uses transaction-based memory updates. Our model simplifies the coherence and consistency hardware and makes it easy to apply different speculative mechanisms that enhance performance on some synchronization constructs of current parallel applications. Postprint (published version)

Energy reduction in multiprocessor systems using transactional memory

Author: M. Herlihy
Tali Moreshet
R.I. Bahar
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

Radiation Testing of a Multiprocessor Macrosynchronized Lockstep Architecture With FreeRTOS

Author: Avilés Pablo M.
Belloch Rodríguez José Antonio
Entrena Arrontes Luis Alfonso
García Valderas Mario
Lindoso Muñoz Almudena
Morilla Yolanda
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/11/2021

Nowadays, high-performance microprocessors are demanded in many fields, including those with high-reliability requirements. Commercial microprocessors present a good tradeoff between cost, size, and performance, although they must be adapted to satisfy the reliability requirements when they are used in harsh environments. This work presents a high-end multiprocessor hardened with macrosynchronized lockstep and additional protections. A commercial dual-core Advanced RISC Machine (ARM) Cortex-A9 has been used as a case study and a complete hardened system has been developed. The proposed hardened system has been evaluated with exhaustive fault injection campaigns and proton irradiation. The hardening approach has been applied to both bare-metal and operating system (OS)-based applications. The hardened system has demonstrated high reliability in all performed experiments, with error coverage up to 99.3% in the irradiation experiments. Experimental irradiation results demonstrate a cross-section reduction of two orders of magnitude. This work was supported in part by the Spanish Ministry of Science and Innovation under Project PID2019-106455GB-C21 and in part by the Community of Madrid under Project 49.520608.9.18.

Universidad Carlos III de Madrid e-Archivo

Distributed Recovery in Applicative Systems

Author: Keller Robert M.
Lin Frank C. H.
Publication venue: Scholarship @ Claremont
Publication date: 01/08/1986

Applicative systems are promising candidates for achieving high performance computing through aggregation of processors. This paper studies the fault recovery problems in a class of applicative systems. The concept of functional checkpointing is proposed as the nucleus of a distributed recovery mechanism. This entails incrementally building a resilient structure as the evaluation of an applicative program proceeds. A simple rollback algorithm is suggested to regenerate the corrupted structure by redoing the most effective functional checkpoints. Another algorithm, which attempts to recover intermediate results, is also presented. The parent of a faulty task reproduces a functional twin of the failed task. The regenerated task inherits all offspring of the faulty task so that partial results can be salvaged.
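
The twin-regeneration idea in this abstract can be illustrated with a small sequential sketch. It is not the authors' algorithm: the expression format, the `crashes` set of one-shot failures, the `results` table used to salvage finished offspring, and all function names are assumptions made for illustration. A task is identified by its path in the expression tree, and the parent's retained `(arg, child)` pair plays the role of a functional checkpoint.

```python
class TaskFailure(Exception):
    """Raised when a (simulated) processor loses the task it was running."""


# Hypothetical operator table for the toy applicative language.
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def evaluate(expr, path, crashes, results, stats):
    """Evaluate expr; the task at `path` acts as parent of its argument tasks."""
    if not isinstance(expr, tuple):
        return expr  # a literal evaluates to itself
    op, *args = expr
    values = []
    for k, arg in enumerate(args):
        child = path + (k,)
        # Functional checkpoint (assumed form): the parent still holds
        # (arg, child), which is enough to recreate the child task.
        try:
            values.append(run_task(arg, child, crashes, results, stats))
        except TaskFailure:
            stats["twins"] += 1  # regenerate a functional twin of the child
            values.append(run_task(arg, child, crashes, results, stats))
    stats["applied"] += 1  # one operator application performed
    return OPS[op](*values)


def run_task(expr, path, crashes, results, stats):
    """Run one task; reuse its result if an earlier incarnation finished."""
    if path in results:
        return results[path]  # salvaged partial result
    value = evaluate(expr, path, crashes, results, stats)
    if path in crashes:  # simulated fault just before the result is reported
        crashes.discard(path)
        raise TaskFailure(path)
    results[path] = value
    return value
```

In this sketch a twin re-evaluates only the failed task's own operator application, because the results of its already finished offspring are found in `results`. A real distributed implementation would also have to handle a failing parent and concurrent offspring, which the sketch ignores.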

Checkpoint-based forward recovery using lookahead execution and rollback validation in parallel and distributed systems

Author: Long Junsheng

This thesis studies a forward recovery strategy using checkpointing and optimistic execution in parallel and distributed systems. The approach uses replicated tasks executing on different processors for forward recovery and checkpoint comparison for error detection. To reduce overall redundancy, this approach employs a lower static redundancy to detect errors in the common error-free situation than the standard N-Modular Redundancy (NMR) scheme uses to mask errors. For the rare occurrence of an error, this approach uses some extra redundancy for recovery. To reduce the run-time recovery overhead, look-ahead processes are used to advance computation speculatively, and a rollback process is used to diagnose which look-ahead processes are correct without rolling back the whole system. Both analytical and experimental evaluations have shown that this strategy can provide a nearly error-free execution time even under faults, with a lower average redundancy than NMR.
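
The interplay of checkpoint comparison, look-ahead execution and rollback validation described in this abstract can be sketched as follows. This is a sequential toy model under stated assumptions, not the thesis's scheme: `run_duplex`, the replica labels, and the `inject` fault injector are hypothetical, the look-ahead and validation runs would execute concurrently in a real system, and states are assumed to be comparable values other than `None`.

```python
def run_duplex(step, state, n_intervals, inject):
    """Run n_intervals checkpoint intervals on two replicas with forward recovery.

    step(state) -> next state (deterministic).
    inject(interval, replica, value) -> value, possibly corrupted; this is a
    hypothetical fault injector used only to exercise the sketch.
    Returns (final_state, log of recovery events).
    """
    def execute(s, interval, replica):
        return inject(interval, replica, step(s))

    log, i, ahead = [], 0, None
    while i < n_intervals:
        # Reuse a look-ahead result as replica A's output when one exists.
        a = ahead if ahead is not None else execute(state, i, "A")
        ahead = None
        b = execute(state, i, "B")
        if a == b:  # checkpoints agree: commit and move on
            state, i = a, i + 1
            continue
        # Checkpoints disagree: advance speculatively from both candidates
        # while a validation run repeats the interval from the last good state.
        last = i + 1 >= n_intervals
        look_a = None if last else execute(a, i + 1, "LA")
        look_b = None if last else execute(b, i + 1, "LB")
        v = execute(state, i, "V")
        if v == a:
            state, ahead = a, look_a  # keep the look-ahead built on a
        elif v == b:
            state, ahead = b, look_b  # keep the look-ahead built on b
        else:
            log.append(("rollback", i))  # no majority: redo the interval
            continue
        log.append(("forward", i))
        i += 1
    return state, log
```

When the validation run agrees with one candidate, the work already done by the matching look-ahead is kept and serves as one of the two replica outputs for the next interval, so a single fault costs no rollback. Only when all three results differ does the sketch fall back to repeating the interval.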

Parallel algorithms for simulating continuous time Markov chains

Author: Heidelberger Philip
Nicol David M.

We have previously shown that the mathematical technique of uniformization can serve as the basis of synchronization for the parallel simulation of continuous-time Markov chains. This paper reviews the basic method and compares five different methods based on uniformization, evaluating their strengths and weaknesses as a function of problem characteristics. The methods vary in their use of optimism, logical aggregation, communication management, and adaptivity. Performance evaluation is conducted on the Intel Touchstone Delta multiprocessor, using up to 256 processors.
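
For readers unfamiliar with uniformization, the following is a minimal sequential sketch of the basic technique, not any of the five parallel methods compared in the paper. The function name and the list-of-lists generator matrix `Q` are assumptions for illustration. Event times come from a single Poisson process with rate `lam` at least as large as every state's exit rate, and the state at each event is updated with the discrete-time kernel P = I + Q / lam, whose diagonal entries act as pseudo-jumps that leave the state unchanged.

```python
import random


def simulate_ctmc_uniformized(Q, start, t_end, rng):
    """Simulate one CTMC sample path on [0, t_end] by uniformization.

    Q is a generator matrix (off-diagonals >= 0, rows sum to 0).
    Returns a list of (time, state) pairs, one per actual state change.
    """
    n = len(Q)
    # Uniformization rate: the largest exit rate over all states.
    lam = max(-Q[i][i] for i in range(n))
    if lam <= 0:
        return [(0.0, start)]  # every state is absorbing
    # Discrete-time kernel P = I + Q / lam.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    t, state, path = 0.0, start, [(0.0, start)]
    while True:
        t += rng.expovariate(lam)  # next event of the rate-lam Poisson process
        if t > t_end:
            return path
        nxt = rng.choices(range(n), weights=P[state])[0]
        if nxt != state:  # skip pseudo-jumps
            state = nxt
            path.append((t, state))
```

Because the event times do not depend on the current state, they could in principle be generated in advance and shared, which is one plausible reading of why uniformization is useful as a synchronization basis for parallel simulation.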