A runtime heuristic to selectively replicate tasks for application-specific reliability targets
In this paper we propose a runtime-based selective task replication technique for task-parallel high performance computing applications. Our selective task replication technique is automatic and does not require modification or recompilation of the OS, compiler or application code. Our heuristic, which we call App_FIT, selects tasks to replicate such that the specified reliability target for an application is achieved. In our experimental evaluation, we show that the App_FIT selective replication heuristic is low-overhead and highly scalable. In addition, results indicate that complete task replication is overkill for achieving reliability targets. We show that with App_FIT, we can tolerate pessimistic exascale error rates with only 53% of the tasks being replicated.
This work was supported by the FI-DGR 2013 scholarship and the European Community’s
Seventh Framework Programme [FP7/2007-2013] under the Mont-blanc 2
Project (www.montblanc-project.eu), grant agreement no. 610402, and in part by the
European Union (FEDER funds) under contract TIN2015-65316-P. Peer Reviewed. Postprint (author's final draft)
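The idea of replicating only as many tasks as a reliability target requires can be illustrated with a minimal sketch. This is not the App_FIT heuristic itself, whose details the abstract does not give. It assumes that each task carries an estimated failure rate, that an unreplicated task adds that rate to a running total, and that a replicated task adds nothing. The names `select_tasks_to_replicate`, `task_fits` and `fit_budget` are hypothetical.

```python
def select_tasks_to_replicate(task_fits, fit_budget):
    """Pick tasks to replicate so the unprotected failure rate stays in budget.

    Illustrative assumptions (not taken from the paper):
      * task_fits[i] is an estimated failure rate for task i,
      * an unreplicated task adds its rate to the application total,
      * a replicated task adds nothing to the total.
    Tasks are visited in arrival order, as a runtime would see them.
    """
    replicated = []
    unprotected_fit = 0.0
    for index, fit in enumerate(task_fits):
        if unprotected_fit + fit <= fit_budget:
            # Still within the target: run this task without a replica.
            unprotected_fit += fit
        else:
            # Running it unprotected would exceed the target: replicate it.
            replicated.append(index)
    return replicated, unprotected_fit


if __name__ == "__main__":
    chosen, total = select_tasks_to_replicate([3.0, 1.0, 4.0, 2.0], fit_budget=5.0)
    print(chosen, total)
```

Under these assumptions, a looser budget leaves more tasks unreplicated, which is the trade-off the abstract describes when it reports that full replication is unnecessary.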
Speeding up a scalable modular inversion hardware architecture
Modular inversion is a fundamental operation in several cryptographic systems.
It can be computed in software or hardware, but hardware computation has proven to be
faster and more secure. This research focused on improving an older scalable inversion
hardware architecture proposed in 2004 for the finite field GF(p). The architecture
consists of two parts, a computing unit and a memory unit. The memory unit
holds all the data bits of the computation, whereas the computing unit performs all the
arithmetic operations on a word-by-word (digit-by-digit) basis, known as the scalable method.
The main objective of this project was to investigate the cost and benefit of
modifying the memory unit to include parallel shifting, which was one of the tasks of
the scalable computing unit. The study included remodeling the entire hardware
architecture, removing the shifter from the scalable computing part and embedding it in
the memory unit instead. This modification sped up the complete
inversion process at the cost of an area increase due to the new memory shifting unit.
The speed-area trade-off was measured quantitatively. The
results set the extra hardware required by this modification against
the speedup gained, giving the user the complete picture from which to choose according to
the application's needs.
the British Council in Saudi Arabia, KFUPM, Dr. Tatiana Kalganova at the Electrical &
Computer Engineering Department of Brunel University in Uxbridg
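To see why shifting matters for inversion in GF(p), consider a minimal software sketch of the well-known binary extended Euclidean inversion, which uses only shifts, additions and subtractions. This is an illustration and not the 2004 hardware architecture or its word-by-word datapath, which the abstract does not specify. The function name `binary_mod_inverse` is hypothetical, and the modulus is assumed to be odd.

```python
import math


def binary_mod_inverse(a, p):
    """Return the inverse of a modulo an odd p, using shifts, adds and subtracts.

    Invariants kept throughout: a * x1 == u (mod p) and a * x2 == v (mod p).
    """
    if p < 3 or p % 2 == 0:
        raise ValueError("p must be an odd modulus >= 3")
    a %= p
    if math.gcd(a, p) != 1:
        raise ValueError("a is not invertible modulo p")

    u, v = a, p
    x1, x2 = 1, 0
    while u != 1 and v != 1:
        # Halve u while it is even; keep x1 consistent by halving it modulo p.
        while u & 1 == 0:
            u >>= 1
            x1 = x1 >> 1 if x1 & 1 == 0 else (x1 + p) >> 1
        # Same treatment for v and x2.
        while v & 1 == 0:
            v >>= 1
            x2 = x2 >> 1 if x2 & 1 == 0 else (x2 + p) >> 1
        # Subtract the smaller from the larger; the result is even again.
        if u >= v:
            u -= v
            x1 -= x2
        else:
            v -= u
            x2 -= x1
    return x1 % p if u == 1 else x2 % p


if __name__ == "__main__":
    print(binary_mod_inverse(3, 7))
```

Every pass through the inner loops is a right shift of a full-width operand, which suggests why the placement of the shifter (in the computing unit or in the memory unit) affects overall inversion time.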