Using machine learning techniques to evaluate multicore soft error reliability
Virtual platform frameworks have been extended
to allow earlier soft error analysis of more realistic multicore
systems (i.e., real software stacks, state-of-the-art ISAs). The
high observability and simulation performance of the underlying
frameworks make it possible to generate and collect more error/failure-related data, considering complex software stack configurations,
in a reasonable time. When dealing with sizeable failure-related
data sets obtained from multiple fault campaigns, it is essential to
filter out parameters (i.e., features) without a direct relationship
with the system soft error analysis. In this regard, this paper proposes the use of supervised and unsupervised machine learning
techniques, aiming to eliminate non-relevant information as well
as identify the correlation between fault injection results and
application and platform characteristics. This novel approach
provides engineers with appropriate means to
investigate new and more efficient fault mitigation techniques.
The underlying approach is validated with an extensive data set
gathered from more than 1.2 million fault injections, comprising
several benchmarks, a Linux OS and parallelization libraries
(e.g., MPI, OpenMP), as well as through a realistic automotive
case study.
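
The feature-filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it swaps in a plain Pearson correlation filter as a simple stand-in for the supervised and unsupervised techniques the paper refers to. The feature names (`mem_ratio`, `cores`, `word_size`), the `failure_rate` target, the threshold of 0.5, and all numeric values are hypothetical and invented purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant, so that a feature
    that never varies is treated as carrying no information.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_relevant_features(rows, target, threshold=0.5):
    """Keep features whose |correlation| with the target meets the threshold."""
    names = [name for name in rows[0] if name != target]
    ys = [row[target] for row in rows]
    scores = {name: pearson([row[name] for row in rows], ys) for name in names}
    kept = [name for name in names if abs(scores[name]) >= threshold]
    return kept, scores


# Hypothetical per-campaign summaries (invented numbers, not from the paper).
campaigns = [
    {"mem_ratio": 0.2, "cores": 1, "word_size": 64, "failure_rate": 0.10},
    {"mem_ratio": 0.3, "cores": 4, "word_size": 64, "failure_rate": 0.14},
    {"mem_ratio": 0.4, "cores": 2, "word_size": 64, "failure_rate": 0.21},
    {"mem_ratio": 0.5, "cores": 4, "word_size": 64, "failure_rate": 0.24},
    {"mem_ratio": 0.6, "cores": 1, "word_size": 64, "failure_rate": 0.31},
    {"mem_ratio": 0.7, "cores": 2, "word_size": 64, "failure_rate": 0.35},
]

kept, scores = select_relevant_features(campaigns, target="failure_rate")
print("kept features:", kept)
for name, score in scores.items():
    print(f"{name}: {score:+.3f}")
```

In this toy data set, a constant feature such as `word_size` scores zero and is dropped, as is a feature with only a weak linear relationship to the failure rate. A linear correlation filter misses non-linear and joint effects, which is one reason a real analysis would likely rely on richer learning techniques.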