2 research outputs found
Fault Injection Analytics: A Novel Approach to Discover Failure Modes in Cloud-Computing Systems
Cloud computing systems fail in complex and unexpected ways due to unexpected
combinations of events and interactions between hardware and software
components. Fault injection is an effective means to bring out these failures
in a controlled environment. However, fault injection experiments produce
massive amounts of data, and manually analyzing these data is inefficient and
error-prone, as the analyst can miss severe failure modes that are yet unknown.
This paper introduces a new paradigm (fault injection analytics) that applies
unsupervised machine learning on execution traces of the injected system, to
ease the discovery and interpretation of failure modes. We evaluated the
proposed approach in the context of fault injection experiments on the
OpenStack cloud computing platform, where we show that the approach can
accurately identify failure modes with a low computational cost.Comment: IEEE Transactions on Dependable and Secure Computing; 16 pages. arXiv
admin note: text overlap with arXiv:1908.1164