
    Enhancing Failure Propagation Analysis in Cloud Computing Systems

    In order to plan for failure recovery, the designers of cloud systems need to understand how their system can potentially fail. Unfortunately, analyzing the failure behavior of such systems can be very difficult and time-consuming, due to the large volume of events, non-determinism, and reuse of third-party components. To address these issues, we propose a novel approach that joins fault injection with anomaly detection to identify the symptoms of failures. We evaluated the proposed approach in the context of the OpenStack cloud computing platform. We show that our model can significantly improve the accuracy of failure analysis in terms of false positives and negatives, with a low computational cost. Comment: 12 pages, The 30th International Symposium on Software Reliability Engineering (ISSRE 2019)

    Fault Injection Analytics: A Novel Approach to Discover Failure Modes in Cloud-Computing Systems

    Cloud computing systems fail in complex and unexpected ways due to unexpected combinations of events and interactions between hardware and software components. Fault injection is an effective means to bring out these failures in a controlled environment. However, fault injection experiments produce massive amounts of data, and manually analyzing these data is inefficient and error-prone, as the analyst can miss severe failure modes that are yet unknown. This paper introduces a new paradigm (fault injection analytics) that applies unsupervised machine learning on execution traces of the injected system, to ease the discovery and interpretation of failure modes. We evaluated the proposed approach in the context of fault injection experiments on the OpenStack cloud computing platform, where we show that the approach can accurately identify failure modes with a low computational cost. Comment: IEEE Transactions on Dependable and Secure Computing; 16 pages. arXiv admin note: text overlap with arXiv:1908.1164

    Fine-Grained Reliability for V2V Communications around Suburban and Urban Intersections

    Safe transportation is a key use-case of 5G/LTE Rel.15+ communications, where an end-to-end reliability of 0.99999 is expected for a vehicle-to-vehicle (V2V) transmission distance of 100-200 m. Since communications reliability is related to road safety, it is crucial to verify that this performance target is met, especially in accident-prone areas such as intersections. We derive closed-form expressions for the V2V transmission reliability near suburban corners and urban intersections over finite interference regions. The analysis is based on plausible street configurations, traffic scenarios, and empirically-supported channel propagation. We show the means by which the performance metric can serve as a preliminary design tool to meet a target reliability. We then apply meta distribution concepts to provide a careful dissection of V2V communications reliability. Contrary to existing work on infinite roads, when we consider finite road segments for practical deployment, fine-grained reliability per realization exhibits bimodal behavior. Performance for a certain vehicular traffic scenario is either very reliable or extremely unreliable, but nowhere in relative proximity to the average performance. In other words, standard SINR-based average performance metrics are analytically accurate but can be insufficient from a practical viewpoint. Investigating other safety-critical point process networks at the meta distribution level may reveal similar discrepancies. Comment: 27 pages, 6 figures, submitted to IEEE Transactions on Wireless Communication