High Sensitive Capacitive Sensing Method for Thickness Detection of the Water Film on an Insulation Surface
Automatic Root Cause Analysis via Large Language Models for Cloud Incidents
Ensuring the reliability and availability of cloud services necessitates
efficient root cause analysis (RCA) for cloud incidents. Traditional RCA
methods, which rely on manual investigations of data sources such as logs and
traces, are often laborious, error-prone, and challenging for on-call
engineers. In this paper, we introduce RCACopilot, an innovative on-call system
powered by a large language model for automating RCA of cloud incidents.
RCACopilot matches incoming incidents to corresponding incident handlers based
on their alert types, aggregates the critical runtime diagnostic information,
predicts the incident's root cause category, and provides an explanatory
narrative. We evaluate RCACopilot using a real-world dataset consisting of a
year's worth of incidents from Microsoft. Our evaluation demonstrates that
RCACopilot achieves an RCA accuracy of up to 0.766. Furthermore, the diagnostic
information collection component of RCACopilot has been in successful use at
Microsoft for over four years.
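
The stages named in the abstract (match an incident to a handler by alert type, aggregate diagnostic information, predict a root cause category, return an explanatory narrative) can be illustrated with a minimal sketch. This is not RCACopilot's implementation: the `Incident` class, the handler registry, the alert types, the diagnostic strings, and the category names are all hypothetical, and the prediction step is a keyword rule standing in for the large language model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Incident:
    alert_type: str
    title: str
    diagnostics: List[str] = field(default_factory=list)


# Hypothetical registry: alert type -> function that collects diagnostics.
Handler = Callable[[Incident], List[str]]
HANDLERS: Dict[str, Handler] = {}


def register(alert_type: str) -> Callable[[Handler], Handler]:
    """Decorator that binds a diagnostic collector to an alert type."""
    def wrap(fn: Handler) -> Handler:
        HANDLERS[alert_type] = fn
        return fn
    return wrap


@register("disk_full")
def collect_disk_info(incident: Incident) -> List[str]:
    # Assumption: a real handler would query logs and metrics; these are canned.
    return ["disk usage at 98% on /var", "log rotation job failed"]


@register("high_latency")
def collect_latency_info(incident: Incident) -> List[str]:
    return ["p99 latency 4.2s", "connection pool exhausted"]


def predict_category(diagnostics: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM step: returns (category, narrative).

    A keyword rule is used here only so the sketch runs offline.
    """
    text = " ".join(diagnostics).lower()
    if "disk" in text:
        return "StorageExhaustion", "Diagnostics point to a full disk: " + text
    if "connection pool" in text:
        return "ResourceExhaustion", "Diagnostics point to pool exhaustion: " + text
    return "Unknown", "No known pattern in: " + text


def handle(incident: Incident) -> Tuple[str, str]:
    """Match by alert type, collect diagnostics, then predict a category."""
    handler = HANDLERS.get(incident.alert_type)
    if handler is None:
        return "Unknown", "No handler registered for " + incident.alert_type
    incident.diagnostics = handler(incident)
    return predict_category(incident.diagnostics)
```

Calling `handle(Incident("disk_full", "Disk alert"))` routes the incident to `collect_disk_info` and then to the placeholder predictor; an alert type with no registered handler falls through to the `"Unknown"` category.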