Leveraging Distributed Tracing and Container Cloning for Replay Debugging of Microservices

Mathur, Mihir

Leveraging Distributed Tracing and Container Cloning for Replay Debugging of Microservices

Authors: Mihir Mathur
Publication date: 1 January 2020
Publisher: eScholarship, University of California

Abstract

Microservice architectures have gained prominence in recent years for building large-scale industrial distributed systems. However, microservice architectures make the usage of replay debugging, a powerful technique for finding root causes of faults, very challenging because of the polyglot (written in several languages) services, large accumulated state of services, and tight latency limits imposed by long hop-chains. This work attempts to provide a framework for enabling replay debugging in production microservice applications. We study 25 real-world faults in microservice systems collected from diverse sources, categorize these faults by fault symptoms, and create 15 application agnostic mutation operators for microservices. We then propose a language agnostic replay debugging framework for microservice applications that uses a distributed tracing system to record network requests and enables replay of those requests on cloned service containers running in a debug environment. A key component of this framework is an anomaly detector that uses span-level and container-level monitoring to detect fault symptoms found in our study and localizes faults to trace level so that faulty traces can be easily replayed to find the root cause. An open-source microservices application injected successively with the mutation operators is used for an evaluation that shows that our framework is upto an order of magnitude lighter-weight than language-specific recording tools such as Chrome DevTools or VisualVM and can help in finding root causes of 9 out of 15 mutations at a line or function level

Similar works

Full text

Open in the Core reader

Download PDF

Available Versions

Sustaining member

eScholarship - University of California

oai:escholarship.org:ark:/1303...

Last time updated on 25/12/2021