Distributed server systems expose many configuration settings and support diverse application workloads. Performance anomalies, situations where actual performance falls below expectations, often manifest only under particular runtime conditions. This paper presents a new approach to exploring a large space of potential runtime conditions and comprehensively depicting the conditions under which performance anomalies are likely to occur. In our approach, we derive performance expectations from a hierarchy of sub-models, in which each sub-model can be independently adjusted to account for new runtime conditions. We then produce a representative set of measured runtime-condition samples (both normal and anomalous) with carefully chosen sample size and anomaly error threshold. Finally, we employ decision-tree-based classification to produce an easy-to-interpret depiction of the entire space of potential runtime conditions. Our depictions can guide the avoidance of anomaly-inducing system configurations, and they can also assist root-cause diagnosis and performance debugging. We present preliminary experimental results with a real J2EE middleware system.
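To make the final classification step concrete, the following is a minimal illustrative sketch, not the authors' implementation: it labels sampled runtime conditions as normal or anomalous and grows a small decision tree whose printed rules depict the anomalous region of the condition space. The feature names (`thread_pool_size`, `request_rate`) and the toy anomaly rule standing in for the paper's hierarchy of expectation sub-models are hypothetical.

```python
import random

random.seed(0)
FEATURES = ["thread_pool_size", "request_rate"]  # hypothetical runtime conditions

def label(pool, rate):
    # Toy stand-in for a performance-expectation model: flag an anomaly
    # when a high request rate meets a small thread pool.
    return 1 if (rate > 600 and pool < 16) else 0

# Representative set of measured runtime-condition samples (normal and anomalous).
samples = []
for _ in range(400):
    pool = random.randint(2, 64)
    rate = random.uniform(10, 1000)
    samples.append(([pool, rate], label(pool, rate)))

def gini(rows):
    # Gini impurity of a binary-labeled sample set.
    n = len(rows)
    if n == 0:
        return 0.0
    p = sum(y for _, y in rows) / n
    return 2 * p * (1 - p)

def best_split(rows):
    # Greedy search over (feature, threshold) pairs for the largest impurity drop.
    base = gini(rows)
    best = None  # (gain, feature index, threshold)
    for f in range(len(FEATURES)):
        for thr in sorted({x[f] for x, _ in rows}):
            left = [r for r in rows if r[0][f] <= thr]
            right = [r for r in rows if r[0][f] > thr]
            if not left or not right:
                continue
            gain = base - (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or gain > best[0]:
                best = (gain, f, thr)
    return best

def render(rows, depth=0, max_depth=3):
    """Return the learned rules as indented, human-readable text lines."""
    split = best_split(rows) if depth < max_depth else None
    if split is None or split[0] < 1e-9:
        verdict = "anomalous" if 2 * sum(y for _, y in rows) > len(rows) else "normal"
        return ["  " * depth + f"-> {verdict} ({len(rows)} samples)"]
    _, f, thr = split
    lines = ["  " * depth + f"if {FEATURES[f]} <= {thr:.1f}:"]
    lines += render([r for r in rows if r[0][f] <= thr], depth + 1, max_depth)
    lines.append("  " * depth + f"else:  # {FEATURES[f]} > {thr:.1f}")
    lines += render([r for r in rows if r[0][f] > thr], depth + 1, max_depth)
    return lines

print("\n".join(render(samples)))
```

The printed rule tree is the kind of easy-to-interpret depiction the approach aims for: each root-to-leaf path names a region of the runtime-condition space and says whether anomalies are expected there.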