A Large-scale Benchmark for Log Parsing

Chen, Zhuangbin; Gu, Jiazhen; Huang, Junjie; Huo, Yintong; Jiang, Zhihan; Li, Yichen; Liu, Jinyang; Lyu, Michael R.; Zhu, Jieming

Authors: Zhuangbin Chen, Jiazhen Gu, Junjie Huang, Yintong Huo, Zhihan Jiang, Yichen Li, Jinyang Liu, Michael R. Lyu, Jieming Zhu
Publication date: 21 August 2023

Abstract

Log data is pivotal in activities like anomaly detection and failure diagnosis in the automated maintenance of software systems. Because logs are unstructured, log parsing is often required to transform them into a structured format for automated analysis. A variety of log parsers exist, making it vital to benchmark these tools to comprehend their features and performance. However, existing datasets for log parsing are limited in terms of scale and representativeness, posing challenges for studies that aim to evaluate or develop log parsers. This problem becomes more pronounced when these parsers are evaluated for production use. To address these issues, we introduce a new collection of large-scale annotated log datasets, named LogPub, which more accurately mirrors log data observed in real-world software systems. LogPub comprises 14 datasets, each averaging 3.6 million log lines. Utilizing LogPub, we re-evaluate 15 log parsers in a more rigorous and practical setting. We also propose a new evaluation metric to lessen the sensitivity of current metrics to imbalanced data distributions. Furthermore, we are the first to scrutinize the detailed performance of log parsers on logs that represent rare system events and offer comprehensive information for system troubleshooting. Parsing such logs accurately is vital yet challenging. We believe that our work could shed light on the design and evaluation of log parsers in more realistic settings, thereby facilitating their implementation in production systems.

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2308.10828

Last time updated on 24/08/2023