Summarizing Unstructured Logs in Online Services
Logs are one of the most valuable data sources for managing large-scale
online services. After a failure is detected, diagnosed, or predicted, operators
still have to inspect the raw logs to gain a summarized view before taking
action. However, manual or rule-based log summarization has become inefficient
and ineffective. In this work, we propose LogSummary, an automatic,
unsupervised end-to-end log summarization framework for online services.
LogSummary obtains the summarized triples of important logs for a given log
sequence. It integrates a novel information extraction method taking both
semantic information and domain knowledge into consideration, with a new triple
ranking approach using the global knowledge learned from all logs. Given the
lack of a publicly-available gold standard for log summarization, we have
manually labelled the summaries of four open-source log datasets and made them
publicly available. The evaluation on these datasets as well as the case
studies on real-world logs demonstrate that LogSummary produces highly
representative summaries (average ROUGE F1 score of 0.741). We have packaged
LogSummary into an open-source toolkit and hope that it can benefit future
NLP-powered summarization work.
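
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a ROUGE F1 score can be computed for one candidate summary against one reference. The abstract does not state which ROUGE variant or tokenization was used, so this sketch assumes ROUGE-1 (unigram overlap) with lowercased whitespace tokens. The function name `rouge1_f1` and the example strings are hypothetical and are not taken from the LogSummary toolkit.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary.

    Assumption: lowercased whitespace tokenization; the paper's exact
    ROUGE variant and preprocessing may differ.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical summary strings, for illustration only.
    score = rouge1_f1("connection closed by peer", "connection closed")
    print(f"ROUGE-1 F1: {score:.3f}")
```

An average score such as the one reported would then be the mean of this value over all evaluated log sequences, under the same assumptions.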