Log parsing, which extracts log templates from semi-structured logs and
produces structured logs, is the first and the most critical step in automated
log analysis. While existing log parsers have achieved decent results, they
suffer from two major limitations by design. First, they do not natively
support hybrid logs that consist of both single-line logs and multi-line logs
(\eg Java Exception and Hadoop Counters). Second, they fall short in
integrating domain knowledge in parsing, making it hard to identify ambiguous
tokens in logs. This paper defines a new research problem, \textit{hybrid log
parsing}, as a superset of traditional log parsing tasks, and proposes
\textit{Hue}, the first attempt at hybrid log parsing in a user-adaptive
manner. Specifically, Hue converts each log message to a sequence of special
wildcards using a key casting table and determines the log types via line
aggregating and pattern extracting. In addition, Hue can effectively utilize
user feedback via a novel merge-reject strategy, making it possible to quickly
adapt to complex and changing log templates. We evaluated Hue on three hybrid
log datasets and sixteen widely-used single-line log datasets (\ie Loghub). The
results show that Hue achieves an average grouping accuracy of 0.845 on hybrid
logs, substantially outperforming the best results (0.563 on average) obtained by
existing parsers. Hue also achieves state-of-the-art performance on single-line log
datasets. Furthermore, Hue has been successfully deployed in a real production
environment for daily hybrid log parsing.

Comment: Accepted by ESEC/FSE 202
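
The grouping accuracy figures quoted above can be made concrete with a small sketch. It assumes the definition commonly used in log parsing benchmarks: a log message counts as correctly parsed only if the set of messages sharing its predicted template is exactly the set sharing its ground-truth template. The function name `grouping_accuracy` and the toy labels are illustrative assumptions, not taken from Hue's implementation or evaluation scripts.

```python
from collections import Counter


def grouping_accuracy(predicted, truth):
    """Fraction of messages whose predicted group equals their true group.

    `predicted` and `truth` are equal-length sequences of template labels,
    one per log message. Label names need not match between the two; only
    the induced grouping of messages matters.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        return 0.0

    pred_sizes = Counter(predicted)          # size of each predicted group
    truth_sizes = Counter(truth)             # size of each true group
    pair_sizes = Counter(zip(predicted, truth))  # size of each overlap

    correct = 0
    for (p, t), n in pair_sizes.items():
        # The two groups are identical as sets exactly when their overlap
        # has the same size as both of them.
        if n == pred_sizes[p] == truth_sizes[t]:
            correct += n
    return correct / len(truth)


# Toy example (hypothetical labels): true template "B" is split into two
# predicted groups, so both of its messages count as wrong.
truth = ["A", "A", "B", "B", "C"]
predicted = ["x", "x", "y", "z", "w"]
print(grouping_accuracy(predicted, truth))  # 0.6
```

Under this definition, splitting or merging a template penalizes every message in the affected groups, which is why multi-line entries broken into separate lines can lower the score sharply.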