TransRegex: Multi-modal Regular Expression Synthesis by Generate-and-Repair
Since regular expressions (abbrev. regexes) are difficult to understand and
compose, automatically generating regexes has been an important research
problem. This paper introduces TransRegex, for automatically constructing
regexes from both natural language descriptions and examples. To the best of
our knowledge, TransRegex is the first to treat the NLP-and-example-based regex
synthesis problem as the problem of NLP-based synthesis with regex repair. For
this purpose, we present novel algorithms for both NLP-based synthesis and
regex repair. We evaluate TransRegex against ten relevant state-of-the-art tools
on three publicly available datasets. The evaluation results demonstrate that
the accuracy of TransRegex is 17.4%, 35.8% and 38.9% higher than that of
NLP-based approaches on the three datasets, respectively. Furthermore,
TransRegex outperforms the state-of-the-art multi-modal techniques by 10% to
30% in accuracy on all three datasets. The evaluation results also indicate
that TransRegex utilizes natural language and examples in a more effective way.
Comment: accepted by ICSE 202
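
The generate-and-repair idea described in the abstract can be illustrated with a toy sketch. This is not the TransRegex algorithm: the candidate regex is assumed to come from some natural-language synthesis step that is not shown, and the helper names (is_consistent, neighbours, generate_and_repair) and the quantifier-swapping repair strategy are hypothetical, chosen only to show how positive and negative examples can drive a repair of a nearly correct candidate.

```python
import re
from typing import Iterator, Optional, Sequence

# Quantifiers that the toy repair step is allowed to swap (an assumption,
# not the repair operators used by TransRegex).
QUANTIFIERS = ("*", "+", "?")


def is_consistent(pattern: str,
                  positives: Sequence[str],
                  negatives: Sequence[str]) -> bool:
    """True if the pattern fully matches every positive and no negative."""
    try:
        compiled = re.compile(pattern)
    except re.error:
        return False  # a syntactically invalid variant is never consistent
    return (all(compiled.fullmatch(s) for s in positives)
            and not any(compiled.fullmatch(s) for s in negatives))


def neighbours(pattern: str) -> Iterator[str]:
    """Yield variants that differ from the pattern in exactly one quantifier.

    Toy simplification: escaped or bracketed quantifier characters are not
    treated specially.
    """
    for i, ch in enumerate(pattern):
        if ch in QUANTIFIERS:
            for q in QUANTIFIERS:
                if q != ch:
                    yield pattern[:i] + q + pattern[i + 1:]


def generate_and_repair(candidate: str,
                        positives: Sequence[str],
                        negatives: Sequence[str]) -> Optional[str]:
    """Return the candidate if it fits the examples, else a repaired variant.

    Returns None when no single-quantifier swap fits the examples.
    """
    if is_consistent(candidate, positives, negatives):
        return candidate
    for variant in neighbours(candidate):
        if is_consistent(variant, positives, negatives):
            return variant
    return None


if __name__ == "__main__":
    # Hypothetical scenario: the description asks for "one or more digits",
    # but the (assumed) natural-language stage produced "[0-9]*".
    print(generate_and_repair("[0-9]*", ["1", "42"], ["", "a"]))
```

In this sketch the negative example consisting of the empty string is what exposes the faulty candidate, so the examples are used to correct the output of the language stage rather than to synthesize a regex from scratch.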