For punctuation restoration, most existing methods focus on introducing
extra information (e.g., part-of-speech tags) or addressing the class
imbalance problem. Recently, large-scale transformer-based pre-trained language
models (PLMs) have been widely applied and have achieved remarkable success.
However, PLMs are trained on large corpora that contain punctuation marks, which
may not fit well with small unpunctuated datasets, leading to suboptimal
convergence. In this study, we propose a Feature Fusion two-stream framework
(FF2) to bridge this gap. Specifically, one stream leverages a pre-trained
language model to capture semantic features, while an auxiliary module captures
features of the data at hand. We also modify the computation of multi-head
attention to encourage communication among heads. Then, the two feature streams,
which offer different perspectives, are aggregated to fuse information and
enhance context awareness.
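As a concrete illustration, the following is a minimal PyTorch-style sketch of the two-stream idea, not the actual FF2 implementation: the `plm` encoder interface, the head-mixing scheme (in the spirit of talking-heads attention, one plausible way to realize inter-head communication), the concatenation-based fusion, and all module names and dimensions are illustrative assumptions.

```python
# Illustrative sketch only (assumed interfaces, not the authors' code):
# one pre-trained "semantic" stream plus a small auxiliary stream trained
# from scratch, fused by concatenation for per-token punctuation tagging.
import torch
import torch.nn as nn


class HeadMixingAttention(nn.Module):
    """Self-attention whose logits are linearly mixed across heads,
    one plausible way to encourage communication among heads."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        assert dim % num_heads == 0
        self.h, self.dk = num_heads, dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.mix = nn.Linear(num_heads, num_heads, bias=False)  # head-to-head mixing
        self.out = nn.Linear(dim, dim)

    def forward(self, x):  # x: (B, T, dim)
        B, T, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # reshape each of q, k, v to (B, heads, T, dk)
        q, k, v = (t.view(B, T, self.h, self.dk).transpose(1, 2) for t in (q, k, v))
        logits = q @ k.transpose(-2, -1) / self.dk ** 0.5  # (B, h, T, T)
        # mix attention logits across the head dimension before softmax
        logits = self.mix(logits.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        ctx = (logits.softmax(dim=-1) @ v).transpose(1, 2).reshape(B, T, -1)
        return self.out(ctx)


class TwoStreamTagger(nn.Module):
    """Hypothetical two-stream tagger; `plm` is any encoder mapping token
    ids to (B, T, dim) features (interface assumed for this sketch)."""

    def __init__(self, plm: nn.Module, vocab_size: int, dim: int = 768,
                 num_heads: int = 12, num_classes: int = 4):
        super().__init__()
        self.plm = plm                                       # semantic stream
        self.aux_embed = nn.Embedding(vocab_size, dim)       # auxiliary stream,
        self.aux_attn = HeadMixingAttention(dim, num_heads)  # trained on the data at hand
        self.classifier = nn.Linear(2 * dim, num_classes)    # e.g., O/COMMA/PERIOD/QUESTION

    def forward(self, input_ids):  # input_ids: (B, T)
        sem = self.plm(input_ids)                       # (B, T, dim) semantic features
        aux = self.aux_attn(self.aux_embed(input_ids))  # (B, T, dim) task features
        fused = torch.cat([sem, aux], dim=-1)           # simple concatenation fusion
        return self.classifier(fused)                   # per-token punctuation logits
```

Concatenation is used here only as the simplest possible aggregation; the point of the sketch is the separation of a pre-trained semantic stream from an auxiliary stream trained on the data at hand.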
Without additional data, experimental results on the popular IWSLT benchmark
demonstrate that FF2 achieves new state-of-the-art (SOTA) performance,
verifying the effectiveness of our approach.