Unified Multi-Criteria Chinese Word Segmentation with BERT
Multi-Criteria Chinese Word Segmentation (MCCWS) aims to find word
boundaries in a Chinese sentence composed of continuous characters when
multiple segmentation criteria exist. The unified framework has been widely
used in MCCWS and has shown its effectiveness. In addition, the pre-trained
BERT language model has also been introduced into the MCCWS task in a
multi-task learning framework. In this paper, we combine the strengths of the
unified framework and the pre-trained language model, and propose a unified
MCCWS model based on BERT. Moreover, we augment the unified BERT-based MCCWS
model with bigram features and an auxiliary criterion classification task.
Experiments on eight datasets with diverse criteria demonstrate that our
method achieves new state-of-the-art results for MCCWS.
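A common way to realize such a unified model is to mark the desired criterion in the input and decode per-character tags into words. The sketch below illustrates that idea in plain Python; the criterion-token convention, the tag set (BMES), and the example tagging are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: unified multi-criteria CWS input construction and
# BMES-tag decoding. The criterion-token format and tag names here
# are assumptions for illustration.

def build_input(chars, criterion):
    """Prepend a criterion token so one unified model can serve
    multiple segmentation standards (e.g. PKU vs. MSR)."""
    return ["[CLS]", f"<{criterion}>"] + list(chars) + ["[SEP]"]

def decode_bmes(chars, tags):
    """Turn per-character BMES tags (Begin/Middle/End/Single)
    into segmented words."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # tolerate a truncated final word
        words.append(current)
    return words

sentence = "他来到北京"
print(build_input(sentence, "pku"))
# One plausible tagging under one criterion:
print(decode_bmes(sentence, ["S", "B", "E", "B", "E"]))
# → ['他', '来到', '北京']
```

At inference time, changing only the criterion token would let the same model emit segmentations that follow different annotation standards.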
Pre-training with Meta Learning for Chinese Word Segmentation
Recent research shows that pre-trained models (PTMs) are beneficial to
Chinese Word Segmentation (CWS). However, the PTMs used in previous works
usually adopt language modeling as the pre-training task, lacking
task-specific prior segmentation knowledge and ignoring the discrepancy
between the pre-training task and downstream CWS tasks. In this paper, we
propose a CWS-specific pre-trained model, METASEG, which employs a unified
architecture and incorporates a meta-learning algorithm into a
multi-criteria pre-training task. Empirical results show that METASEG can
utilize common prior segmentation knowledge from different existing criteria
and alleviate the discrepancy between pre-trained models and downstream CWS
tasks. Moreover, METASEG achieves new state-of-the-art performance on twelve
widely used CWS datasets and significantly improves model performance in
low-resource settings.
Comment: Accepted by NAACL 202
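The abstract does not name the meta-learning algorithm, so the toy sketch below uses a Reptile-style update over several criteria treated as tasks, with a one-dimensional quadratic loss standing in for each criterion's training objective. The algorithm choice, the loss, and all constants are illustrative assumptions.

```python
# Hedged sketch: a Reptile-style meta update across segmentation
# criteria treated as tasks. Everything numeric here is a toy stand-in.

def inner_sgd(theta, target, lr=0.1, steps=5):
    """A few SGD steps on the toy loss (theta - target)^2,
    standing in for adapting to one criterion's data."""
    for _ in range(steps):
        grad = 2.0 * (theta - target)
        theta -= lr * grad
    return theta

def reptile_meta_step(theta, task_targets, meta_lr=0.5):
    """Move the shared initialization toward each task's adapted
    parameters, accumulating knowledge common to all criteria."""
    for target in task_targets:
        adapted = inner_sgd(theta, target)
        theta += meta_lr * (adapted - theta)
    return theta

theta = 0.0
criteria_targets = [1.0, 2.0, 3.0]  # stand-ins for per-criterion optima
for _ in range(20):
    theta = reptile_meta_step(theta, criteria_targets)
print(round(theta, 2))  # settles between the task optima
```

The intent mirrors the abstract's claim: the shared initialization ends up close to all criteria at once, so it encodes segmentation knowledge common to the different standards rather than overfitting to any single one.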