Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention

Authors: Li Yanzeng, Liu Tingwen, Xue Mengge, Yu Bowen
Publication date: 29/04/2020

Most Chinese pre-trained models take the character as the basic unit and learn representations from each character's external context, ignoring the semantics expressed in the word, which is the smallest meaningful utterance in Chinese. Hence, we propose a novel word-aligned attention to exploit explicit word information, which is complementary to various character-based Chinese pre-trained language models. Specifically, we devise a pooling mechanism to align the character-level attention to the word level, and we propose to alleviate the potential issue of segmentation error propagation by multi-source information fusion. As a result, word and character information are explicitly integrated during fine-tuning. Experimental results on five Chinese NLP benchmark tasks demonstrate that our model brings a further significant gain over several pre-trained models. Comment: Accepted to appear at ACL 202
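
The pooling and fusion steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `word_align` and `fuse`, the mixing weight `lam`, the max/mean pooling blend, and the simple averaging across segmenters are all assumptions chosen for illustration. It pools one row of character-level attention weights within each word span, copies the pooled value back to every character in that word, and then averages the results obtained from two hypothetical segmentations.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # [start, end) character indices of one word


def word_align(attn_row: List[float], spans: List[Span], lam: float = 0.5) -> List[float]:
    """Pool character-level attention inside each word span (assumed
    blend of max and mean), then broadcast the pooled value back to
    every character position of that word."""
    aligned = [0.0] * len(attn_row)
    for start, end in spans:
        chunk = attn_row[start:end]
        pooled = lam * max(chunk) + (1.0 - lam) * (sum(chunk) / len(chunk))
        for i in range(start, end):
            aligned[i] = pooled
    return aligned


def fuse(rows: List[List[float]]) -> List[float]:
    """Average the aligned rows from several segmenters (an assumed,
    simple form of multi-source fusion)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# Hypothetical attention weights of one query character over four characters.
attn = [0.1, 0.3, 0.2, 0.4]
# Two hypothetical segmentations of the same four-character sentence.
seg_a = [(0, 2), (2, 4)]
seg_b = [(0, 1), (1, 4)]

fused = fuse([word_align(attn, seg_a), word_align(attn, seg_b)])
print(fused)
```

Averaging over segmentations means that a boundary error made by one segmenter is diluted by the others instead of being passed on unchanged, which is the intuition behind the error-propagation remark in the abstract.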

arXiv.org e-Print Archive

Topic Memory Networks for Short Text Classification

Authors: Gao Cuiyun, King Irwin, Li Jing, Lyu Michael R., Song Yan, Zeng Jichuan
Publication date: 10/09/2018

Many classification models work poorly on short texts due to data sparsity. To address this issue, we propose topic memory networks for short text classification, with a novel topic memory mechanism that encodes latent topic representations indicative of class labels. Unlike most prior work, which focuses on extending features with external knowledge or pre-trained topics, our model jointly explores topic inference and text classification with memory networks in an end-to-end manner. Experimental results on four benchmark datasets show that our model outperforms state-of-the-art models on short text classification while also generating coherent topics. Comment: EMNLP 201

arXiv.org e-Print Archive