Transformer-GCRF: Recovering Chinese Dropped Pronouns with General Conditional Random Fields
Pronouns are often dropped in Chinese conversations and recovering the
dropped pronouns is important for NLP applications such as Machine Translation.
Existing approaches usually formulate this as a sequence labeling task of
predicting whether there is a dropped pronoun before each token and, if so, its type.
Each utterance is treated as a sequence and labeled independently.
Although these approaches have shown promise, labeling each utterance
independently ignores the dependencies between pronouns in neighboring
utterances. Modeling these dependencies is critical to improving the
performance of dropped pronoun recovery. In this paper, we present a novel
framework that combines the strengths of the Transformer network with General
Conditional Random Fields (GCRF) to model the dependencies between pronouns in
neighboring utterances. Results on three Chinese conversation datasets show
that the Transformer-GCRF model outperforms the state-of-the-art dropped
pronoun recovery models. Exploratory analysis also demonstrates that the GCRF
helps capture the dependencies between pronouns in neighboring
utterances, thus contributing to the performance improvements.

Comment: Accepted at EMNLP Findings 2020