Hierarchically-Refined Label Attention Network for Sequence Labeling
CRF has been used as a powerful model for statistical sequence labeling. For
neural sequence labeling, however, BiLSTM-CRF does not always lead to better
results compared with BiLSTM-softmax local classification. One possible reason is
that the simple Markov label transition model of CRF does not offer much
information gain over strong neural encoding. To better represent label sequences, we
investigate a hierarchically-refined label attention network, which explicitly
leverages label embeddings and captures potential long-term label dependency by
giving each word incrementally refined label distributions with hierarchical
attention. Results on POS tagging, NER and CCG supertagging show that the
proposed model not only improves the overall tagging accuracy with a similar
number of parameters, but also significantly speeds up training and testing
compared to BiLSTM-CRF.

Comment: EMNLP 2019
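
To illustrate the idea of one label-attention refinement step, here is a minimal
PyTorch sketch. It is not the authors' implementation: the module name
LabelAttentionLayer, the single-head dot-product attention, and the additive fusion
of word and label views are simplifying assumptions made for brevity.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelAttentionLayer(nn.Module):
    """One refinement step: each word attends over all label embeddings,
    and the resulting label summary is mixed back into the word representation."""

    def __init__(self, hidden_dim: int, num_labels: int):
        super().__init__()
        # hidden_dim must be even so the BiLSTM output matches the input size.
        self.label_emb = nn.Embedding(num_labels, hidden_dim)   # learned label embeddings
        self.encoder = nn.LSTM(hidden_dim, hidden_dim // 2,
                               batch_first=True, bidirectional=True)

    def forward(self, word_repr: torch.Tensor):
        # word_repr: (batch, seq_len, hidden_dim), e.g. the output of a BiLSTM layer
        labels = self.label_emb.weight                  # (num_labels, hidden_dim)
        scores = word_repr @ labels.t()                 # (batch, seq_len, num_labels)
        label_dist = F.softmax(scores, dim=-1)          # per-word label distribution
        label_summary = label_dist @ labels             # (batch, seq_len, hidden_dim)
        # Fuse the word and label views, then re-encode for the next refinement step.
        fused, _ = self.encoder(word_repr + label_summary)
        return fused, label_dist

Stacking a few such layers gives each word an incrementally refined label
distribution; the final layer's label_dist is used directly for prediction, so no
CRF-style Viterbi decoding is needed. A quick usage check (all sizes are made up):

layer = LabelAttentionLayer(hidden_dim=200, num_labels=45)   # e.g. a POS tag set
words = torch.randn(2, 10, 200)                              # stand-in for BiLSTM output
refined, dist = layer(words)                                 # dist has shape (2, 10, 45)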