2 research outputs found
Semantic-Unit-Based Dilated Convolution for Multi-Label Text Classification
We propose a novel model for multi-label text classification based on
sequence-to-sequence learning. The model generates higher-level semantic-unit
representations with multi-level dilated convolution, together with a
corresponding hybrid attention mechanism that extracts information at both the
word level and the semantic-unit level. The designed dilated convolution
effectively reduces dimensionality and supports an exponential expansion of
the receptive field without loss of local information, while the
attention-over-attention mechanism captures more summary-relevant information
from the source context. Experimental results show that the proposed model has
significant advantages over baseline models on the RCV1-V2 and Ren-CECps
datasets, and our analysis demonstrates that the model is competitive with
deterministic hierarchical models and more robust in classifying low-frequency
labels. Comment: EMNLP 201
A Deep Reinforced Sequence-to-Set Model for Multi-Label Text Classification
Multi-label text classification (MLTC) aims to assign multiple labels to each
sample in the dataset. The labels usually have internal correlations. However,
traditional methods tend to ignore the correlations between labels. In order to
capture the correlations between labels, the sequence-to-sequence (Seq2Seq)
model views the MLTC task as a sequence generation problem, which achieves
excellent performance on this task. However, the Seq2Seq model is not
inherently suited to MLTC: it requires humans to predefine the order of the
output labels, whereas the output labels in MLTC essentially form an unordered
set rather than an ordered sequence. This conflicts with the Seq2Seq model's
strict requirement on label order. In this paper, we propose a novel
sequence-to-set framework utilizing deep reinforcement learning, which not
only captures the correlations between labels but also reduces the dependence
on label order. Extensive experimental results show that our proposed method
outperforms competitive baselines by a large margin.
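The key to reducing order dependence is that reinforcement learning can optimize an order-invariant reward. A hedged sketch (not the paper's implementation, and the reward choice here is an assumption): scoring the generated labels against the gold labels with set-level F1 gives every permutation of a correct label set the same reward.

```python
# Illustrative order-invariant reward (assumed, not taken from the
# paper): F1 between predicted and gold labels treated as sets.

def set_f1(predicted, gold):
    """F1 between two label collections, ignoring order and duplicates."""
    p, g = set(predicted), set(gold)
    if not p or not g:
        return 0.0
    tp = len(p & g)  # true positives: labels present in both sets
    if tp == 0:
        return 0.0
    precision = tp / len(p)
    recall = tp / len(g)
    return 2 * precision * recall / (precision + recall)

# Both orderings of a correct prediction earn the same reward:
print(set_f1(["sports", "politics"], ["politics", "sports"]))  # → 1.0
print(set_f1(["politics", "sports"], ["politics", "sports"]))  # → 1.0
```

Under such a reward, a policy-gradient learner is never penalized for emitting correct labels in a different order than the reference, which is exactly the property a plain Seq2Seq cross-entropy loss lacks.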