Filtered Semi-Markov CRF
Semi-Markov CRF has been proposed as an alternative to the traditional Linear
Chain CRF for text segmentation tasks such as Named Entity Recognition (NER).
Unlike CRF, which treats text segmentation as token-level prediction, Semi-CRF
considers segments as the basic unit, making it more expressive. However,
Semi-CRF suffers from two major drawbacks: (1) quadratic complexity over
sequence length, as it operates on every span of the input sequence, and (2)
inferior performance compared to CRF for sequence labeling tasks like NER. In
this paper, we introduce Filtered Semi-Markov CRF, a variant of Semi-CRF that
addresses these issues by incorporating a filtering step to eliminate
irrelevant segments, reducing complexity and search space. Our approach is
evaluated on several NER benchmarks, where it outperforms both CRF and Semi-CRF
while being significantly faster. The implementation of our method is available
on GitHub: https://github.com/urchade/Filtered-Semi-Markov-CRF
Comment: EMNLP 2023 (Findings)
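
The complexity argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `enumerate_spans`, `filter_spans`, and `toy_keep_score` are hypothetical, and the capitalization-based scorer is only a stand-in for a learned filter. It shows that a sequence of n tokens has n(n+1)/2 candidate spans and that a filtering step can shrink that set before any segment-level scoring or decoding takes place.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, both inclusive


def enumerate_spans(n_tokens: int) -> List[Span]:
    """Return every contiguous span; there are n * (n + 1) / 2 of them."""
    return [(i, j) for i in range(n_tokens) for j in range(i, n_tokens)]


def filter_spans(
    spans: List[Span],
    keep_score: Callable[[Span], float],
    threshold: float = 0.5,
) -> List[Span]:
    """Keep only spans whose score reaches the threshold (hypothetical filter)."""
    return [span for span in spans if keep_score(span) >= threshold]


tokens = ["Alain", "Farley", "works", "at", "McGill", "University"]


def toy_keep_score(span: Span) -> float:
    """Assumed stand-in scorer: 1.0 if every token in the span is capitalized."""
    start, end = span
    return 1.0 if all(t[0].isupper() for t in tokens[start:end + 1]) else 0.0


all_spans = enumerate_spans(len(tokens))   # quadratic in sequence length
kept = filter_spans(all_spans, toy_keep_score)

print(len(all_spans), len(kept))
print(kept)
```

The sketch covers only the filtering idea. How the retained segments are scored and decoded is described in the paper and the linked repository, not here.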