State-of-the-art Chinese Word Segmentation with Bi-LSTMs
A wide variety of neural-network architectures have been proposed for the
task of Chinese word segmentation.
Surprisingly, we find that a bidirectional LSTM model, when combined with
standard deep learning techniques and best practices, achieves better
accuracy on many popular datasets than models based on more complex
neural-network architectures.
Furthermore, our error analysis shows that out-of-vocabulary words remain
challenging for neural-network models, and many of the remaining errors are
unlikely to be fixed through architecture changes.
Instead, more effort should be devoted to exploring external resources for
further improvement.
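To make the setup concrete, here is a minimal sketch (not the authors' released code) of a bidirectional LSTM character tagger for Chinese word segmentation, framing the task as BMES sequence labeling. The PyTorch framing, vocabulary size, and dimensions are assumptions for illustration.

```python
# A minimal Bi-LSTM character tagger for CWS as BMES sequence labeling.
import torch
import torch.nn as nn

class BiLSTMSegmenter(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256, num_tags=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        # Project concatenated forward/backward states to BMES tag scores.
        self.out = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, char_ids):           # char_ids: (batch, seq_len)
        x = self.embed(char_ids)           # (batch, seq_len, emb_dim)
        h, _ = self.lstm(x)                # (batch, seq_len, 2*hidden_dim)
        return self.out(h)                 # (batch, seq_len, num_tags)

# Usage: tag scores for a batch of two 5-character sentences.
model = BiLSTMSegmenter(vocab_size=5000)
scores = model(torch.randint(0, 5000, (2, 5)))  # shape (2, 5, 4)
```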
Dual Long Short-Term Memory Networks for Sub-Character Representation Learning
Characters have commonly been regarded as the minimal processing unit in
Natural Language Processing (NLP). But many non-Latin languages have
logographic writing systems with large inventories of thousands of characters,
and each character is composed of even smaller parts, which previous work has
often ignored. In this paper, we propose a novel
architecture employing two stacked Long Short-Term Memory networks (LSTMs) to
learn sub-character-level representations and capture deeper semantic meaning.
As a concrete case study to substantiate the efficiency of our neural
architecture, we take Chinese Word Segmentation, since every Chinese
character contains several components called radicals. Our networks employ a
shared radical-level embedding to solve both Simplified and Traditional
Chinese Word Segmentation without extra Traditional-to-Simplified conversion;
this end-to-end design significantly simplifies word segmentation compared to
previous work. Radical-level embeddings also capture semantic meaning below
the character level and improve learning performance. By tying radical and
character embeddings together, the parameter count is reduced while semantic
knowledge is shared and transferred between the two levels, substantially
boosting performance. On 3 out of 4
Bakeoff 2005 datasets, our method surpassed state-of-the-art results by up to
0.4%. Our results are reproducible; source code and corpora are available on
GitHub.
Comment: Accepted & forthcoming at ITNG-201
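A hedged sketch of the dual-LSTM idea follows: a radical-level LSTM composes each character from its radical embeddings, and a character-level Bi-LSTM reads the resulting sequence. The single embedding table shared by radicals and characters stands in for the paper's embedding tying; all names, dimensions, and the PyTorch framing are assumptions, not the authors' implementation.

```python
# Dual stacked LSTMs: radicals -> characters -> sentence, with one embedding
# table over a joint radical/character vocabulary (tied parameters).
import torch
import torch.nn as nn

class DualLSTM(nn.Module):
    def __init__(self, joint_vocab, emb_dim=64, hidden_dim=128, num_tags=4):
        super().__init__()
        self.embed = nn.Embedding(joint_vocab, emb_dim)
        self.radical_lstm = nn.LSTM(emb_dim, emb_dim, batch_first=True)
        self.char_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                                 bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, radical_ids):
        # radical_ids: (batch, seq_len, max_radicals), radicals per character.
        b, s, r = radical_ids.shape
        x = self.embed(radical_ids).view(b * s, r, -1)
        # The radical-LSTM's final state is the sub-character representation.
        _, (h_n, _) = self.radical_lstm(x)
        chars = h_n[-1].view(b, s, -1)      # (batch, seq_len, emb_dim)
        h, _ = self.char_lstm(chars)
        return self.out(h)                  # per-character BMES tag scores
```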
Switch-LSTMs for Multi-Criteria Chinese Word Segmentation
Multi-criteria Chinese word segmentation is a promising but challenging task,
which exploits several different segmentation criteria and mines their common
underlying knowledge. In this paper, we propose a flexible multi-criteria
learning framework for Chinese word segmentation. A segmentation criterion
can usually be decomposed into multiple sub-criteria that are shareable with
other segmentation criteria, so the process of word segmentation can be
viewed as routing among these sub-criteria. From this perspective, we present
Switch-LSTMs, which consist of several long short-term memory networks
(LSTMs) and a switcher that automatically routes among them. With
these auto-switched LSTMs, our model provides a more flexible solution for
multi-criteria CWS and makes it easy to transfer learned knowledge to new
criteria. Experiments show that our model obtains significant improvements on
eight corpora with heterogeneous segmentation criteria, compared to previous
methods and single-criterion learning.
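As an illustration (an assumption-laden sketch, not the authors' implementation), the switching idea can be approximated with several shared LSTM cells plus a learned switcher that softly mixes their candidate states at each time step; the paper's actual routing mechanism may differ, e.g. it could condition on the target criterion or make discrete choices.

```python
# Switch-LSTM sketch: K LSTM cells and a switcher that blends their states.
import torch
import torch.nn as nn

class SwitchLSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_cells=4):
        super().__init__()
        self.cells = nn.ModuleList(
            nn.LSTMCell(input_dim, hidden_dim) for _ in range(num_cells))
        # Switcher scores each cell from the current input and hidden state.
        self.switch = nn.Linear(input_dim + hidden_dim, num_cells)

    def forward(self, x):                  # x: (batch, seq_len, input_dim)
        b = x.size(0)
        h = x.new_zeros(b, self.cells[0].hidden_size)
        c = x.new_zeros(b, self.cells[0].hidden_size)
        outputs = []
        for t in range(x.size(1)):
            xt = x[:, t]
            w = torch.softmax(self.switch(torch.cat([xt, h], -1)), dim=-1)
            states = [cell(xt, (h, c)) for cell in self.cells]
            # Route: blend candidate states by the switcher's weights.
            h = sum(w[:, k:k+1] * states[k][0] for k in range(len(self.cells)))
            c = sum(w[:, k:k+1] * states[k][1] for k in range(len(self.cells)))
            outputs.append(h)
        return torch.stack(outputs, dim=1)  # (batch, seq_len, hidden_dim)
```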