Real-time Automatic Word Segmentation for User-generated Text
For readability and possibly for disambiguation, appropriate word
segmentation is recommended for written text. In this paper, we propose a
real-time assistive technology that performs automatic segmentation. The
language investigated is Korean, a head-final language with various
morpho-syllabic blocks as characters. The training scheme is fully neural
network-based and straightforward. We also show how the proposed system can
be used for web-based real-time revision of user-generated text. Through
qualitative and quantitative comparison with widely used text processing
toolkits, we show the reliability of the proposed system and how it fits with
conversation-style and non-canonical texts. The demonstration is available
online.
Comment: 8 pages, 4 figures, 1 table