26,526 research outputs found

An integrated approach for Chinese word segmentation

Authors: Fu Guohong; Luke K.K
Publication venue: COLIPS PUBLICATIONS
Publication date: 01/01/2003

Waseda University Repository

HKU Scholars Hub

Bilingually motivated domain-adapted word segmentation for statistical machine translation

Authors: Ma Yanjun; Way Andy
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2009

We introduce a word segmentation approach for languages where word boundaries are not orthographically marked, with application to Phrase-Based Statistical Machine Translation (PB-SMT). Instead of using manually segmented monolingual domain-specific corpora to train segmenters, we make use of bilingual corpora and statistical word alignment techniques. First, our approach is adapted to the specific translation task at hand by taking the corresponding source (target) language into account. Second, the approach does not rely on manually segmented training data, so it can be automatically adapted to different domains. We evaluate the performance of our segmentation approach on PB-SMT tasks from two domains and demonstrate that it scores consistently among the best results across different data conditions.

DCU Online Research Access Service

Fast and Accurate Neural Word Segmentation for Chinese

Authors: Cai Deng; Huang Feiyue; Wu Yongjian; Xin Yuan; Zhang Zhisong; Zhao Hai
Publication date: 01/01/2017

Neural models with minimal feature engineering have achieved competitive performance against traditional methods for the task of Chinese word segmentation. However, both the training and the working procedures of current neural models are computationally inefficient. This paper presents a greedy neural word segmenter with balanced word and character embedding inputs to alleviate these drawbacks. Our segmenter is truly end-to-end, capable of performing segmentation much faster and even more accurately than state-of-the-art neural models on Chinese benchmark datasets. Comment: To appear in ACL201

arXiv.org e-Print Archive

Crossref

Effects of writing systems on second language awareness: Word awareness in English learners of Chinese as a Foreign Language.

Author: Bassetti Benedetta
Publication venue: Informa UK Limited
Publication date: 01/01/2005

Crossref

University of Birmingham Research Portal

Birkbeck Institutional Research Online