Mostly-Unsupervised Statistical Segmentation of Japanese Kanji Sequences
Given the lack of word delimiters in written Japanese, word segmentation is
generally considered a crucial first step in processing Japanese texts. Typical
Japanese segmentation algorithms rely either on a lexicon and syntactic
analysis or on pre-segmented data; but these are labor-intensive, and the
lexico-syntactic techniques are vulnerable to the unknown word problem. In
contrast, we introduce a novel, more robust statistical method utilizing
unsegmented training data. Despite its simplicity, the algorithm yields
performance on long kanji sequences comparable to and sometimes surpassing that
of state-of-the-art morphological analyzers over a variety of error metrics.
The algorithm also outperforms another mostly-unsupervised statistical
algorithm previously proposed for Chinese.
Additionally, we present a two-level annotation scheme for Japanese to
incorporate multiple segmentation granularities, and introduce two novel
evaluation metrics, both based on the notion of a compatible bracket, that can
account for multiple granularities simultaneously. Comment: 22 pages. To appear in Natural Language Engineering.
Hierarchical Structure in Semantic Networks of Japanese Word Associations
PACLIC 21 / Seoul National University, Seoul, Korea / November 1-3, 200
LSH-RANSAC: An Incremental Scheme for Scalable Localization
This paper addresses the problem of feature-based robot localization in
large-size environments. With recent progress in SLAM techniques, it has become
crucial for a robot to estimate its position in real time with respect to a
large-size map that can be incrementally built by other mapper robots.
Self-localization using large-size maps has been studied in the literature, but
most studies assume that a complete map is given prior to the self-localization
task. In this paper, we present a novel scheme for robot localization as well
as map representation that can successfully work with large-size and
incremental maps. This work combines our two previous works on incremental
methods, iLSH and iRANSAC, for appearance-based and position-based localization.
Learning Chinese characters: a comparative study of the learning strategies of western students and Eastern Asian students in Taiwan
2012 Spring. Includes bibliographical references.
Vocabulary acquisition is central to learning Chinese as a second or foreign language. Little research has been conducted on vocabulary learning strategies in this area. Even less research has examined whether students from different native language backgrounds apply vocabulary learning strategies differently. The present study was designed to address this gap. The major concern of this study was to explore whether students from Western alphabetic countries and students from Eastern Asian countries apply different vocabulary learning strategies in Chinese vocabulary acquisition. All the participants are international students who currently reside in Taiwan and attend the same American School located in Taipei, Taiwan. Learning Chinese is mandatory in the school. An online survey instrument was used to collect data from the students. Descriptive statistics were used. An independent samples t-test was used to assess whether students of different native language backgrounds showed significant differences in the application of vocabulary learning strategies. No significant difference was found; however, suggestions regarding curriculum design for learning Chinese vocabulary were made based on the tentative findings of this study.