Morphological annotation of Korean with Directly Maintainable Resources
This article describes an exclusively resource-based method of morphological
annotation of written Korean text. Korean is an agglutinative language. Our
annotator is designed to process text before the operation of a syntactic
parser. In its present state, it annotates one-stem words only. The output is a
graph of morphemes annotated with accurate linguistic information. The
granularity of the tagset is 3 to 5 times finer than that of usual tagsets. A
comparison with a reference annotated corpus showed that it achieves 89% recall
without any corpus training. The language resources used by the system are
lexicons of stems, transducers of suffixes and transducers of generation of
allomorphs. All can be easily updated, which allows users to control the
evolution of the performance of the system. It has been claimed that
morphological annotation of Korean text could only be performed by a
morphological analysis module accessing a lexicon of morphemes. We show that it
can also be performed directly with a lexicon of words and without applying
morphological rules at annotation time, which speeds up annotation to 1,210
words/s. The lexicon of words is obtained from the maintainable language
resources through a fully automated compilation process.
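The idea of compiling a word lexicon ahead of time, so that annotation becomes a plain lookup, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the romanized toy stems, the two suffixes, the vowel/consonant allomorph rule, and the names STEMS, SUFFIXES, compile_word_lexicon and annotate are hypothetical and are not the authors' resources or implementation. The real system uses transducers and outputs a graph of morphemes; this sketch only handles one stem plus one suffix and returns flat lists.

```python
from collections import defaultdict

# Toy, romanized resources (hypothetical; not the authors' lexicons).
STEMS = {"hakgyo": "Noun", "chaek": "Noun"}

# Each suffix lists its allomorphs, chosen by how the stem ends.
SUFFIXES = [
    {"tag": "Topic", "after_vowel": "neun", "after_consonant": "eun"},
    {"tag": "Locative", "after_vowel": "eseo", "after_consonant": "eseo"},
]

VOWELS = set("aeiou")


def compile_word_lexicon(stems, suffixes):
    """Expand stems and suffixes into a full-form word lexicon (done offline)."""
    lexicon = defaultdict(list)
    for stem, stem_tag in stems.items():
        # The bare stem is itself a word.
        lexicon[stem].append([(stem, stem_tag)])
        # Allomorph generation happens here, at compile time only.
        key = "after_vowel" if stem[-1] in VOWELS else "after_consonant"
        for suffix in suffixes:
            form = suffix[key]
            lexicon[stem + form].append([(stem, stem_tag), (form, suffix["tag"])])
    return dict(lexicon)


def annotate(text, lexicon):
    """Annotate by dictionary lookup; no morphological rule is applied here."""
    return [(word, lexicon.get(word, [])) for word in text.split()]


WORD_LEXICON = compile_word_lexicon(STEMS, SUFFIXES)
```

Calling annotate("hakgyoeseo chaekeun", WORD_LEXICON) returns, for each word, the list of morpheme analyses stored in the compiled lexicon; unknown words get an empty list. Updating STEMS or SUFFIXES and recompiling is the only maintenance step, which mirrors the separation between maintainable resources and the compiled lexicon described in the abstract.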
Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean
A new tightly coupled speech and natural language integration model is
presented for a TDNN-based continuous, possibly large-vocabulary speech
recognition system for Korean. Unlike popular n-best techniques developed for
integrating mainly HMM-based speech recognition and natural language processing
at the word level, which is obviously inadequate for morphologically
complex agglutinative languages, our model constructs a spoken language system
based on morpheme-level speech and language integration. With this
integration scheme, the spoken Korean processing engine (SKOPE) is designed and
implemented using a TDNN-based diphone recognition module integrated with a
Viterbi-based lexical decoding and symbolic phonological/morphological
co-analysis. Our experimental results show that speaker-dependent continuous
eojeol (Korean word) recognition and integrated morphological analysis
can be achieved with a success rate of over 80.6% directly from speech inputs for
middle-level vocabularies.
Comment: latex source with a4 style, 15 pages, to be published in computer
processing of oriental language journa
SKOPE: A connectionist/symbolic architecture of spoken Korean processing
Spoken language processing requires speech and natural language integration.
Moreover, spoken Korean calls for unique processing methodology due to its
linguistic characteristics. This paper presents SKOPE, a connectionist/symbolic
spoken Korean processing engine, which emphasizes that: 1) connectionist and
symbolic techniques must be selectively applied according to their relative
strengths and weaknesses, and 2) the linguistic characteristics of Korean must be
fully considered for phoneme recognition, speech and language integration, and
morphological/syntactic processing. The design and implementation of SKOPE
demonstrate how connectionist/symbolic hybrid architectures can be constructed
for spoken agglutinative language processing. SKOPE also presents many novel
ideas for speech and language processing. The phoneme recognition,
morphological analysis, and syntactic analysis experiments show that SKOPE is a
viable approach for spoken Korean processing.
Comment: 8 pages, latex, use aaai.sty & aaai.bst, bibfile: nlpsp.bib, to be
presented at IJCAI95 workshops on new approaches to learning for natural
language processin