
    Errors in inflectional morphemes as an index of linguistic competence of Korean Heritage language learners and American learners of Korean

    This study examined the linguistic competence in Korean of Korean heritage language learners (HLLs), compared to English-speaking non-heritage language learners (NHLLs) of Korean. It remains unclear and controversial whether heritage languages, to which learners are exposed early but whose acquisition is interrupted, manifest as L1 competence or share more characteristics with L2/FL development. A common misconception holds that HLLs outperform NHLLs in overall language skills, even though Korean HLLs in Korean as a Foreign Language (KFL) classes do not make better progress than NHLLs despite their comparatively stronger aural interpretive abilities. This study was designed to investigate whether HLLs have an advantage over NHLLs in learning the distinctive parametric values of Korean, by comparing the occurrence and sources of grammatical errors exhibited by two groups taking university-level KFL classes. It addresses Korean inflectional morphemes, with a focus on case and postposition markers and affixal connectives. Data were collected through error analysis (EA) of inflectional morpheme errors and their sources in semi-guided and self-generated writing samples, and through a grammaticality judgment in word completion (GJWC) test using the same inflectional morphemes as the EA. Schlyter's Weak Language (WL) as L2 hypothesis, Montrul's WL as L1 hypothesis, and the Missing Surface Inflection Hypothesis (MSIH) provided the theoretical frameworks. The EA data were coded using the Systematic Analysis of Language Transcripts program. The EA and GJWC data were analyzed with a two-way ANOVA and, when there was a significant interaction effect between heritage status and language proficiency level, a one-way ANOVA. The results confirmed Schlyter's hypothesis but supported Montrul's hypothesis in neither the EA nor the GJWC; the MSIH failed to explain the underlying linguistic competence of HLLs.
Significantly higher error rates from omitting required subject and object markers among HLLs suggest that their Korean morphology remains at the level of Korean child morphology. Significantly higher error rates on the instrument marker in the GJWC test among advanced-level HLLs suggest impaired Korean morphology in HLLs. Linguistic variation is more prominent within the HLL group. Findings are further discussed in relation to their theoretical, methodological, and pedagogical implications, and differentiated instructional and curricular approaches for the HLL and NHLL groups are suggested.
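The group comparison described in this abstract can be sketched on invented data: per-learner error rates for omitted subject/object markers, compared across heritage status with a one-way ANOVA (the follow-up test the study applies after a significant interaction). All values below are hypothetical, for illustration only.

```python
# Hypothetical per-learner error rates (%) for omitted subject/object
# markers; a one-way ANOVA tests whether the group means differ.
from scipy.stats import f_oneway

hll_error_rates = [18.2, 22.5, 19.7, 25.1, 21.3, 23.8]   # heritage learners (invented)
nhll_error_rates = [9.4, 11.2, 8.7, 12.5, 10.1, 9.9]      # non-heritage learners (invented)

f_stat, p_value = f_oneway(hll_error_rates, nhll_error_rates)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

With a clear mean difference like this one, the test rejects the null hypothesis of equal group means.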

    SKOPE: A connectionist/symbolic architecture of spoken Korean processing

    Spoken language processing requires speech and natural language integration. Moreover, spoken Korean calls for a unique processing methodology due to its linguistic characteristics. This paper presents SKOPE, a connectionist/symbolic spoken Korean processing engine, which emphasizes that: 1) connectionist and symbolic techniques must be selectively applied according to their relative strengths and weaknesses, and 2) the linguistic characteristics of Korean must be fully considered in phoneme recognition, speech and language integration, and morphological/syntactic processing. The design and implementation of SKOPE demonstrate how connectionist/symbolic hybrid architectures can be constructed for spoken agglutinative language processing. SKOPE also presents many novel ideas for speech and language processing. The phoneme recognition, morphological analysis, and syntactic analysis experiments show that SKOPE is a viable approach to spoken Korean processing.
    Comment: 8 pages, LaTeX, uses aaai.sty & aaai.bst, bibfile: nlpsp.bib; to be presented at the IJCAI-95 workshops on new approaches to learning for natural language processing

    Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean

    A new, tightly coupled speech and natural language integration model is presented for a TDNN-based continuous, possibly large-vocabulary, speech recognition system for Korean. Unlike the popular n-best techniques developed for integrating mainly HMM-based speech recognition and natural language processing at the {\em word level}, which are clearly inadequate for morphologically complex agglutinative languages, our model constructs a spoken language system based on {\em morpheme-level} speech and language integration. With this integration scheme, the spoken Korean processing engine (SKOPE) is designed and implemented using a TDNN-based diphone recognition module integrated with Viterbi-based lexical decoding and symbolic phonological/morphological co-analysis. Our experimental results show that speaker-dependent continuous {\em eojeol} (Korean word) recognition and integrated morphological analysis can be achieved with a success rate of over 80.6% directly from speech inputs for middle-level vocabularies.
    Comment: LaTeX source with a4 style, 15 pages, to be published in the Computer Processing of Oriental Languages journal
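The morpheme-level Viterbi decoding the abstract mentions can be illustrated with a toy dynamic program: segment a string into known morphemes so that the summed log-probability is maximal. The romanized morphemes and their probabilities below are invented stand-ins, not the paper's lexicon.

```python
import math

# Toy morpheme lexicon with unigram log-probabilities (invented values);
# romanized forms stand in for Korean morphemes, e.g. "mek" (eat),
# "ess" (past tense), "ta" (declarative ending).
LEXICON = {"mek": math.log(0.4), "ess": math.log(0.3), "ta": math.log(0.3),
           "mekess": math.log(0.05)}

def viterbi_segment(s):
    """Best-scoring segmentation of s into lexicon morphemes (Viterbi DP)."""
    # best[i] = (best log-prob of s[:i], morpheme path achieving it)
    best = [(0.0, [])] + [(-math.inf, None)] * len(s)
    for i in range(1, len(s) + 1):
        for j in range(i):
            piece = s[j:i]
            if piece in LEXICON and best[j][0] + LEXICON[piece] > best[i][0]:
                best[i] = (best[j][0] + LEXICON[piece], best[j][1] + [piece])
    return best[len(s)][1]

print(viterbi_segment("mekessta"))  # → ['mek', 'ess', 'ta']
```

In the actual system the arc scores would come from the TDNN diphone recognizer rather than unigram counts, but the lattice search has the same shape.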

    Chart-driven Connectionist Categorial Parsing of Spoken Korean

    Most of the speech and natural language systems developed for English and other Indo-European languages neglect morphological processing and integrate speech and natural language at the word level. For agglutinative languages such as Korean and Japanese, however, morphological processing plays a major role in language processing, since these languages have very complex morphological phenomena and relatively simple syntactic functionality. Degenerate morphological processing obviously limits the usable vocabulary size of a system, and a word-level dictionary results in an exponential explosion in the number of dictionary entries. For agglutinative languages we need sub-word-level integration, which leaves room for general morphological processing. In this paper, we develop a phoneme-level integration model of speech and linguistic processing through general morphological analysis for agglutinative languages, together with an efficient parsing scheme for that integration. Korean is modeled lexically, based on the categorial grammar formalism with unordered-argument and suppressed-category extensions, and a chart-driven connectionist parsing method is introduced.
    Comment: 6 pages, Postscript file, Proceedings of ICCPOL'9
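The dictionary-explosion argument above is easy to make concrete: with a handful of stackable suffix slots, the number of fully inflected word forms grows multiplicatively, while a morpheme-level dictionary only stores stems plus suffixes. The counts below are hypothetical, chosen only to show the scale difference.

```python
# Why word-level dictionaries explode for agglutinative languages:
# hypothetical counts of stems and stackable suffix slots (invented).
stems = 10_000
suffixes_per_slot = [20, 15, 10]  # e.g. tense, honorific, sentence-ender slots

word_forms = stems
for n in suffixes_per_slot:
    word_forms *= (n + 1)  # +1 because each slot may also be empty

morpheme_entries = stems + sum(suffixes_per_slot)
print(word_forms, morpheme_entries)  # 36,960,000 word forms vs 10,045 morphemes
```

Sub-word (morpheme-level) integration keeps the dictionary at the additive size while still covering every combined form.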

    Morphological annotation of Korean with Directly Maintainable Resources

    This article describes an exclusively resource-based method of morphological annotation of written Korean text. Korean is an agglutinative language. Our annotator is designed to process text before the operation of a syntactic parser. In its present state, it annotates one-stem words only. The output is a graph of morphemes annotated with accurate linguistic information. The granularity of the tagset is 3 to 5 times higher than that of usual tagsets. A comparison with a reference annotated corpus showed that it achieves 89% recall without any corpus training. The language resources used by the system are lexicons of stems, transducers of suffixes, and transducers for the generation of allomorphs. All can be easily updated, which allows users to control the evolution of the system's performance. It has been claimed that morphological annotation of Korean text could only be performed by a morphological analysis module accessing a lexicon of morphemes. We show that it can also be performed directly with a lexicon of words and without applying morphological rules at annotation time, which speeds up annotation to 1,210 words/s. The lexicon of words is obtained from the maintainable language resources through a fully automated compilation process.
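The "compile offline, look up at annotation time" idea can be sketched as follows, assuming a toy romanized stem lexicon and suffix list (all entries and tags below are invented, not the article's resources): every stem+suffix combination is precomputed into a word lexicon, so annotation itself is a plain dictionary lookup with no morphological rules applied.

```python
# Toy stem lexicon and suffix list with invented tags.
STEMS = {"mek": "V.eat", "cap": "V.catch"}
SUFFIXES = {"ta": "Ef.decl", "ko": "Ef.conn"}

# Offline compilation step: concatenate every stem with every suffix,
# carrying the combined annotation along.
WORD_LEXICON = {stem + suf: f"{s_tag}+{x_tag}"
                for stem, s_tag in STEMS.items()
                for suf, x_tag in SUFFIXES.items()}

def annotate(word):
    """Annotate by direct lookup; no rules run at annotation time."""
    return WORD_LEXICON.get(word, "UNKNOWN")

print(annotate("mekta"))  # → V.eat+Ef.decl
```

A real compilation would also apply the allomorph-generation transducers during the offline step; the point is that all that work happens before annotation, which is why lookup-based annotation is fast.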

    A Syllable-based Technique for Word Embeddings of Korean Words

    Word embedding has become a fundamental component of many NLP tasks such as named entity recognition and machine translation. However, popular models that learn such embeddings are unaware of the morphology of words, so they are not directly applicable to highly agglutinative languages such as Korean. We propose a syllable-based learning model for Korean using a convolutional neural network, in which a word representation is composed of trained syllable vectors. Our model successfully produces morphologically meaningful representations of Korean words compared to the original Skip-gram embeddings. The results also show that it is quite robust to the out-of-vocabulary problem.
    Comment: 5 pages, 3 figures, 1 table. Accepted for the EMNLP 2017 Workshop - The 1st Workshop on Subword and Character Level Models in NLP (SCLeM)
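The composition idea can be sketched minimally: each Hangul character is one syllable, so syllable decomposition is plain character splitting, and a word vector is built from per-syllable vectors. The paper trains a CNN over the syllable vectors; averaging random toy vectors, as below, is a simplified stand-in with invented values, shown only to make the out-of-vocabulary robustness concrete (any unseen word still decomposes into syllables).

```python
import random

random.seed(0)
DIM = 4  # toy embedding dimensionality (invented)

def syllable_vector(syl, _cache={}):
    """Return a fixed random toy vector per syllable (stands in for trained vectors)."""
    if syl not in _cache:
        _cache[syl] = [random.uniform(-1, 1) for _ in range(DIM)]
    return _cache[syl]

def word_vector(word):
    """Compose a word vector by averaging its syllable vectors."""
    vecs = [syllable_vector(s) for s in word]  # one Hangul char = one syllable
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]

vec = word_vector("먹었다")  # "ate": syllables 먹, 었, 다
print(len(vec))  # → 4
```

Because composition works for any syllable sequence, a word never seen in training still receives a representation, which is the mechanism behind the OOV robustness claim.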