Chart-driven Connectionist Categorial Parsing of Spoken Korean
While most speech and natural language systems developed for English and other
Indo-European languages neglect morphological processing and integrate speech
and natural language at the word level, for agglutinative languages such as
Korean and Japanese, morphological processing plays a major role in language
processing, since these languages have very complex morphological phenomena and
relatively simple syntactic functionality. Degenerate morphological processing
obviously limits a system's usable vocabulary size, and a word-level dictionary
results in an exponential explosion in the number of dictionary entries. For
agglutinative languages, we need sub-word-level integration, which leaves room
for general morphological processing. In this paper, we develop a phoneme-level
integration model of speech and linguistic processing through general
morphological analysis for agglutinative languages, and an efficient parsing
scheme for that integration. Korean is modeled lexically, based on the
categorial grammar formalism with unordered-argument and suppressed-category
extensions, and a chart-driven connectionist parsing method is introduced.
Comment: 6 pages, Postscript file, Proceedings of ICCPOL'9
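The abstract does not spell out the parsing machinery; the following is a minimal sketch of chart parsing over categorial-grammar categories with backward functional application (natural for head-final Korean). The lexicon entries and category strings are invented for illustration, and the paper's unordered-argument and suppressed-category extensions, as well as the connectionist component, are not modeled here.

```python
def combine(left, right):
    """Backward application: Y + X\\Y => X.
    Categories are plain strings; "S\\NP\\NP" reads left-associatively
    as ((S\\NP)\\NP), so we split off the rightmost argument."""
    if "\\" in right:
        result, arg = right.rsplit("\\", 1)
        if arg == left:
            return result
    return None

def chart_parse(tokens, lexicon):
    """CKY-style chart: chart[i][j] holds the categories spanning tokens i..j."""
    n = len(tokens)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, tok in enumerate(tokens):          # seed the chart from the lexicon
        chart[i][i + 1] = set(lexicon[tok])
    for span in range(2, n + 1):              # grow spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for mid in range(i + 1, j):
                for a in chart[i][mid]:
                    for b in chart[mid][j]:
                        c = combine(a, b)
                        if c is not None:
                            chart[i][j].add(c)
    return chart[0][n]

# Toy transliterated SOV example (hypothetical lexicon):
lex = {"mary-ka": ["NP"], "john-ul": ["NP"], "salanghanta": ["S\\NP\\NP"]}
parses = chart_parse(["mary-ka", "john-ul", "salanghanta"], lex)
```

The object combines with the verb first (NP + S\NP\NP => S\NP), then the subject completes the sentence, so `parses` contains `"S"`.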
Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean
A new tightly coupled speech and natural language integration model is
presented for a TDNN-based continuous, possibly large-vocabulary speech
recognition system for Korean. Unlike the popular n-best techniques developed
for integrating mainly HMM-based speech recognition and natural language
processing at the {\em word level}, which is clearly inadequate for
morphologically complex agglutinative languages, our model constructs a spoken
language system based on {\em morpheme-level} speech and language integration.
With this integration scheme, the spoken Korean processing engine (SKOPE) is
designed and implemented using a TDNN-based diphone recognition module
integrated with Viterbi-based lexical decoding and symbolic
phonological/morphological co-analysis. Our experimental results show that
speaker-dependent continuous {\em eojeol} (Korean word) recognition and
integrated morphological analysis can be achieved with a success rate of over
80.6% directly from speech input for middle-level vocabularies.
Comment: latex source with a4 style, 15 pages, to be published in computer processing of oriental language journa
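The abstract names Viterbi-based lexical decoding over TDNN outputs as the bridge between speech and morphology. A minimal, hypothetical sketch of that idea follows: per-frame phoneme log-scores stand in for TDNN output, and dynamic programming finds the best segmentation into dictionary morphemes. SKOPE's actual diphone lattice and phonological rules are not modeled.

```python
def viterbi_decode(frames, lexicon):
    """Best segmentation of a phoneme-frame sequence into lexicon morphemes.

    frames  : list of {phoneme: log_score} dicts (stand-in for TDNN scores,
              one frame per phoneme position)
    lexicon : {morpheme: tuple_of_phonemes}
    Returns (morpheme_list, total_log_score).
    """
    n = len(frames)
    NEG = float("-inf")
    best = [NEG] * (n + 1)    # best[i]: best score covering frames[:i]
    back = [None] * (n + 1)   # back[i]: (previous index, morpheme used)
    best[0] = 0.0
    for i in range(1, n + 1):
        for morph, phones in lexicon.items():
            k = len(phones)
            if k > i or best[i - k] == NEG:
                continue
            score, ok = 0.0, True
            for j, p in enumerate(phones):    # score morpheme against frames
                s = frames[i - k + j].get(p)
                if s is None:
                    ok = False
                    break
                score += s
            if ok and best[i - k] + score > best[i]:
                best[i] = best[i - k] + score
                back[i] = (i - k, morph)
    morphs, i = [], n                          # trace back the best path
    while i > 0 and back[i] is not None:
        i, m = back[i]
        morphs.append(m)
    return morphs[::-1], best[n]

# Toy example with invented phonemes and morphemes:
frames = [{"h": -0.1}, {"a": -0.1}, {"k": -0.2}, {"o": -0.1}]
morphs, score = viterbi_decode(frames, {"ha": ("h", "a"), "ko": ("k", "o")})
```

Here `morphs` comes back as `["ha", "ko"]`, i.e. decoding and morpheme segmentation fall out of the same dynamic program.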
SKOPE: A connectionist/symbolic architecture of spoken Korean processing
Spoken language processing requires the integration of speech and natural
language. Moreover, spoken Korean calls for a unique processing methodology due
to its linguistic characteristics. This paper presents SKOPE, a
connectionist/symbolic spoken Korean processing engine, which emphasizes that:
1) connectionist and symbolic techniques must be selectively applied according
to their relative strengths and weaknesses, and 2) the linguistic
characteristics of Korean must be fully considered in phoneme recognition,
speech and language integration, and morphological/syntactic processing. The
design and implementation of SKOPE demonstrate how connectionist/symbolic
hybrid architectures can be constructed for spoken agglutinative language
processing. SKOPE also presents many novel ideas for speech and language
processing. Phoneme recognition, morphological analysis, and syntactic analysis
experiments show that SKOPE is a viable approach to spoken Korean processing.
Comment: 8 pages, latex, use aaai.sty & aaai.bst, bibfile: nlpsp.bib, to be
presented at IJCAI95 workshops on new approaches to learning for natural
language processin
Treebank-based acquisition of wide-coverage, probabilistic LFG resources: project overview, results and evaluation
This paper presents an overview of a project to acquire wide-coverage, probabilistic Lexical-Functional Grammar
(LFG) resources from treebanks. Our approach is based on an automatic annotation algorithm that annotates 'raw' treebank trees with LFG f-structure information approximating basic predicate-argument/dependency structure. From the f-structure-annotated treebank
we extract probabilistic unification grammar resources. We present the annotation algorithm, the extraction of
lexical information and the acquisition of wide-coverage and robust PCFG-based LFG approximations including
long-distance dependency resolution.
We show how the methodology can be applied to multilingual, treebank-based unification grammar acquisition. Finally
we show how simple (quasi-)logical forms can be derived automatically from the f-structures generated for the treebank trees.
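The abstract describes annotating treebank trees with f-structure information approximating predicate-argument structure. A toy sketch of that idea, using two hand-written head rules on bracketed trees; this is an invented illustration, not the project's automatic annotation algorithm, which derives its equations from treebank configurations at scale.

```python
def fstructure(node):
    """Build a nested-dict f-structure from a tree of the form
    (label, children) or (label, word). Toy head rules:
    S -> NP VP maps the NP to SUBJ; VP -> V NP maps the NP to OBJ."""
    label, rest = node
    if isinstance(rest, str):                  # lexical leaf: PRED feature
        return {"PRED": rest}
    if label == "S" and len(rest) == 2:        # S -> NP VP
        np, vp = rest
        f = fstructure(vp)                     # VP is the functional head
        f["SUBJ"] = fstructure(np)
        return f
    if label == "VP" and len(rest) == 2:       # VP -> V NP
        v, np = rest
        f = fstructure(v)
        f["OBJ"] = fstructure(np)
        return f
    return fstructure(rest[0])                 # default: first child is head

# A 'raw' treebank tree, annotated on the fly:
tree = ("S", [("NP", "John"), ("VP", [("V", "saw"), ("NP", "Mary")])])
f = fstructure(tree)
```

The resulting `f` has `PRED` "saw" with `SUBJ` and `OBJ` substructures, i.e. the basic predicate-argument skeleton from which a (quasi-)logical form could then be read off.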
Research in the Language, Information and Computation Laboratory of the University of Pennsylvania
This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania.
It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as: Combinatory Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition.
Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue it's easier than ever to do so: this document is accessible on the "information superhighway". Just call up http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html
In addition, you can find many of the papers referenced in the CLiFF Notes on the net. Most can be obtained by following links from the authors' abstracts in the web version of this report.
The abstracts describe the researchers' many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope the new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn.
Modal Markers in Japanese: A Study of Learners' Use before and after Study Abroad
Japanese discourse requires speakers to index, in a relatively explicit manner, their stance toward the propositional information as well as the hearer. This is done, among other things, by means of a grammaticalized set of modal markers. Although previous research suggests that the use of modal expressions by second language learners differs from that of native users, little is known about "typical" native or non-native behavior. This study aims (a) to delineate native and non-native usage by a quantitative examination of a broad range of Japanese modal categories, and qualitative analyses of a subset of potentially problematic categories among them, and (b) to identify possible developmental trajectories, by means of a longitudinal observation of learners' verbal production before and after study abroad in Japan. We find that modal categories realized by non-transparent or non-salient markers (e.g., explanatory modality no da, or utterance modality sentence-final particles) pose particular challenges in spite of their relatively high availability in the input, and we discuss this finding in terms of processing constraints that arguably affect learners' acquisition of the grammaticalized modal markers.
Investigating L1 Arabic and L1 Korean Acquisition of the Passive Voice in L2 English
This thesis investigates how learners from specific first language (L1) groups, Arabic and Korean, use the passive voice in English, their second language (L2). This study analyzes both spoken and written classroom data from English language learners (ELLs), six L1 Korean learners and six L1 Arabic learners, over the course of three semesters at an intensive English program (IEP) in the United States. The main goals of the analysis are to identify, categorize and quantify the errors the learners make when they use the passive voice. The results indicate that there are general obstacles that all of the ELLs face, as well as patterns of use specific to each L1 group. The key finding for the L1 Korean learners is that their most common error is using the passive voice when they should use the active voice. The key finding for the L1 Arabic learners is that their most common error is not using an auxiliary verb. However, lexical learning and other common errors, such as incorrectly conjugating the auxiliary verb and past participle, and certain patterns of use, such as rarely including by-phrases, are evident in both groups. This study also found that passivizing intransitive verbs, an error thought to commonly plague ELLs when they learn the passive voice, did not occur in the participants' production data. In light of the results, this thesis offers suggestions for future research and instructional practices in the Teaching English to Speakers of Other Languages (TESOL) field.