6,419 research outputs found
TRANSDUCER FOR AUTO-CONVERT OF ARCHAIC TO PRESENT DAY ENGLISH FOR MACHINE READABLE TEXT: A SUPPORT FOR COMPUTER ASSISTED LANGUAGE LEARNING
There exist English literary works in which archaic words are still used; these words are relatively distinct from Present Day English (PDE). Some archaic words have undergone regular patterns of change: for instance, archaic modal verbs like mightst, darest, and wouldst. The –st ending historically disappears, resulting in might, dare, and would (wouldst > would). However, other archaic words undergo distinct processes, resulting in unpredictable patterns; the occurrence frequency of archaic English pronouns like thee ‘you’, thy ‘your’, and thyself ‘yourself’ is quite high. Students who are non-native speakers of English may face many difficulties when they encounter English texts containing these kinds of archaic words. How might the computer help such students? This paper aims to provide support from the perspective of Computer Assisted Language Learning (CALL). It proposes designs of lexicon transducers using Local Grammar Graphs (LGG) for the automatic conversion of archaic words to PDE in a machine-readable literary text. The transducer is applied to a machine-readable text taken from Sir Walter Scott’s Ivanhoe. The archaic words in the corpus can be converted automatically to PDE. The transducer also allows presentation of the two forms together (archaic and PDE), the PDE lexicons only, or the original (archaic) form only. This will help students understand English literary works better. All the linguistic resources here are machine readable, ready to use, maintainable, and open for further development. The method might also be adopted for lexicon transducers for other languages.
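The abstract's two conversion patterns, a regular –st suffix rule and an irregular pronoun lexicon, can be sketched as a simple token-level transducer. This is a minimal illustration in Python, not the paper's LGG-based implementation; the word lists and the `mode` parameter names are assumptions for the example.

```python
# Minimal sketch of an archaic-to-PDE lexicon transducer.
# The tiny lexicons below are illustrative; the paper's actual resources
# are Local Grammar Graphs applied to the Ivanhoe corpus.
import re

# Irregular archaic forms must be listed explicitly (unpredictable pattern).
LEXICON = {"thee": "you", "thy": "your", "thyself": "yourself", "thou": "you"}

# Regular pattern: modal verbs that lost the historical -st ending.
ST_MODALS = {"mightst": "might", "darest": "dare", "wouldst": "would"}

def convert(token, mode="pde"):
    """Convert one token; mode selects PDE-only, archaic-only, or both forms."""
    pde = LEXICON.get(token.lower()) or ST_MODALS.get(token.lower())
    if pde is None:
        return token                      # not archaic: pass through unchanged
    if mode == "pde":
        return pde
    if mode == "archaic":
        return token
    return f"{token} ({pde})"             # show both forms side by side

def convert_text(text, mode="pde"):
    """Apply the transducer to every word in a machine-readable text."""
    return re.sub(r"[A-Za-z]+", lambda m: convert(m.group(), mode), text)

print(convert_text("Wouldst thou aid thyself?", mode="both"))
```

The three `mode` values mirror the three presentation options the abstract describes: both forms, PDE lexicons only, or the original archaic form only.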
ON MONITORING LANGUAGE CHANGE WITH THE SUPPORT OF CORPUS PROCESSING
One of the fundamental characteristics of language is that it changes over time. One method of monitoring that change is to observe corpora: structured language documentation. Recent developments in technology, especially in the field of Natural Language Processing, allow robust linguistic processing, which supports the description of diverse historical changes in the corpora. The involvement of a human linguist remains indispensable, since the linguist determines the gold standard, but computer assistance provides considerable support by bringing a computational approach to exploring corpora, especially historical corpora. This paper proposes a model for corpus development in which corpora are annotated to support further computational operations such as lexicogrammatical pattern matching, automatic retrieval, and extraction. The corpus processing operations are performed by local-grammar-based corpus processing software on a contemporary Indonesian corpus. This paper concludes that data collection and data processing in a corpus are equally crucial for monitoring language change, and neither can be set aside.
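One of the operations the abstract names, lexicogrammatical pattern matching over time-sliced corpora, can be illustrated with a short sketch. The toy English subcorpora and the pattern below are invented for the example; the paper itself works on an annotated contemporary Indonesian corpus with local-grammar-based software.

```python
# Hedged sketch: retrieve a lexical pattern from two time-sliced subcorpora
# and compare relative frequencies to observe a shift. The corpus contents
# here are invented toy data, not the paper's Indonesian corpus.
import re
from collections import Counter

corpora = {
    "older": "the pupils study hard . the pupils read books .",
    "newer": "the students study hard . the students read books and blogs .",
}

# A simple lexicogrammatical pattern: two competing lexical variants.
pattern = re.compile(r"\b(pupils|students)\b")

def relative_freq(text, pattern):
    """Relative frequency of each matched form per token of the subcorpus."""
    tokens = text.split()
    hits = Counter(m.group() for m in pattern.finditer(text))
    return {word: n / len(tokens) for word, n in hits.items()}

for period, text in corpora.items():
    print(period, relative_freq(text, pattern))
```

Comparing the per-period frequencies is the kind of computational support the paper argues for: the linguist still sets the gold standard, but retrieval and counting are automated.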
Recognition techniques for online Arabic handwriting recognition systems
Online recognition of Arabic handwritten text has been an ongoing research problem for many years. The field of online text recognition in general has been gaining more interest lately due to the increasing popularity of hand-held computers, digital notebooks, and advanced cellular phones. Various techniques have been used to build online handwritten recognition systems for Arabic text, such as Neural Networks, Hidden Markov Models, Template Matching, and others. Most research on online text recognition divides the recognition system into three main phases: a preprocessing phase, a feature extraction phase, and a recognition phase, the last of which is considered the most important phase and the heart of the whole system. This paper presents and compares techniques that have been used to recognize Arabic handwriting in online recognition systems. These techniques attempt to recognize Arabic handwritten words, characters, digits, or strokes. The structure and strategy of the reviewed techniques are explained in this article, and the strengths and weaknesses of each are also discussed.
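The three-phase pipeline the survey describes can be sketched end to end with one of the techniques it lists, template matching. Everything below is an invented toy example, assuming strokes arrive as (x, y) point sequences; real systems use far richer features and templates.

```python
# Illustrative sketch of the three phases: preprocessing, feature
# extraction, and recognition (template matching via nearest neighbour).
# Stroke data and labels are invented toy examples.
import math

def preprocess(stroke):
    """Phase 1: normalize a stroke (list of (x, y) points) to the unit box."""
    xs, ys = zip(*stroke)
    w = max(max(xs) - min(xs), 1e-9)
    h = max(max(ys) - min(ys), 1e-9)
    return [((x - min(xs)) / w, (y - min(ys)) / h) for x, y in stroke]

def features(stroke):
    """Phase 2: direction-angle sequence between consecutive points."""
    pts = preprocess(stroke)
    return [math.atan2(y2 - y1, x2 - x1)
            for (x1, y1), (x2, y2) in zip(pts, pts[1:])]

def distance(f1, f2):
    """Mean absolute angle difference over the shared prefix."""
    n = min(len(f1), len(f2))
    return sum(abs(a - b) for a, b in zip(f1[:n], f2[:n])) / max(n, 1)

def recognize(stroke, templates):
    """Phase 3: template matching, nearest labelled template wins."""
    feats = features(stroke)
    return min(templates, key=lambda label: distance(feats, templates[label]))

# Toy templates: a horizontal and a vertical stroke.
templates = {"horizontal": features([(0, 0), (1, 0), (2, 0)]),
             "vertical":   features([(0, 0), (0, 1), (0, 2)])}
print(recognize([(0, 0), (3, 0.1), (6, 0.1)], templates))
```

Swapping phase 3 for a Hidden Markov Model or a neural classifier, while keeping phases 1 and 2, is essentially the design space the surveyed systems explore.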
Toward a Reconceptualization of the Integration of Culture and Language in the Korean EFL Classroom.
Many Korean EFL (English as a Foreign Language) students do not have sufficient opportunity to learn cultural knowledge and information in their classrooms. EFL teachers also tend to neglect the teaching of culture. Even when culture is taught, instruction simply delivers fact-only information rather than building cultural awareness by comparing the native culture with the target culture. Target cultural knowledge and information should be delivered within the native cultural frame, and the teaching of culture must be an integral part of teaching and learning English. For the effective integration of culture in EFL classes, beyond conveying fact-only information, this research advocates de-emphasizing cultural inequality, English-only instruction, linguistic-oriented instruction, and unoism stemming from a single cultural perspective. The research was guided by five questions: for each of the four independent variables (cultural inequality, English-only instruction, linguistic-oriented instruction, and unoism), what is the degree of the variable and its relation to the integration of culture; for the dependent variable (integration of culture), what is the degree of the variable; and what is the order of priority among the independent variables as they affect the integration of culture? The research methodology was qualitative and quantitative in design. Quantitative data were gathered from 83 Korean EFL teachers and 286 EFL students by questionnaire. Qualitative data were gathered from free written remarks from teachers and students, interviews with 13 EFL teachers (both native and non-native speakers), and classroom observations of the 13 EFL teachers who completed the interviews. Findings indicated that three of the independent variables (cultural inequality, English-only instruction, and unoism) were significantly and inversely related to the integration of culture.
However, EFL teachers and students argued in the qualitative data that linguistic-oriented instruction should also be de-emphasized in the classroom. Four pedagogical implications emerged: (1) intercultural equality from the viewpoint of currere, as an alternative to cultural inequality; (2) bilingual instruction in the EFL classroom as an alternative to English-only instruction; (3) integration of culture and language as an alternative to linguistic-oriented instruction; and finally (4) multicultural perspectives based on cross-cultural understanding as an alternative to unoism. Eleven basic applications are suggested: (1) a globalized or localized EFL curriculum, (2) cultural instruction through comparing and contrasting the native culture with the target culture within the native cultural frame, (3) open discussion, (4) literacy, (5) teacher training programs, (6) a curriculum for cultural integration, (7) English texts, (8) creativity, (9) de-centering of binary opposites, (10) de-emphasis of the fact-only approach, and (11) portfolio assessment.
A Corpus-Based Investigation of Definite Description Use
We present the results of a study of definite description use in written
texts aimed at assessing the feasibility of annotating corpora with information
about definite description interpretation. We ran two experiments, in which
subjects were asked to classify the uses of definite descriptions in a corpus
of 33 newspaper articles, containing a total of 1412 definite descriptions. We
measured the agreement among annotators about the classes assigned to definite
descriptions, as well as the agreement about the antecedent assigned to those
definites that the annotators classified as being related to an antecedent in
the text. The most interesting result of this study from a corpus annotation
perspective was the rather low agreement (K=0.63) that we obtained using
versions of Hawkins' and Prince's classification schemes; better results
(K=0.76) were obtained using the simplified scheme proposed by Fraurud that
includes only two classes, first-mention and subsequent-mention. The agreement
about antecedents was also not complete. These findings raise questions
concerning the strategy of evaluating systems for definite description
interpretation by comparing their results with a standardized annotation. From
a linguistic point of view, the most interesting observations were the great
number of discourse-new definites in our corpus (in one of our experiments,
about 50% of the definites in the collection were classified as discourse-new,
30% as anaphoric, and 18% as associative/bridging) and the presence of
definites which did not seem to require a complete disambiguation.
Comment: 47 pages, uses fullname.sty and palatino.st
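The agreement figures the abstract reports (K=0.63, K=0.76) are chance-corrected kappa statistics. As a hedged illustration of how such a coefficient is computed, here is a two-annotator Cohen's kappa on invented toy labels using Fraurud's two-class scheme; the study's own K statistic and data may differ in detail.

```python
# Hedged sketch of a chance-corrected agreement statistic (Cohen's kappa)
# for two annotators; the labels below are invented toy data, not the
# study's 1412 definite descriptions.
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Kappa = (observed agreement - chance agreement) / (1 - chance)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    expected = sum(ca[k] * cb[k] for k in ca) / (n * n)
    return (observed - expected) / (1 - expected)

# Toy annotations under the simplified two-class scheme.
a = ["first-mention", "subsequent-mention", "first-mention", "first-mention"]
b = ["first-mention", "subsequent-mention", "subsequent-mention", "first-mention"]
print(round(cohen_kappa(a, b), 2))  # -> 0.5
```

Values near 1 indicate near-perfect agreement; the study's point is that coarser class schemes (two classes rather than Hawkins' or Prince's finer ones) push kappa upward.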