7,585 research outputs found

Writing and literacy in Indonesia

Author: Lowenberg Peter
Publication date: 01/01/2000

Published or submitted for publication; is peer reviewed

Illinois Digital Environment for Access to Learning and Scholarship Repository

ON MONITORING LANGUAGE CHANGE WITH THE SUPPORT OF CORPUS PROCESSING

Author: Prihantoro Prihantoro
Publication date: 05/07/2012

One of the fundamental characteristics of language is that it changes over time. One method of monitoring this change is to observe its corpora, that is, structured language documentation. Recent developments in technology, especially in the field of Natural Language Processing, allow robust linguistic processing, which supports the description of diverse historical changes in the corpora. The intervention of human linguists is inevitable, as it determines the gold standard, but computer assistance provides considerable support by incorporating a computational approach to exploring the corpora, especially historical corpora. This paper proposes a model for corpus development in which corpora are annotated to support further computational operations such as lexicogrammatical pattern matching, automatic retrieval, and extraction. The corpus processing operations are performed by local-grammar-based corpus processing software on a contemporary Indonesian corpus. This paper concludes that data collection and data processing in a corpus are equally crucial to monitoring language change, and neither can be set aside.

Diponegoro University Institutional Repository

Natural language processing

Author: Adams
Amsler
Bangalore
Barker
Benoît
Bian
Bondale
Carrick
Ceric
Chandrasekar
Chang
Charniak
Chen
Chowdhury
Chowdhury
Costantino
Cowie
Craven
Craven
Craven
Dogru
Evans
Feldman
Fernandez
Gaizauskas
Glasgow
Haas
Hayes
Hayes
Hedlund
Herath
Ide
Isahara
Jelinek
Jeong
Jurafsky
Kazakov
Kehler
Khoo
Kim
King
Lange
Lee
Lehmam
Lehtokangas
Lewis
Liddy
Liddy
Lovis
Ma
Magnini
Mani
Manning
Marquez
Martinez
Martinez
McMurchie
Meyer
Mihalcea
Mock
Moens
Morin
Narita
Nerbonne
Oard
Ogura
Oudet
Owei
Paris
Pasero
Pedersen
Perez-Carballo
Petreley
Pirkola
Poesio
Rosenfield
Roux
Say
Scarlett
Schenker
Silber
Smeaton
Smeaton
Smith
Sokol
Song
Sparck Jones
Staab
Stock
Tolle
Trybula
Tsuda
Vickery
Waldrop
Warner
Weigard
Wilks
Wong
Yang
Yang
Zadrozny
Zweigenbaum
Publication venue: Wiley
Publication date: 01/01/2003

Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

Crossref

University of Strathclyde Institutional Repository

OPUS - University of Technology Sydney

Translation into any natural language of the error messages generated by any computer program

Author: Roehner Bertrand
Publication date: 20/08/2015

Since the introduction of the Fortran programming language some 60 years ago, there has been little progress in making error messages more user-friendly. A first step in this direction is to translate them into the natural language of the students. In this paper we propose a simple script for Linux systems which gives word-by-word translations of error messages. It works for most programming languages and for all natural languages. Understanding the error messages generated by compilers is a major hurdle for students who are learning programming, particularly for non-native English speakers. Not only may they never become "fluent" in programming, but many give up programming altogether. Whereas programming is a tool which can be useful in many human activities, e.g. history, genealogy, astronomy, entomology, in many countries the skill of programming remains confined to a narrow fringe of professional programmers. In all societies, besides professional violinists there are also amateurs. It should be the same for programming. It is our hope that, once translated and explained, the error messages will be seen by the students as an aid rather than as an obstacle, and that in this way more students will enjoy learning and practising programming. They should see it as a fun game. Comment: 14 pages, 1 figure

arXiv.org e-Print Archive

Hal-Diderot
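
The word-by-word translation idea described in the abstract above can be illustrated with a minimal sketch. This is not the paper's script: the glossary contents, the English-to-French direction, and the names `GLOSSARY` and `translate_message` are assumptions made purely for illustration.

```python
import re

# Hypothetical English -> French glossary; the paper's actual word list is not
# reproduced here, so these entries are illustrative assumptions only.
GLOSSARY = {
    "error": "erreur",
    "undefined": "indéfinie",
    "variable": "variable",
    "file": "fichier",
    "not": "non",
    "found": "trouvé",
}


def translate_message(message, glossary=GLOSSARY):
    """Translate an error message word by word.

    Alphabetic words are looked up case-insensitively in the glossary.
    Unknown words, punctuation, and numbers are left untouched.
    """
    def swap(match):
        word = match.group(0)
        return glossary.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", swap, message)


if __name__ == "__main__":
    print(translate_message("error: undefined variable 'x'"))
```

Because the lookup works on single words, word order and grammar follow the original English message; the output is a reading aid rather than a fluent translation.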

The BURCHAK corpus: a Challenge Data Set for Interactive Learning of Visually Grounded Word Meanings

Author: Eshghi Arash
Lemon Oliver Joseph
Mills Gregory
Yu Yanchao
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2017

We motivate and describe a new freely available human-human dialogue dataset for interactive learning of visually grounded word meanings through ostensive definition by a tutor to a learner. The data has been collected using a novel, character-by-character variant of the DiET chat tool (Healey et al., 2003; Mills and Healey, submitted) with a novel task, where a Learner needs to learn invented visual attribute words (such as "burchak" for square) from a tutor. As such, the text-based interactions closely resemble face-to-face conversation and thus contain many of the linguistic phenomena encountered in natural, spontaneous dialogue. These include self- and other-correction, mid-sentence continuations, interruptions, overlaps, fillers, and hedges. We also present a generic n-gram framework for building user (i.e. tutor) simulations from this type of incremental data, which is freely available to researchers. We show that the simulations produce outputs that are similar to the original data (e.g. 78% turn match similarity). Finally, we train and evaluate a Reinforcement Learning dialogue control agent for learning visually grounded word meanings, trained from the BURCHAK corpus. The learned policy shows comparable performance to a rule-based system built previously. Comment: 10 pages, The 6th Workshop on Vision and Language (VL'17)

arXiv.org e-Print Archive

Heriot Watt Pure

Crossref

GenERRate: generating errors for use in grammatical error detection

Author: Andersen Øistein E.
Foster Jennifer
Publication venue: The Association for Computational Linguistics
Publication date: 01/01/2009

This paper explores the issue of automatically generated ungrammatical data and its use in error detection, with a focus on the task of classifying a sentence as grammatical or ungrammatical. We present an error generation tool called GenERRate and show how GenERRate can be used to improve the performance of a classifier on learner data. We describe initial attempts to replicate Cambridge Learner Corpus errors using GenERRate.

CiteSeerX

Irish Universities

DCU Online Research Access Service
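
The general idea in the abstract above, generating ungrammatical sentences automatically to train a grammatical/ungrammatical classifier, can be sketched as follows. This is not GenERRate itself: the single word-deletion operation and the function names `delete_word` and `make_training_pairs` are assumptions chosen for illustration.

```python
import random


def delete_word(tokens, rng):
    """Return a copy of tokens with one randomly chosen word removed.

    Sentences with fewer than two tokens are returned unchanged.
    """
    if len(tokens) < 2:
        return list(tokens)
    i = rng.randrange(len(tokens))
    return tokens[:i] + tokens[i + 1:]


def make_training_pairs(sentences, seed=0):
    """Build labelled data: each original sentence plus one corrupted copy.

    Assumption: deleting a word usually, though not always, yields an
    ungrammatical sentence, so the "ungrammatical" labels are noisy.
    """
    rng = random.Random(seed)
    data = []
    for sentence in sentences:
        tokens = sentence.split()
        data.append((sentence, "grammatical"))
        corrupted = delete_word(tokens, rng)
        if corrupted != tokens:
            data.append((" ".join(corrupted), "ungrammatical"))
    return data


if __name__ == "__main__":
    for text, label in make_training_pairs(["the cat sat on the mat"]):
        print(label, "|", text)
```

The labelled pairs could then be fed to any sentence classifier. A fuller error generator would presumably offer more error types than deletion, but which ones is not stated in the abstract.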

An examination of the suitability of a pluricentric model of English language teaching for primary education in Indonesia

Author: Adityarini Hepy
Publication venue: Curtin University
Publication date: 01/01/2014

The study examined the suitability of a pluricentric model of ELT, which accommodates local varieties of English, for primary education in Indonesia. The majority of participants in the study strongly supported the adoption of a pluricentric model of English language instruction. However, whether their positive attitudes would affect ELT pedagogy was not clear, since there were many complex issues impacting on the adoption of this approach in Indonesia.

espace@Curtin

On the Web Communication Assist Aide based on the Bilingual Sign Language Dictionary

Author: 垣花京子
鈴木恵美子
Publication venue: Institute of Linguistics, Academia Sinica
Publication date: 01/01/2005

PACLIC 19 / Taipei, Taiwan / December 1-3, 2005

Waseda University Repository

The Pronunciation Problems among Kurdish Learners of English

Author: Ghafar Zanyar Nathir
Kurt Mustafa
Popescu Doina
Publication venue: AMO Publisher
Publication date: 28/02/2023

The goal of this study was to examine the pronunciation issues of different speakers of English, especially Kurdish speakers, and various perspectives on native vs. foreign pronunciations. The research showed that Kurdish speakers had difficulties pronouncing several English vowels and some English consonants. The results demonstrate that Kurdish speakers of English understand the value of pronunciation compared to native and non-native English speakers. Kurdish speakers may hesitate to speak in a manner that seems natural to a native speaker, and their word-final consonants are almost always unaspirated and unvoiced. Given that Kurdish learners of English have difficulty pronouncing some English words, suggested solutions include providing pronunciation instruction classes to language instructors, having educators speak in English, and giving students examples of native-tongue sounds compared and contrasted with the target-language sounds. With minimal exposure to cooperation with native speakers and variations in the L1's phonological organization compared to English, the difficulty posed by pronunciation is evident. All the updated studies clearly show that these issues affect English speakers in general and rely less and less on their original tongue.

ZENODO

European Journal of Theoretical and Applied Sciences

Multilinguals and Wikipedia Editing

Author: Hale Scott A.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2014

This article analyzes one month of edits to Wikipedia in order to examine the role of users editing multiple language editions (referred to as multilingual users). Such multilingual users may serve an important function in diffusing information across different language editions of the encyclopedia, and prior work has suggested this could reduce the level of self-focus bias in each edition. This study finds multilingual users are much more active than their single-edition (monolingual) counterparts. They are found in all language editions, but smaller-sized editions with fewer users have a higher percentage of multilingual users than larger-sized editions. About a quarter of multilingual users always edit the same articles in multiple languages, while just over 40% of multilingual users edit different articles in different languages. When non-English users do edit a second language edition, that edition is most frequently English. Nonetheless, several regional and linguistic cross-editing patterns are also present.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Oxford University Research Archive