
5,399 research outputs found

ON MONITORING LANGUAGE CHANGE WITH THE SUPPORT OF CORPUS PROCESSING

Author: Prihantoro Prihantoro
Publication date: 05/07/2012

One of the fundamental characteristics of language is that it changes over time. One way to monitor this change is to observe corpora, that is, structured language documentation. Recent developments in technology, especially in the field of Natural Language Processing, allow robust linguistic processing, which supports the description of diverse historical changes in corpora. The intervention of human linguists is inevitable, as they determine the gold standard, but computer assistance provides considerable support by bringing a computational approach to the exploration of corpora, especially historical corpora. This paper proposes a model for corpus development in which corpora are annotated to support further computational operations such as lexicogrammatical pattern matching, automatic retrieval, and extraction. The corpus processing operations are performed by local-grammar-based corpus processing software on a contemporary Indonesian corpus. This paper concludes that data collection and data processing are equally crucial for monitoring language change, and neither can be set aside.

Diponegoro University Institutional Repository

Joining hands: developing a sign language machine translation system with and for the deaf community

Author: Morrissey Sara
Way Andy
Publication date: 01/01/2007

This paper discusses the development of an automatic machine translation (MT) system for translating spoken language text into signed languages (SLs). The motivation for our work is to improve the accessibility of airport information announcements for D/deaf and hard of hearing people. This paper demonstrates the involvement of Deaf colleagues and members of the D/deaf community in Ireland in three areas of our research: the choice of a domain for automatic translation that has a practical use for the D/deaf community; the human translation of English text into Irish Sign Language (ISL), as well as advice on ISL grammar and linguistics; and the importance of native ISL signers as manual evaluators of our translated output.

CiteSeerX

Irish Universities

DCU Online Research Access Service

ClinkNotes: Towards a Corpus-Based, Machine-Aided Programme of Translation Teaching

Author: Yip Po-Ching
Zhu Chunshen
Publication venue: 'Consortium Erudit'
Publication date: 01/01/2010

This article presents a report on a pilot project designed to construct a platform for large-scale teaching of translation or bilingual training at tertiary level. The programme, ClinkNotes, has the potential to accommodate parallel corpora of any language pair, although the primary data used in this project are in English and Chinese. The report begins with a brief overview of the development of the corpus-based approach to translation studies in relation to that of translation teaching as a profession. It then describes the actual design (i.e., the theoretical framework, the methodology of annotation, and the simple execution of the software programme) and how it helps to cater to the pressing needs of the profession. The prospects for further development of the programme are also discussed.

Crossref

Érudit

The 'Humour' element in engineering lectures across cultures: An approach to pragmatic annotation

Author: Alsop Sian
Publication venue: 'Brill'
Publication date: 04/07/2016

Coventry University Pure Portal

Elaboration of a RST Chinese Treebank

Author: Cao Shuyuan
Publication date: 20/03/2018

As a subfield of Artificial Intelligence (AI), Natural Language Processing (NLP) aims to process human languages automatically. Studies from a variety of research fields have produced fruitful results for NLP. Among these fields, discourse analysis is becoming more and more popular, and discourse information is crucial for NLP studies. As the most spoken language in the world, Chinese occupies a very important position in NLP analysis. Therefore, this work aims to present a discourse treebank for Chinese, whose theoretical framework is Rhetorical Structure Theory (RST) (Mann and Thompson, 1988). In this work, 50 Chinese texts form the research corpus, and the corpus can be consulted at three levels: segmentation, central unit (CU), and discourse structure. Finally, we create an open online interface for the Chinese treebank.

Archivo Digital para la Docencia y la Investigación