Finite-state description, developing mental awareness
In this article, we discuss finite-state description practices that must be instilled in the developer. Our observations are accompanied by references to concrete experiences with different languages and their description. We contend that finite-state description of languages leads to development in the describer-developer. This presupposes regular interaction with developers of upstream and downstream technologies. As more languages are described, the developer learns what to choose as a starting point, hopefully with the help of a researcher, research documentation, or a native speaker well versed in the workings of the language. We maintain that finite-state work should serve more than one purpose or audience, and that, as linguists, we should be raising the bar by applying the knowledge of research to description, so that our understanding of the linguistic phenomena can be attested by others or proven false. We provide a methodology for repeatable experimentation and rule making. Each language provides something unique while sharing some recognizable features with other languages. We stress the need to avoid generating characters from epsilons and offer examples where it is possible to write rules that reduce characters to epsilons instead. We also stress the need to describe the predictable infinite set of all native phenomena, whereas the unknown and random qualities introduced through language contact cannot form a foundation for our descriptions. Finally, we call for a playful approach to phenomena in a language, because that might bring us closer to how a child would learn the language: through repetition, mistakes and self-correction.
Conference Program
Proceedings of the 17th Nordic Conference of Computational Linguistics
NODALIDA 2009.
Editors: Kristiina Jokinen and Eckhard Bick.
NEALT Proceedings Series, Vol. 4 (2009), xi-xiv.
© 2009 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/9206
Suoidne-varra-bleahkka-mála-bihkka-senet-dielku 'hay-blood-ink-paint-tar-mustard-stain' - Should compounds be lexicalized in NLP?
Source at http://ceur-ws.org/Vol-2769/paper_49.pdf. CEUR Workshop Proceedings home page at http://ceur-ws.org/Vol-2769/. Lexicalizing compounds, in addition to treating them dynamically, is a key element in producing idiomatic translations and detecting compound errors. We present and evaluate an e-dictionary (NDS) and a grammar checker (GramDivvun) for North Sámi. We achieve a coverage of 98% for NDS queries and of 96% for compound error detection in GramDivvun.
Contents
Proceedings of the 17th Nordic Conference of Computational Linguistics
NODALIDA 2009.
Editors: Kristiina Jokinen and Eckhard Bick.
NEALT Proceedings Series, Vol. 4 (2009), iii-vi.
Towards improving English-Latvian translation: a system comparison and a new rescoring feature
This paper presents a comparative study of two alternative approaches to statistical machine translation (SMT) and their application to English-to-Latvian translation. Furthermore, a novel feature intended to reflect the relatively free word order of Latvian is proposed and successfully applied in the n-best list rescoring step. Moving beyond the automatic translation-quality scores classically presented in MT research papers, we also contribute a manual error analysis of the MT systems' output that helps shed light on the advantages and disadvantages of the SMT systems under consideration.
Conference Program
Proceedings of the 18th Nordic Conference of Computational Linguistics
NODALIDA 2011.
Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa.
NEALT Proceedings Series, Vol. 11 (2011), xii-xvii.
© 2011 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/16955
Syntactic indicators of language acquisition levels in English and French written language learner corpora
There is no abstract available for this language.