49 research outputs found
Towards cross-lingual alerting for bursty epidemic events
Background: Online news reports are increasingly becoming a source for event
based early warning systems that detect natural disasters. Harnessing the
massive volume of information available from multilingual newswire presents as
many challenges as opportunities due to the patterns of reporting complex
spatiotemporal events. Results: In this article we study the problem of
utilising correlated event reports across languages. We track the evolution of
16 disease outbreaks using 5 temporal aberration detection algorithms on
text-mined events classified according to disease and outbreak country. Using
ProMED reports as a silver standard, comparative analysis of news data for 13
languages over a 129 day trial period showed improved sensitivity, F1 and
timeliness across most models using cross-lingual events. We report a detailed
case study analysis for Cholera in Angola 2010 which highlights the challenges
faced in correlating news events with the silver standard. Conclusions: The
results show that automated health surveillance using multilingual text mining
has the potential to turn low value news into high value alerts if informed
choices are used to govern the selection of models and data sources. An
implementation of the C2 alerting algorithm using multilingual news is
available at the BioCaster portal http://born.nii.ac.jp/?page=globalroundup
Building Multilingual Named Entity Annotated Corpora Exploiting Parallel Corpora
Proceedings of the Workshop on Annotation and
Exploitation of Parallel Corpora AEPC 2010.
Editors: Lars Ahrenberg, Jörg Tiedemann and Martin Volk.
NEALT Proceedings Series, Vol. 10 (2010), 24-33.
© 2010 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/15893
Toponym Disambiguation in English-Lithuanian SMT System with Spatial Knowledge
Proceedings of the 18th Nordic Conference of Computational Linguistics
NODALIDA 2011.
Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa.
NEALT Proceedings Series, Vol. 11 (2011), 191-197.
© 2011 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/16955
Machine Translation Using Automatically Inferred Construction-Based Correspondence and Language Models
PACLIC 23 / City University of Hong Kong / 3-5 December 200
Community-based post-editing of machine-translated content: monolingual vs. bilingual
We carried out a machine-translation postediting pilot study with users of an IT support forum community. For both language pairs (English to German, English to French), 4 native speakers for each language were recruited. They performed monolingual and bilingual postediting tasks on machine-translated forum content. The post-edited content was evaluated using human evaluation (fluency, comprehensibility, fidelity). We found that monolingual post-editing can lead to improved fluency and comprehensibility scores similar to those achieved through bilingual post-editing, while we found that fidelity improved considerably more for the bilingual set-up. Furthermore, the performance across post-editors varied greatly and it was found that some post-editors are able to produce better quality in a monolingual set-up than others