Automatic translation of scientific documents in the HAL archive

Abstract

© 2012. Published by ELRA. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: http://www.lrec-conf.org/proceedings/lrec2012/pdf/703_Paper.pdf

This paper describes the development of a statistical machine translation system between French and English for scientific papers. This system will be closely integrated into the French HAL open archive, a collection of more than 100,000 scientific papers. We describe the creation of in-domain parallel and monolingual corpora, the development of a domain-specific translation system with the created resources, and its adaptation using monolingual resources only. These techniques allowed us to improve a generic system by more than 10 BLEU points.
