Bilingual annotators were paid to link roughly sixteen thousand corresponding
words between on-line versions of the Bible in modern French and modern
English. These annotations are freely available to the research community from
http://www.cis.upenn.edu/~melamed . The annotations can serve several
purposes. First, they provide a standard data set for developing and
testing translation lexicons and statistical translation models. Second,
researchers in lexical semantics will be able to mine the annotations for
insights about cross-linguistic lexicalization patterns. Third, the annotations
can be used in research into certain recently proposed methods for monolingual
word-sense disambiguation. This paper describes the annotated texts, the
specially designed annotation tool, and the strategies employed to increase the
consistency of the annotations. The annotation process was repeated five times
by different annotators. Inter-annotator agreement rates indicate that the
annotations are reasonably reliable and that the method is easy to replicate.