We have developed a method for extracting coherence features from a
paragraph by matching similar words in its sentences. We conducted an
experiment with a parallel German corpus containing 2000 human-created and 2000
machine-translated paragraphs. The results showed that our method achieved the
best performance (accuracy = 72.3%, equal error rate = 29.8%) compared with
previous methods targeting various kinds of computer-generated text, including
translation and paper generation (best accuracy = 67.9%, equal error rate =
32.0%). Experiments on Dutch, another resource-rich language, and on a
low-resource language (Japanese) attained similar performance. These results
demonstrate the effectiveness of the coherence features at distinguishing
computer-translated from human-created paragraphs in diverse languages.

Comment: 9 pages, PACLIC 201