Bilingual sentence alignment of pre-Qin history literature for digital humanities study

Abstract

Sentence-aligned bilingual texts of history literature provide digital resources that support related digital humanities studies, but existing studies have done little work on sentence alignment between ancient Chinese and English. In this study, we made a preliminary attempt to align sentences of ancient Chinese and English. We used the bilingual texts of the Analects of Confucius and Zuo's Commentaries of the Spring and Autumn Annals, extracted features, and adopted a classification method to classify the bilingual candidate sentence pairs based on probability scores. The bilingual sentence alignment model based on SVM performed best on a larger amount of data when using three features, and the results confirmed the impact of the candidate dataset.
