3 research outputs found

    Cross Linguistic Name Matching in English and Arabic: A “One to Many Mapping” Extension of the Levenshtein Edit Distance Algorithm

    This paper presents a solution to the problem of matching personal names in English to the same names represented in Arabic script. Standard string comparison measures perform poorly on this task due to varying transliteration conventions in both languages and the fact that Arabic script does not usually represent short vowels. Significant improvement is achieved by augmenting the classic Levenshtein edit-distance algorithm with character equivalency classes.

    1 Introduction to the problem

    Personal names are problematic for all languag
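
The approach named in the abstract, Levenshtein edit distance augmented with character equivalency classes, can be illustrated with a minimal Python sketch. The equivalence table `EQUIV`, the helper `equivalent`, and the function `edit_distance` are hypothetical names introduced here for illustration; the table holds only a few assumed letter correspondences and is not the paper's actual mapping, and the unit costs are an assumption as well.

```python
# Assumed, illustrative equivalence classes (NOT the paper's table):
# each Arabic letter maps to the Latin letters it might be written as.
EQUIV = {
    "\u0645": {"m"},             # meem
    "\u062d": {"h"},             # haa
    "\u062f": {"d"},             # dal
    "\u0627": {"a"},             # alif
    "\u0648": {"w", "u", "o"},   # waw
    "\u064a": {"y", "i", "e"},   # yaa
}


def equivalent(latin_ch: str, arabic_ch: str) -> bool:
    """True if the Latin letter belongs to the Arabic letter's class."""
    return latin_ch.lower() in EQUIV.get(arabic_ch, set())


def edit_distance(latin: str, arabic: str) -> int:
    """Levenshtein distance in which a substitution between equivalent
    characters costs 0; every other edit costs 1 (assumed unit costs)."""
    prev = list(range(len(arabic) + 1))
    for i, lc in enumerate(latin, 1):
        curr = [i]
        for j, ac in enumerate(arabic, 1):
            substitute = prev[j - 1] + (0 if equivalent(lc, ac) else 1)
            delete = prev[j] + 1        # drop a Latin character
            insert = curr[j - 1] + 1    # add an Arabic character
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


# "Muhammad" against its usual unvowelled Arabic spelling (meem, haa, meem, dal).
MUHAMMAD_AR = "\u0645\u062d\u0645\u062f"
print(edit_distance("Muhammad", MUHAMMAD_AR))  # 4
```

Under these assumptions the four consonants align at zero cost and the remaining four Latin letters are deleted, giving a distance of 4. A plain Levenshtein comparison of the same pair gives 8, because no Latin character is ever identical to an Arabic one.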