Cross Linguistic Name Matching in English and Arabic: A "One to Many Mapping" Extension of the Levenshtein Edit Distance Algorithm
This paper presents a solution to the problem of matching personal names in English to the same names represented in Arabic script. Standard string comparison measures perform poorly on this task due to varying transliteration conventions in both languages and the fact that Arabic script does not usually represent short vowels. Significant improvement is achieved by augmenting the classic Levenshtein edit-distance algorithm with character equivalency classes.

1 Introduction to the problem

Personal names are problematic for all languag
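
The idea stated in the abstract, a Levenshtein distance in which a substitution is free when the two characters fall in the same equivalency class, can be sketched roughly as below. This is only an illustration, not the paper's implementation: the table EQUIV, its contents, and the function names equivalent and edit_distance are all hypothetical, and the handful of letter mappings is chosen for the example rather than taken from the paper.

```python
# Hypothetical, deliberately tiny equivalency table (NOT taken from the paper):
# each Arabic letter maps to the set of Latin letters it is assumed to match.
EQUIV = {
    "م": {"m"},
    "ح": {"h"},
    "د": {"d"},
    "ك": {"k", "c"},
    "ر": {"r"},
    "ي": {"y", "i", "e"},
    "و": {"w", "o", "u"},
}


def equivalent(latin_char, arabic_char):
    """Return True if the Latin letter is in the Arabic letter's class."""
    return latin_char.lower() in EQUIV.get(arabic_char, set())


def edit_distance(latin, arabic):
    """Levenshtein distance where equivalent characters substitute at cost 0."""
    # prev[j] holds the distance between the processed Latin prefix
    # and the first j Arabic characters.
    prev = list(range(len(arabic) + 1))
    for i, lc in enumerate(latin, start=1):
        curr = [i]
        for j, ac in enumerate(arabic, start=1):
            sub_cost = 0 if equivalent(lc, ac) else 1
            curr.append(min(
                prev[j] + 1,             # delete the Latin character
                curr[j - 1] + 1,         # insert the Arabic character
                prev[j - 1] + sub_cost,  # substitute (free if equivalent)
            ))
        prev = curr
    return prev[-1]


print(edit_distance("mhmd", "محمد"))      # 0
print(edit_distance("Mohammed", "محمد"))  # 4
```

With this toy table the vowel-free spelling "mhmd" matches the Arabic form at distance 0, while "Mohammed" still scores 4, because the vowels and the doubled consonant have no counterpart in the Arabic spelling. The sketch makes no attempt to handle unwritten short vowels, so it shows the equivalency-class step only.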