Searching by approximate personal-name matching

Authors: Camps Pare Rafael; Daude Ventura Jordi
Publication date: 01/01/2003

We discuss the design, construction, and evaluation of a method for retrieving a person's information using the person's name as the search key, even when the name is deformed. We present a similarity function, the DEA function, based on the probabilities of the edit operations according to the letters involved and their positions, and using a variable threshold. The efficacy of DEA is evaluated quantitatively, without human relevance judgments, and is found to be far superior to that of known methods. A very efficient approximate search technique for the DEA function, based on a compacted trie structure, is also presented. Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC
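
The abstract above mentions an approximate search technique built on a trie. As a rough illustration of that general idea only, the sketch below walks an ordinary (uncompacted) trie while carrying one row of a plain unit-cost edit-distance table, and prunes any branch whose row already exceeds the allowed distance. This is not the DEA function or the authors' compacted structure: the unit costs, the fixed `max_dist`, and the names `TrieNode`, `build_trie` and `search` are all assumptions made for the example.

```python
class TrieNode:
    """One trie node; `word` is set only where a stored name ends."""
    def __init__(self):
        self.children = {}
        self.word = None


def build_trie(names):
    """Insert every name into a plain (uncompacted) trie."""
    root = TrieNode()
    for name in names:
        node = root
        for ch in name:
            node = node.children.setdefault(ch, TrieNode())
        node.word = name
    return root


def search(root, query, max_dist):
    """Return sorted (name, distance) pairs with unit-cost distance <= max_dist."""
    first_row = list(range(len(query) + 1))
    results = []
    for ch, child in root.children.items():
        _walk(child, ch, query, first_row, max_dist, results)
    return sorted(results)


def _walk(node, ch, query, prev_row, max_dist, results):
    # Extend the edit-distance table by one row for the trie character `ch`.
    row = [prev_row[0] + 1]
    for j in range(1, len(query) + 1):
        row.append(min(
            row[j - 1] + 1,                           # skip a query character
            prev_row[j] + 1,                          # skip the trie character
            prev_row[j - 1] + (query[j - 1] != ch),   # match or substitute
        ))
    if node.word is not None and row[-1] <= max_dist:
        results.append((node.word, row[-1]))
    # Prune: if every cell exceeds the limit, no descendant can match.
    if min(row) <= max_dist:
        for next_ch, child in node.children.items():
            _walk(child, next_ch, query, row, max_dist, results)


trie = build_trie(["camps", "campos", "daude", "dauder", "ventura"])
print(search(trie, "canps", 1))
```

Names that share a prefix share the table rows computed for it, which is where a trie saves work compared with scoring every stored name separately.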

Funciones de comparación de carácteres para APNM: la distancia DEA [Character comparison functions for APNM: the DEA distance]

Author: Camps Pare Rafael
Publication date: 01/01/2002

A typical application of ASM (Approximate String Matching) is the matching of personal names, for example to search for people in the database of an information system. Over the years, several similarity functions have been proposed: phonetic codes, simple edit distance, n-gram distances, etc. This report presents a function, DEA, with substantially better efficacy than existing ones, oriented mainly to Spanish surnames. The DEA distance is an edit distance whose costs are based on the probabilities of the operations, characters, and positions. The distance threshold is defined as a function of the length of the string. The efficacy of DEA is evaluated objectively, without human relevance judgements. Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC
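
The abstract above describes an edit distance whose costs depend on the operation, the characters, and the position, with a threshold that varies with string length. The sketch below shows the general shape of such a function. Every number in it is a placeholder: the confusable pairs, the position weighting, and the `0.25 * length` threshold are hypothetical values chosen for illustration, not the probabilities or the threshold function defined in the report.

```python
# Hypothetical pairs of letters treated as cheap to confuse (illustrative only).
CONFUSABLE = {frozenset("bv"), frozenset("sz"), frozenset("yi")}


def substitution_cost(a, b, pos):
    """Cost of replacing a with b at index pos (assumed values)."""
    if a == b:
        return 0.0
    base = 0.5 if frozenset((a, b)) in CONFUSABLE else 1.0
    # Assumption: errors in the first letter are weighted more heavily.
    return base * (1.5 if pos == 0 else 1.0)


def indel_cost(ch, pos):
    """Cost of inserting or deleting ch at index pos (assumed values)."""
    return 1.5 if pos == 0 else 1.0


def weighted_distance(s, t):
    """Edit distance using the character- and position-dependent costs above."""
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + indel_cost(s[i - 1], i - 1)
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + indel_cost(t[j - 1], j - 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + indel_cost(s[i - 1], i - 1),       # delete from s
                d[i][j - 1] + indel_cost(t[j - 1], j - 1),       # insert from t
                d[i - 1][j - 1]
                + substitution_cost(s[i - 1], t[j - 1], i - 1),  # substitute
            )
    return d[m][n]


def threshold(length):
    """Hypothetical length-dependent threshold: longer names tolerate more."""
    return 0.25 * length


def matches(query, candidate):
    return weighted_distance(query, candidate) <= threshold(len(query))


print(matches("ventura", "bentura"), matches("camps", "daude"))
```

Tying the threshold to the query length means a single error counts against a short surname far more than against a long one, which a fixed threshold cannot express.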