A Multi-Layer Graphical Model for Approximate Identity Matching

Abstract

Many organizations maintain identity information for their customers, vendors, and employees, etc. However, identities being compromised cannot be retrieved effectively. In this paper we first present a case study on identity problems existing in a local police department. The study show that more than half of the sampled suspects have altered identities existing in the police information system due to deception and errors. We build a taxonomy of identity problems based on our findings. The decision to determine matching identities involves some uncertainty because of the problems identified. We propose a probability-based multi-layer graphical model to capture the uncertainty. Experiments show that the proposed model performs significantly better than the searching technique based on exact-match. With 20% of training data labeled, the model with semi-supervised learning achieved performance comparable to that of fully supervised learning

    Similar works