Identifying data set specific duplicate patient records

DuVall, Scott L.

Authors: Scott L. DuVall
Publication date: 1 January 2009
Publisher: University of Utah

Abstract

Poster. Probabilistic models are commonly used to identify duplicate records. These methods are usually more accurate than deterministic methods, but they are exponentially more computationally complex. To remain computationally feasible, they therefore rely on deterministic blocking strategies, which restrict comparisons to records that share a blocking key. This project investigates how machine learning methods can be used to automatically determine an optimal blocking strategy from duplicate records that have already been identified.
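
To make the idea of a blocking strategy concrete, the following is a minimal sketch and not the method used in the poster. It assumes a handful of hypothetical patient records, hypothetical field names (`last`, `first`, `birth_year`), and a hypothetical set of already identified duplicate pairs. Each candidate blocking key is scored by how much it reduces the number of record comparisons and by what fraction of the known duplicate pairs it keeps together in the same block.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records; the field names are illustrative assumptions.
records = [
    {"id": 1, "last": "Smith", "first": "John", "birth_year": 1970},
    {"id": 2, "last": "Smith", "first": "Jon", "birth_year": 1970},
    {"id": 3, "last": "Smyth", "first": "John", "birth_year": 1970},
    {"id": 4, "last": "Jones", "first": "Mary", "birth_year": 1985},
    {"id": 5, "last": "Jones", "first": "Marie", "birth_year": 1985},
    {"id": 6, "last": "Brown", "first": "Ann", "birth_year": 1962},
    {"id": 7, "last": "Taylor", "first": "Sam", "birth_year": 1970},
]

# Duplicate pairs assumed to be already identified (hypothetical labels).
known_duplicates = {(1, 2), (1, 3), (2, 3), (4, 5)}


def candidate_pairs(records, key):
    """Return the record-id pairs that share a blocking key value."""
    blocks = defaultdict(list)
    for record in records:
        blocks[key(record)].append(record["id"])
    pairs = set()
    for ids in blocks.values():
        # Only records inside the same block are ever compared.
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def evaluate(records, key, known):
    """Score a blocking key: (comparison reduction, share of known duplicates kept)."""
    pairs = candidate_pairs(records, key)
    total_pairs = len(records) * (len(records) - 1) // 2
    reduction = 1 - len(pairs) / total_pairs
    completeness = len(pairs & known) / len(known)
    return reduction, completeness


# Candidate deterministic blocking keys (illustrative choices only).
strategies = {
    "last_name": lambda r: r["last"],
    "birth_year": lambda r: r["birth_year"],
    "initial_and_year": lambda r: (r["last"][0], r["birth_year"]),
}

results = {
    name: evaluate(records, key, known_duplicates)
    for name, key in strategies.items()
}

for name, (reduction, completeness) in results.items():
    print(f"{name}: reduction={reduction:.2f}, completeness={completeness:.2f}")
```

In this toy data, blocking on exact last name misses the "Smith"/"Smyth" duplicates, while blocking on last-name initial plus birth year keeps every known duplicate pair with fewer comparisons than birth year alone. Choosing among such keys by hand is the step the project proposes to automate with machine learning.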

Available Versions

The University of Utah: J. Willard Marriott Digital Library

oai:collections.lib.utah.edu:i...

Last time updated on 01/01/2020