Privacy in context : the costs and benefits of a new deidentification method

Trepetin, Stanley

thesis

Authors: Stanley Trepetin
Publication date: 1 January 2006
Publisher: Massachusetts Institute of Technology

Abstract

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.

Includes bibliographical references (leaves 131-150).

The American public continues to be concerned about medical privacy. Policy research continues to show people's demand for health organizations to protect patient-specific data. Health organizations need personally identifiable data for unhampered decision making; however, identifiable data are often the basis of information abuse if such data are improperly disclosed. This thesis shows that health organizations may use deidentified data for key routine organizational operations.

I construct a technology adoption model and investigate if a for-profit health insurer could use deidentified data for key internal software quality management applications. If privacy-related data are analyzed without rigor, little support is found to incorporate more privacy protections into such applications. Legal and financial motivations appear lacking. Adding privacy safeguards to such software programs apparently doesn't improve policy-holder care quality. Existing technical approaches do not readily allow for data deidentification while permitting key computations within the applications.

A closer analysis of data reaches different conclusions. I describe the bills that are currently passing through Congress to mitigate abuses of identifiable data that exist within organizations. I create a cost and medical benefits model demonstrating the financial losses to the insurer and medical losses to its policy-holders due to less privacy protection within the routine software applications. One of the model components describes the Predictive Modeling application (PMA), used to identify an insurer's chronically-ill policy-holders. Disease management programs can enhance the care and reduce the costs of such individuals because improving such people's health can reduce costs to the paying organization. The model quantifies the decrease in care and rise in the insurer's claim costs as the PMA must work with suboptimal data due to policy-holders' privacy concerns regarding the routine software applications.

I create a model for selecting variables to improve data linkage in software applications in general. An encryption-based approach, which allows for the secure linkage of records despite errors in linkage variables, is subsequently constructed. I test this approach as part of a general data deidentification method on an actual PMA used by health insurers. The PMA's performance is found to be the same as if executing on identifiable data.

by Stanley Trepetin. Ph.D.

Available Versions

DSpace@MIT

oai:dspace.mit.edu:1721.1/3797...

Last time updated on 11/06/2012