Disclosure Risk Components of Contextualized Microdata: Identifying Unique Geographic Units and the Implications for Pinpointing Survey Respondents

Abstract

To respond safely to increased demand for microdata that contain contextual information, producers ought to consider how these data may be used to identify the location of survey respondents. This study informs the design of such datafiles with its hierarchical matching algorithm and a discussion of associated methodological concerns. Compiling nearly 15,000 test datasets composed of person-records, I assess three determinants of “locational” risk, that is, the risk of identifying the location of survey respondents whose contextual characteristics (1) are rarely found among the total population of geographic units; (2) are rarely found within a survey; and (3) pose no disclosure risk given the protection offered by the area’s dense population. Using the “datafile” as my unit of analysis, the proportion of survey respondents whose locations are easily reidentified as the outcome of interest, and indicators of different components of this risk, I detail the complexity of reidentification patterns that emerge when constructing public-use files that provide contextual data.

http://deepblue.lib.umich.edu/bitstream/2027.42/58627/1/ICPSR-WP-No3-Witkowski.pd
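
As a rough illustration of the kind of quantity the abstract describes, the sketch below computes the share of respondents in one hypothetical datafile whose geographic unit has a contextual profile that is unique among all units and whose population falls below a protective threshold. It is not the paper's hierarchical matching algorithm, which the abstract does not specify; the function name, the data layout, the profiles, and the `min_population` threshold are all assumptions made for illustration, and only components (1) and (3) of the risk are covered.

```python
from collections import Counter


def locational_risk(units, respondents, min_population):
    """Share of respondents located in a unit that is easy to pinpoint.

    units: dict mapping unit id -> (contextual profile tuple, population).
    respondents: list of unit ids, one entry per survey respondent.
    min_population: hypothetical threshold above which a dense
        population is assumed to protect respondents.
    """
    # Component (1): how many geographic units share each contextual profile.
    profile_counts = Counter(profile for profile, _ in units.values())

    at_risk = 0
    for unit_id in respondents:
        profile, population = units[unit_id]
        unique_in_population = profile_counts[profile] == 1
        # Component (3): a dense population is assumed to offer protection.
        sparsely_populated = population < min_population
        if unique_in_population and sparsely_populated:
            at_risk += 1

    return at_risk / len(respondents) if respondents else 0.0


# Invented toy data: four geographic units and five respondents.
units = {
    "A": (("urban", "high_income"), 500),
    "B": (("urban", "low_income"), 80_000),
    "C": (("rural", "low_income"), 1_200),
    "D": (("rural", "low_income"), 900),
}
respondents = ["A", "A", "B", "C", "D"]

share = locational_risk(units, respondents, min_population=10_000)
print(share)  # 0.4: only the two respondents in unit A are flagged
```

Here unit B is unique but protected by its large population, and units C and D share a profile, so neither contributes to the flagged share. A fuller treatment would also account for how rare a profile is within the survey itself, which this sketch leaves out.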
