592 research outputs found

    Avoiding disclosure of individually identifiable health information: a literature review

    Achieving data and information dissemination without harming anyone is a central task of any entity in charge of collecting data. In this article, the authors examine the literature on data and statistical confidentiality. Rather than comparing the theoretical properties of specific methods, they emphasize the main themes that emerge from the ongoing discussion among scientists regarding how best to achieve the appropriate balance between data protection, data utility, and data dissemination. They cover the literature on de-identification and reidentification methods with emphasis on health care data. The authors also discuss the benefits and limitations of the most common access methods. Although there is abundant theoretical and empirical research, their review reveals a lack of consensus on fundamental questions for empirical practice: how to assess disclosure risk, how to choose among disclosure methods, how to assess reidentification risk, and how to measure utility loss.
    Keywords: public use files, disclosure avoidance, reidentification, de-identification, data utility

    Multiplicative noise for masking numerical microdata with constraints

    Before releasing databases which contain sensitive information about individuals, statistical agencies have to apply Statistical Disclosure Limitation (SDL) methods to such data. The goal of these methods is to minimize the risk of disclosure of the confidential information and at the same time provide legitimate data users with accurate information about the population of interest. SDL methods applicable to microdata (i.e. collections of individual records) are often called masking methods. In this paper, several multiplicative noise masking schemes are presented. These schemes are designed to preserve positivity and inequality constraints in the data together with the vector of means and the covariance matrix.
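
As a rough illustration of the general idea of multiplicative noise masking (not the schemes from the paper above), the sketch below multiplies each positive value by log-normal noise whose expected value is 1. The function name `mask_multiplicative` and the parameters `sigma` and `seed` are assumptions made for this example. The sketch preserves positivity and keeps the mean unchanged only in expectation; it does not enforce inequality constraints or preserve the covariance matrix.

```python
import random


def mask_multiplicative(values, sigma=0.1, seed=None):
    """Multiply each value by log-normal noise with expected value 1.

    Hypothetical helper for illustration only. Because the noise factor is
    always positive, positive inputs stay positive.
    """
    rng = random.Random(seed)
    # For log-normal noise exp(N(mu, sigma^2)), the mean is exp(mu + sigma^2 / 2).
    # Choosing mu = -sigma^2 / 2 therefore gives noise with mean 1.
    mu = -0.5 * sigma ** 2
    return [x * rng.lognormvariate(mu, sigma) for x in values]


if __name__ == "__main__":
    incomes = [10.0, 250.0, 3.5]  # made-up example values
    print(mask_multiplicative(incomes, sigma=0.1, seed=0))
```

A larger `sigma` gives more protection at the cost of more distortion in the released values.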

    Assessing the disclosure protection provided by misclassification for survey microdata

    Government statistical agencies often apply statistical disclosure limitation techniques to survey microdata to protect confidentiality. There is a need for ways to assess the protection provided. This paper develops some simple methods for disclosure limitation techniques which perturb the values of categorical identifying variables. The methods are applied in numerical experiments based upon census data from the United Kingdom which are subject to two perturbation techniques: data swapping and the post randomisation method. Some simplifying approximations to the measure of risk are found to work well in capturing the impacts of these techniques. These approximations provide simple extensions of existing risk assessment methods based upon Poisson log-linear models. A numerical experiment is also undertaken to assess the impact of multivariate misclassification with an increasing number of identifying variables. The methods developed in this paper may also be used to obtain more realistic assessments of risk which take account of the kinds of measurement and other non-sampling errors commonly arising in surveys.
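
To make the post randomisation method mentioned above more concrete, here is a minimal sketch under simplifying assumptions: each categorical value is kept with probability `keep_prob` and otherwise replaced by a different category chosen uniformly at random. The function name `post_randomise` and its parameters are hypothetical, and the uniform transition rule is an assumption of this example; it is not the transition matrix or the risk measure used in the paper.

```python
import random


def post_randomise(values, categories, keep_prob=0.9, seed=None):
    """Perturb categorical values with a simple misclassification rule.

    Hypothetical helper for illustration only: a value is kept with
    probability keep_prob, otherwise it is swapped for another category
    drawn uniformly from the remaining ones.
    """
    if len(categories) < 2:
        raise ValueError("need at least two categories to misclassify")
    rng = random.Random(seed)
    perturbed = []
    for value in values:
        if rng.random() < keep_prob:
            perturbed.append(value)
        else:
            alternatives = [c for c in categories if c != value]
            perturbed.append(rng.choice(alternatives))
    return perturbed


if __name__ == "__main__":
    regions = ["north", "south", "east", "west"]  # made-up categories
    records = ["north", "north", "east", "west", "south"]
    print(post_randomise(records, regions, keep_prob=0.8, seed=0))
```

Lowering `keep_prob` increases misclassification, which raises an intruder's uncertainty about whether a matching record is a true match.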

    Proceedings from the Synthetic LBD International Seminar

    On May 9, 2017, we hosted a seminar to discuss the conditions necessary to implement the SynLBD approach with interested parties, with the goal of providing a straightforward toolkit to implement the same procedure on other data. The proceedings summarize the discussions during the workshop.