27,637 research outputs found

    Assessing the disclosure protection provided by misclassification for survey microdata

    Government statistical agencies often apply statistical disclosure limitation techniques to survey microdata to protect confidentiality. There is a need for ways to assess the protection provided. This paper develops some simple methods for disclosure limitation techniques which perturb the values of categorical identifying variables. The methods are applied in numerical experiments based upon census data from the United Kingdom which are subject to two perturbation techniques: data swapping and the post randomisation method. Some simplifying approximations to the measure of risk are found to work well in capturing the impacts of these techniques. These approximations provide simple extensions of existing risk assessment methods based upon Poisson log-linear models. A numerical experiment is also undertaken to assess the impact of multivariate misclassification with an increasing number of identifying variables. The methods developed in this paper may also be used to obtain more realistic assessments of risk which take account of the kinds of measurement and other non-sampling errors commonly arising in surveys.

    Avoiding disclosure of individually identifiable health information: a literature review

    Achieving data and information dissemination without harming anyone is a central task of any entity in charge of collecting data. In this article, the authors examine the literature on data and statistical confidentiality. Rather than comparing the theoretical properties of specific methods, they emphasize the main themes that emerge from the ongoing discussion among scientists regarding how best to achieve the appropriate balance between data protection, data utility, and data dissemination. They cover the literature on de-identification and reidentification methods with emphasis on health care data. The authors also discuss the benefits and limitations of the most common access methods. Although there is abundant theoretical and empirical research, their review reveals a lack of consensus on fundamental questions for empirical practice: how to assess disclosure risk, how to choose among disclosure methods, how to assess reidentification risk, and how to measure utility loss. Keywords: public use files, disclosure avoidance, reidentification, de-identification, data utility

    Quantifying Privacy: A Novel Entropy-Based Measure of Disclosure Risk

    It is well recognised that data mining and statistical analysis pose a serious threat to privacy. This is true for financial, medical, criminal and marketing research. Numerous techniques have been proposed to protect privacy, including restriction and data modification. Recently proposed privacy models such as differential privacy and k-anonymity have received a lot of attention, and for the latter there are now several improvements of the original scheme, each removing some security shortcomings of the previous one. However, the challenge lies in evaluating and comparing the privacy provided by various techniques. In this paper we propose a novel entropy-based security measure that can be applied to any generalisation, restriction or data modification technique. We use our measure to empirically evaluate and compare a few popular methods, namely query restriction, sampling and noise addition. Comment: 20 pages, 4 figures

    Pragmatic trials

    No abstract available

    "Whose data is it anyway?" The implications of putting small area-level health and social data online

    The planetary exospheres are poorly known in their outer parts, since the neutral densities are low compared with the instruments' detection capabilities. The exospheric models are thus often the main source of information at such high altitudes. We present a new way to take into account analytically the additional effect of the radiation pressure on planetary exospheres. In a series of papers, we present with a Hamiltonian approach the effect of the radiation pressure on dynamical trajectories, density profiles and escaping thermal flux. Our work is a generalization of the study by Bishop and Chamberlain (1989). In this second part of our work, we present the density profiles of atomic hydrogen in planetary exospheres subject to the radiation pressure. We first provide the altitude profiles of ballistic particles (the dominant exospheric population in most cases), which exhibit strong asymmetries that explain the known geotail phenomenon at Earth. The radiation pressure strongly enhances the densities compared with the pure gravity case (i.e. the Chamberlain profiles), in particular at noon and midnight. We finally show the existence of an exopause that appears naturally as the external limit for bounded particles, above which all particles are escaping.

    Addressing social issues in a universal HIV test and treat intervention trial (ANRS 12249 TasP) in South Africa: methods for appraisal

    Background: The Universal HIV Test and Treat (UTT) strategy represents a challenge for science, but is also a challenge for individuals and societies. Are repeated offers of provider-initiated HIV testing and immediate antiretroviral therapy (ART) socially-acceptable and can these become normalized over time? Can UTT be implemented without potentially adding to individual and community stigma, or threatening individual rights? What are the social, cultural and economic implications of UTT for households and communities? And can UTT be implemented within capacity constraints and other threats to the overall provision of HIV services? The answers to these research questions will be critical for routine implementation of UTT strategies. Methods/design: A social science research programme is nested within the ANRS 12249 Treatment-as-Prevention (TasP) cluster-randomised trial in rural South Africa. The programme aims to inform understanding of the (i) social, economic and environmental factors affecting uptake of services at each step of the continuum of HIV prevention, treatment and care and (ii) the causal impacts of the TasP intervention package on social and economic factors at the individual, household, community and health system level. We describe a multidisciplinary, multi-level, mixed-method research protocol that includes individual, household, community and clinic surveys, and combines quantitative and qualitative methods. Discussion: The UTT strategy is changing the overall approach to HIV prevention, treatment and care, and substantial social consequences may be anticipated, such as changes in social representations of HIV transmission, prevention, HIV testing and ART use, as well as changes in individual perceptions and behaviours in terms of uptake and frequency of HIV testing and ART initiation at high CD4. Triangulation of social science studies within the ANRS 12249 TasP trial will provide comprehensive insights into the acceptability and feasibility of the TasP intervention package at individual, community, patient and health system level, to complement the trial's clinical and epidemiological outcomes. It will also increase understanding of the causal impacts of UTT on social and economic outcomes, which will be critical for the long-term sustainability and routine UTT implementation. Trial registration: Clinicaltrials.gov: NCT01509508; South African Trial Register: DOH-27-0512-3974

    Modeling Identity Disclosure Risk Estimation Using Kenyan Situation

    Identity disclosure risk is an essential consideration in data anonymization aimed at preserving privacy and utility. The risk is regionally dependent. Therefore, there is a need for a regional empirical approach in addition to a theoretical approach in modeling disclosure risk estimation. The reviewed literature pointed to three influencers of the risk. However, we did not find literature on the combined effects of the three influencers and their predictive power. To fill the gap, this study modeled the risk estimation predicated on the combined effect of the three predictors using the Kenyan situation. The study validated the model by conducting an actual re-identification quasi-experiment. The adversary's analytical competence, the distinguishing power of the anonymized datasets, and the linkage mapping of the identified datasets are presented as the predictors of the risk estimation. For each predictor, manifest variables are presented. The presented model extends previous models and is capable of producing a realistic risk estimation.