4 research outputs found
Multiplicative noise for masking numerical microdata with constraints
Before releasing databases that contain sensitive information about individuals, statistical agencies have to apply Statistical Disclosure Limitation (SDL) methods to the data. The goal of these methods is to minimize the risk of disclosure of the confidential information while providing legitimate data users with accurate information about the population of interest. SDL methods applicable to microdata (i.e., collections of individual records) are often called masking methods. In this paper, several multiplicative noise masking schemes are presented. These schemes are designed to preserve positivity and inequality constraints in the data together with the vector of means and the covariance matrix.
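A minimal sketch of the basic idea behind multiplicative noise masking is given below, assuming a single positive numeric column. It is illustrative only: the mean-one lognormal factors and the simple rescaling step preserve positivity and the column mean, whereas the schemes in the paper additionally preserve the full covariance matrix and inequality constraints. The function name multiplicative_noise_mask is hypothetical, not taken from the paper.

import numpy as np

def multiplicative_noise_mask(x, sigma=0.1, rng=None):
    """Mask a positive numeric column with mean-one multiplicative lognormal noise."""
    rng = np.random.default_rng(rng)
    # Lognormal factors with E[e] = 1 keep the masked values strictly positive.
    e = rng.lognormal(mean=-0.5 * sigma**2, sigma=sigma, size=len(x))
    y = x * e
    # Rescale so the released column reproduces the original mean exactly;
    # the paper's constrained schemes go further and also match the covariance
    # matrix and inequality constraints, which this simple version does not.
    return y * (x.mean() / y.mean())

# Toy example: mask a synthetic positive variable and compare summary statistics.
x = np.random.default_rng(0).gamma(shape=2.0, scale=1000.0, size=1000)
y = multiplicative_noise_mask(x, sigma=0.2, rng=1)
print(round(x.mean(), 1), round(y.mean(), 1), round(np.corrcoef(x, y)[0, 1], 3))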
Priv Stat Databases
In this paper we propose a method for statistical disclosure limitation of categorical variables that we call Conditional Group Swapping. This approach is suitable for design and strata-defining variables, the cross-classification of which leads to the formation of important groups or subpopulations. These groups are considered important because, from the point of view of data analysis, it is desirable to preserve analytical characteristics within them. In general, data swapping can be quite distorting ([12, 18, 15]), especially for the relationships between the variables, not only within the subpopulations but also in the overall data. To reduce the damage incurred by swapping, we propose to choose the records for swapping using conditional probabilities which depend on the characteristics of the exchanged records. In particular, our approach exploits propensity score methodology for the computation of swapping probabilities. The experimental results presented in the paper show good utility properties of the method.
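A rough sketch of the general idea, not the authors' exact algorithm: estimate propensity scores for the group-defining variable from a logistic regression on the remaining covariates, then exchange group labels only between records with neighbouring scores, so that swaps occur mostly between records that already look alike and within-group relationships are distorted less. The function propensity_guided_swap and its parameters are hypothetical illustrations.

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def propensity_guided_swap(df, group_col, covariates, swap_rate=0.05, rng=None):
    """Swap a binary group label between pairs of records with similar propensity scores."""
    rng = np.random.default_rng(rng)
    X = df[covariates].to_numpy()
    g = df[group_col].to_numpy()
    # Propensity of belonging to the group, conditional on the covariates.
    p = LogisticRegression(max_iter=1000).fit(X, g).predict_proba(X)[:, 1]
    order = np.argsort(p)                      # records sorted by propensity
    n_pairs = int(swap_rate * len(df) / 2)
    # Pick non-overlapping neighbouring pairs in propensity order.
    starts = rng.choice(np.arange(0, len(df) - 1, 2), size=n_pairs, replace=False)
    out = df.copy()
    col = out.columns.get_loc(group_col)
    for i in starts:
        a, b = order[i], order[i + 1]
        out.iloc[a, col], out.iloc[b, col] = g[b], g[a]
    return out

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
df = pd.DataFrame({"stratum": rng.integers(0, 2, 200),
                   "age": rng.normal(40, 10, 200),
                   "income": rng.gamma(2.0, 1000.0, 200)})
masked = propensity_guided_swap(df, "stratum", ["age", "income"], swap_rate=0.1, rng=1)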
Transparent Privacy is Principled Privacy
Differential privacy revolutionizes the way we think about statistical disclosure limitation. Among the benefits it brings to the table, one is particularly profound and impactful. Under this formal approach to privacy, the mechanism with which data is privatized can be spelled out in full transparency, without sacrificing the privacy guarantee. Curators of open-source demographic and scientific data are in a position to offer privacy without obscurity. This paper supplies a technical treatment of the pitfalls of obscure privacy, and establishes transparent privacy as a prerequisite to drawing correct statistical inference. It advocates conceiving transparent privacy as a dynamic component that can improve data quality from the total survey error perspective, and discusses the limited statistical usability of mere procedural transparency, which may arise when dealing with mandated invariants. Transparent privacy is the only viable path towards principled inference from privatized data releases. Its arrival marks great progress towards improved reproducibility, accountability and public trust.
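As an illustration of what a fully transparent mechanism can look like in practice (an assumption for exposition, not the paper's own construction): a Laplace mechanism whose sensitivity and privacy-loss parameter are published alongside the release, so analysts know the exact noise distribution and can propagate it through downstream inference. The helper laplace_release below is hypothetical.

import numpy as np

def laplace_release(true_count, sensitivity=1.0, epsilon=0.5, rng=None):
    """Release an epsilon-differentially private count via the Laplace mechanism.

    Transparency here means the triple (mechanism, sensitivity, epsilon) is
    published with the value, so users know the noise is Laplace(sensitivity/epsilon).
    """
    rng = np.random.default_rng(rng)
    noisy = true_count + rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return noisy, {"mechanism": "Laplace", "sensitivity": sensitivity, "epsilon": epsilon}

noisy, metadata = laplace_release(1234, epsilon=0.5, rng=42)
print(noisy, metadata)  # released value plus the fully disclosed privacy parameters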
Anonymization of continuous data using factor analysis
Fac. de Estudios Estadísticos