Revisiting distance-based record linkage for privacy-preserving release of statistical datasets
Statistical Disclosure Control (SDC, for short) studies the problem of privacy-preserving data publishing in cases where the data is expected to be used for statistical analysis. An original dataset T containing sensitive information is transformed into a sanitized version T' which is released to the public. Both utility and privacy aspects are very important in this setting. For utility, T' must allow data miners or statisticians to obtain results similar to those which would have been obtained from the original dataset T. For privacy, T' must significantly reduce the ability of an adversary to infer sensitive information about the data subjects in T. One of the main a-posteriori measures that the SDC community has considered up to now when analyzing the privacy offered by a given protection method is the Distance-Based Record Linkage (DBRL) risk measure. In this work, we argue that the classical DBRL risk measure is insufficient. For this reason, we introduce the novel Global Distance-Based Record Linkage (GDBRL) risk measure. We claim that this new measure must be evaluated alongside the classical DBRL measure in order to better assess the risk of publishing T' instead of T. After that, we describe how this new measure can be computed by the data owner and discuss the scalability of those computations. We conclude with extensive experimentation in which we compare the risk assessments offered by our novel measure and by the classical one, using well-known SDC protection methods. Those experiments validate our hypothesis that the GDBRL risk measure issues, in many cases, higher risk assessments than the classical DBRL measure. In other words, relying solely on the classical DBRL measure for risk assessment might be misleading, as the true risk may in fact be higher.
Hence, we strongly recommend that the SDC community consider the new GDBRL risk measure as an additional measure when analyzing the privacy offered by SDC protection algorithms.
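The classical DBRL idea described above can be illustrated with a small sketch: each sanitized record is linked to its nearest original record, and the risk is the fraction of correct re-identifications. This is a hedged toy illustration under Euclidean distance, not the paper's exact formulation of DBRL or GDBRL; all names are illustrative.

```python
import math
import random

def euclidean(a, b):
    # plain Euclidean distance between two numeric records
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dbrl_risk(original, sanitized):
    """Toy DBRL sketch: link each sanitized record to its nearest
    original record; risk = fraction of correct links (record i in
    the sanitized set linked back to record i in the original)."""
    correct = 0
    for i, rec in enumerate(sanitized):
        nearest = min(range(len(original)),
                      key=lambda j: euclidean(original[j], rec))
        if nearest == i:
            correct += 1
    return correct / len(sanitized)

# toy usage: mild Gaussian perturbation of a small synthetic dataset
random.seed(0)
T = [[random.gauss(0, 1) for _ in range(3)] for _ in range(50)]
T_prime = [[x + random.gauss(0, 0.05) for x in rec] for rec in T]
risk = dbrl_risk(T, T_prime)
```

With very light perturbation the linkage succeeds for most records, which is exactly why such a measure flags the release as risky.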
An Evolutionary Optimization Approach for Categorical Data Protection
The continuously growing amount of publicly available sensitive data has increased the risk of breaking the privacy of the people or institutions represented in those datasets. Many protection methods have been developed to address this problem by either distorting or generalizing data, while taking into account the difficult trade-off between data utility (information loss) and protection against disclosure (disclosure risk). In this paper we present an optimization approach for data protection based on an evolutionary algorithm which is guided by a combination of information loss and disclosure risk measures. In this way, state-of-the-art protection methods are combined to obtain new data protections with a better trade-off between these two measures. The paper presents several experimental results that assess the performance of our approach.
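The guiding idea above, scoring candidate protections by a combination of information loss and disclosure risk, can be sketched as a minimal evolutionary loop. This is an assumed, simplified (1+1)-style scheme with illustrative names and weights, not the paper's actual algorithm.

```python
import random

ALPHA = 0.5  # illustrative weight between the two measures

def fitness(info_loss, disclosure_risk, alpha=ALPHA):
    # lower is better: weighted sum of information loss and risk
    return alpha * info_loss + (1 - alpha) * disclosure_risk

def evolve(population, evaluate, mutate, generations=50):
    """Minimal evolutionary loop: keep the best candidate so far,
    mutate it, and replace it whenever a mutant scores lower."""
    best = min(population, key=evaluate)
    for _ in range(generations):
        child = mutate(best)
        if evaluate(child) < evaluate(best):
            best = child
    return best

# toy usage: a "protection" is just a noise magnitude s; more noise
# means more information loss but less disclosure risk
random.seed(1)
evaluate = lambda s: fitness(info_loss=s, disclosure_risk=1 / (1 + s))
mutate = lambda s: max(0.0, s + random.gauss(0, 0.1))
best = evolve([0.1, 0.5, 1.0], evaluate, mutate)
```

The toy objective makes the trade-off explicit: the optimizer settles where neither measure dominates, which is the kind of balance the paper's combined measure is meant to find.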
Data privacy
Data privacy studies methods, tools, and theory to avoid the disclosure of sensitive information. Its origin lies in statistics, with the goal of ensuring the confidentiality of data gathered from censuses and questionnaires. The topic was later introduced in computer science, and more particularly in data mining, where, due to the large amount of data currently available, it has attracted the interest of researchers, practitioners, and companies. In this paper we review the main topics related to data privacy and privacy-enhancing technologies.
Implementing privacy-preserving filters in the MOA stream mining framework
Four MOA privacy-preserving filters have been developed to implement several SDC methods. The algorithms have been adapted from well-known solutions to enable their use in stream-processing settings. Finally, they have been benchmarked to assess their quality in terms of disclosure risk and information loss.
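A stream privacy filter of the kind described above can be sketched as a stateful object that perturbs each incoming record before passing it downstream. This is a hedged illustration of the general pattern (noise addition over a stream); the class and parameter names are hypothetical and do not reflect MOA's Java API.

```python
import random

class NoiseAdditionFilter:
    """Toy stream filter in the spirit of an SDC noise-addition
    method: each incoming numeric record is perturbed with
    Gaussian noise and emitted immediately, so the filter works
    record-by-record as required in streaming settings."""

    def __init__(self, sigma=0.1, seed=0):
        self.sigma = sigma
        self.rng = random.Random(seed)

    def process(self, record):
        # perturb every attribute independently
        return [x + self.rng.gauss(0, self.sigma) for x in record]

# toy usage: filter a short stream of numeric records
f = NoiseAdditionFilter(sigma=0.1)
stream = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
masked = [f.process(rec) for rec in stream]
```

Because the filter keeps no per-record history, it processes an unbounded stream in constant memory, which is the key adaptation needed to move batch SDC methods into a streaming framework.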
Motivating Executives: Does Performance-Based Compensation Positively Affect Managerial Performance?
Protecting Micro-Data Privacy: The Moment-Based Density Estimation Method and its Application
Privacy concerns pertaining to the release of confidential micro-level information are increasingly relevant to organisations and institutions. Controlling the dissemination of disclosure-prone micro-data by means of suppression, aggregation and perturbation techniques often entails different levels of effectiveness and drawbacks depending on the context and properties of the data.
In this dissertation, we briefly review existing disclosure control methods for micro-data and undertake a study demonstrating the applicability of micro-data methods to proportion data. This is achieved by using the sample size efficiency related to a simple hypothesis test, for a fixed significance level and power, as a measure of statistical utility. We compare a query-based differential privacy mechanism to the multiplicative noise method for disclosure control and demonstrate that, with the correct specification of noise parameters, the multiplicative noise method, which is a micro-data-based method, achieves similar disclosure protection properties with reduced statistical efficiency costs.
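The multiplicative noise method discussed above can be sketched in a few lines: each value x is released as x·u, where u is random noise with mean 1, so perturbation scales with the magnitude of the value. This is a hedged toy sketch with Gaussian noise and illustrative parameters; the dissertation's exact noise specification may differ.

```python
import random

def multiplicative_noise(records, sigma=0.05, seed=0):
    """Toy multiplicative noise masking: release x * u for each
    value x, with u ~ N(1, sigma), so larger values receive
    proportionally larger perturbations. Parameters illustrative."""
    rng = random.Random(seed)
    return [[x * rng.gauss(1.0, sigma) for x in rec] for rec in records]

# toy usage: mask a small micro-data table of two records
masked = multiplicative_noise([[10.0, 20.0], [30.0, 40.0]], sigma=0.05)
```

The relative error of each released value is governed by sigma alone, independent of the value's magnitude, which is what makes the method's utility cost easy to characterize in efficiency terms.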