3 research outputs found

    Data Anonymization that Leads to the Most Accurate Estimates of Statistical Characteristics

    Abstract: To preserve privacy, we divide the data space into boxes and, instead of the original data points, store only the corresponding boxes. In accordance with current practice, the desired level of privacy is established by requiring at least k different records in each box, for a given value k (the larger the value of k, the higher the privacy level). When we process the data, using boxes instead of the original exact values leads to uncertainty. In this paper, we find the (asymptotically) optimal subdivision of data into boxes: a subdivision that provides, for a given statistical characteristic such as variance, covariance, or correlation, the smallest uncertainty at the given level of privacy. In areas where the empirical data density is small, boxes containing k points are large, which results in large uncertainty. To avoid this, we propose, when computing th
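    The box idea in the abstract above can be illustrated with a minimal sketch. It assumes one-dimensional data and a naive rule that groups sorted values into consecutive boxes of at least k points each; this is not the paper's optimal subdivision, and the names `k_boxes` and `mean_bounds` are hypothetical. It shows only how storing boxes instead of exact values turns a statistic (here the mean) into an interval, and how a sparse region produces a wide box.

```python
from typing import List, Sequence, Tuple

# A box is (low, high, count): the interval covering `count` hidden points.
Box = Tuple[float, float, int]


def k_boxes(values: Sequence[float], k: int) -> List[Box]:
    """Group sorted 1-D values into consecutive boxes of at least k points.

    Assumption: naive equal-count grouping; the last box absorbs any
    leftover points so that no box holds fewer than k records.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ordered = sorted(values)
    if len(ordered) < k:
        raise ValueError("need at least k records to form one box")
    n_boxes = len(ordered) // k
    boxes: List[Box] = []
    for i in range(n_boxes):
        start = i * k
        # The final box runs to the end of the data to absorb the remainder.
        end = start + k if i < n_boxes - 1 else len(ordered)
        chunk = ordered[start:end]
        boxes.append((chunk[0], chunk[-1], len(chunk)))
    return boxes


def mean_bounds(boxes: Sequence[Box]) -> Tuple[float, float]:
    """Interval guaranteed to contain the true mean of the hidden points."""
    total = sum(count for _, _, count in boxes)
    low = sum(lo * count for lo, _, count in boxes) / total
    high = sum(hi * count for _, hi, count in boxes) / total
    return low, high


if __name__ == "__main__":
    data = [1, 2, 3, 10, 11, 12, 50]  # 50 lies in a sparse region
    boxes = k_boxes(data, k=3)
    print(boxes)               # the box containing 50 is much wider
    print(mean_bounds(boxes))  # interval enclosing the true mean
```

    In this sketch the isolated value 50 forces the second box to span a wide range, which widens the interval for the mean; this is the low-density effect the abstract describes.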

    Data Anonymization that Leads to the Most Accurate Estimates of Statistical Characteristics: Fuzzy-Motivated Approach

    Abstract: To preserve privacy, each original data point (with its exact value) is replaced by a box containing that (inaccessible) point. This privacy-motivated uncertainty leads to uncertainty in the statistical characteristics computed from the data. In a previous paper, we described how to minimize this uncertainty under the assumption that we use the same standard statistical estimates for the desired characteristics. In this paper, we show that we can further decrease the resulting uncertainty if we allow fuzzy-motivated weighted estimates, and we explain how to optimally select the corresponding weights.

    I. FORMULATION OF THE PROBLEM

    Need to preserve privacy. In many practical applications, e.g., in medicine and in education, to better serve customers, it is important to know as much as possible about the potential customers. Customers are often reluctant to share information