k-anonymous Microdata Release via Post Randomisation Method
The problem of the release of anonymized microdata is an important topic in
the fields of statistical disclosure control (SDC) and privacy preserving data
publishing (PPDP), and yet it remains largely unsolved. In these research
fields, k-anonymity has been widely studied as an anonymity notion, mainly for
deterministic anonymization algorithms, and some probabilistic relaxations have
been developed. However, these relaxations are not sufficient due to their
limitations, i.e., being weaker than the original k-anonymity or requiring strong
parametric assumptions. First, we propose Pk-anonymity, a new probabilistic k-anonymity,
and prove that Pk-anonymity is a mathematical extension of k-anonymity rather
than a relaxation. Furthermore, Pk-anonymity requires no parametric
assumptions. This property is significant because it
enables us to compare privacy levels of probabilistic microdata release
algorithms with deterministic ones. Second, we apply Pk-anonymity to the post
randomization method (PRAM), which is an SDC algorithm based on randomization.
PRAM is proven to satisfy Pk-anonymity in a controlled way, i.e., one can
set PRAM's parameter so that Pk-anonymity is satisfied. On the other hand,
PRAM is also known to satisfy ε-differential privacy, a recent,
popular, and strong privacy notion. Our results therefore
significantly enhance PRAM, since they imply that it satisfies both important
notions: k-anonymity and ε-differential privacy.
Comment: 22 pages, 4 figures
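As a rough illustration of the randomization step that PRAM performs, the following Python sketch perturbs a categorical attribute with a transition matrix. The matrix shape (keep a value with probability `retain`, otherwise move uniformly to another category) and the names `pram_matrix`, `pram`, and `epsilon_of` are assumptions made for this example; they are not taken from the paper, and the sketch does not reproduce the paper's Pk-anonymity condition. The function `epsilon_of` computes the usual local differential privacy bound for such a matrix, namely the largest log-ratio between the probabilities that two different inputs produce the same output.

```python
import math
import random


def pram_matrix(categories, retain):
    """Build a transition matrix (hypothetical shape, for illustration only).

    Each value is kept with probability `retain` and otherwise replaced by
    one of the other categories chosen uniformly. Requires 0 < retain < 1
    and at least two categories so that every entry is positive.
    """
    m = len(categories)
    off = (1.0 - retain) / (m - 1)
    return {a: {b: (retain if a == b else off) for b in categories}
            for a in categories}


def pram(values, matrix, rng):
    """Randomize each value independently according to its matrix row."""
    out = []
    for v in values:
        row = matrix[v]
        out.append(rng.choices(list(row), weights=list(row.values()), k=1)[0])
    return out


def epsilon_of(matrix):
    """Largest log-ratio between two inputs' probabilities for one output."""
    cats = list(matrix)
    return max(math.log(matrix[a][c] / matrix[b][c])
               for a in cats for b in cats for c in cats)


if __name__ == "__main__":
    cats = ["north", "south", "east", "west"]
    matrix = pram_matrix(cats, retain=0.7)
    print(pram(["north", "north", "east", "west"], matrix, random.Random(0)))
    print("epsilon =", epsilon_of(matrix))
```

Lowering `retain` toward the uniform value 1/m drives the bound toward zero, which is the sense in which the parameter controls the privacy level in this toy setting.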
Microdata protection through approximate microaggregation
Microdata protection is a hot topic in the field of Statistical Disclosure Control, which has gained special interest after the disclosure of 658,000 queries by the America Online (AOL) search engine in August 2006. Many algorithms, methods and properties have been proposed to deal with microdata disclosure. One of the emerging
concepts in microdata protection is k-anonymity, introduced by Samarati and Sweeney. k-anonymity provides a simple and efficient approach to protecting private individual information and is gaining increasing popularity. k-anonymity requires that every record in the released microdata table be indistinguishably related to no fewer than k respondents.
In this paper, we apply the concept of entropy to propose a distance metric that evaluates the amount of mutual information among records in microdata, and propose a method of constructing a dependency tree to find the key attributes, which we then use to perform approximate microaggregation. Further, we adopt this new microaggregation technique to study the k-anonymity problem, and an efficient algorithm is developed. Experimental results show that the proposed microaggregation technique is efficient and effective in terms of running time and information loss.
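The k-anonymity requirement described above can be checked mechanically. The sketch below is a minimal Python illustration, assuming records are dictionaries and that the caller supplies the list of quasi-identifier attributes; the function names and the sample table are hypothetical and are not part of the paper's algorithm.

```python
from collections import Counter


def anonymity_level(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values on all quasi-identifier attributes (0 for an empty table)."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values()) if counts else 0


def is_k_anonymous(records, quasi_identifiers, k):
    """A table is k-anonymous if every such group has at least k records."""
    return anonymity_level(records, quasi_identifiers) >= k


if __name__ == "__main__":
    # Hypothetical microdata: age range and ZIP prefix are quasi-identifiers.
    table = [
        {"age": "30-39", "zip": "130**", "disease": "flu"},
        {"age": "30-39", "zip": "130**", "disease": "asthma"},
        {"age": "40-49", "zip": "148**", "disease": "flu"},
        {"age": "40-49", "zip": "148**", "disease": "diabetes"},
    ]
    print(anonymity_level(table, ["age", "zip"]))
    print(is_k_anonymous(table, ["age", "zip"], k=3))
```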
On the Anonymization of Differentially Private Location Obfuscation
Obfuscation techniques in location-based services (LBSs) have been shown
to be useful for hiding the concrete locations of service users, but they do not
necessarily provide anonymity. We quantify the anonymity of the location
data obfuscated by the planar Laplacian mechanism and that by the optimal
geo-indistinguishable mechanism of Bordenabe et al. We empirically show that
the latter provides stronger anonymity than the former in the sense that more
users in the database satisfy k-anonymity. To formalize and analyze such
approximate anonymity we introduce the notion of asymptotic anonymity. Then we
show that the location data obfuscated by the optimal geo-indistinguishable
mechanism can be anonymized by removing a smaller number of users from the
database. Furthermore, we demonstrate that the optimal geo-indistinguishable
mechanism has better utility both for users and for data analysts.
Comment: ISITA'18 conference paper
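For readers unfamiliar with the planar Laplacian mechanism mentioned above, here is a minimal Python sketch of how a location can be obfuscated with it. It assumes planar coordinates in arbitrary units (not latitude and longitude) and uses the fact that, for a noise density proportional to exp(-ε·d), the angle is uniform and the radius follows a Gamma distribution with shape 2 and scale 1/ε. The function name `planar_laplace` is an assumption of this example, and the optimal mechanism of Bordenabe et al. is not sketched here.

```python
import math
import random


def planar_laplace(x, y, epsilon, rng):
    """Return (x, y) perturbed by planar Laplacian noise.

    The noise density at distance d from the true point is proportional to
    exp(-epsilon * d). In polar coordinates the angle is uniform and the
    radius has density epsilon**2 * r * exp(-epsilon * r), i.e. a Gamma
    distribution with shape 2 and scale 1 / epsilon.
    """
    theta = rng.uniform(0.0, 2.0 * math.pi)
    r = rng.gammavariate(2.0, 1.0 / epsilon)
    return x + r * math.cos(theta), y + r * math.sin(theta)


if __name__ == "__main__":
    rng = random.Random(42)
    # Smaller epsilon means more noise; the expected displacement is 2 / epsilon.
    for eps in (0.1, 1.0):
        print(eps, planar_laplace(0.0, 0.0, eps, rng))
```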
Parameterized Complexity of the k-anonymity Problem
The problem of publishing personal data without giving up privacy is becoming
increasingly important. An interesting formalization that has been recently
proposed is k-anonymity. This approach requires that the rows of a table
be partitioned into clusters of size at least k and that all the rows in a
cluster become the same tuple after the suppression of some entries. The
natural optimization problem, where the goal is to minimize the number of
suppressed entries, is known to be APX-hard even when the record values are
over a binary alphabet and k = 3, and when the records have length at most 8
and k = 4. In this paper we study how the complexity of the problem is
influenced by different parameters. We first show that the problem is
W[1]-hard when parameterized by the
size of the solution (and the value k). Then we exhibit a fixed-parameter
algorithm for the problem parameterized by the size of the alphabet and
the number of columns. Finally, we investigate the computational (and
approximation) complexity of the k-anonymity problem when the
instance is restricted to records of length bounded by 3 and k = 3. We show
that the problem remains APX-hard under this restriction.
Comment: 22 pages, 2 figures
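To make the objective of the suppression-based formulation concrete, the following Python sketch evaluates a given partition of the rows: within each cluster, every column on which the rows disagree is suppressed in all rows of that cluster, and the cost is the total number of suppressed entries. The function names and the sample table are hypothetical; the sketch only scores a partition supplied by the caller and does not search for an optimal one, which is the hard part studied in the paper.

```python
def suppression_cost(rows, clusters, k):
    """Count the entries that must be suppressed so that all rows in each
    cluster become identical. `clusters` is a list of lists of row indices."""
    total = 0
    for cluster in clusters:
        if len(cluster) < k:
            raise ValueError("every cluster must contain at least k rows")
        members = [rows[i] for i in cluster]
        # A column is suppressed for the whole cluster if its values differ.
        differing = sum(1 for col in zip(*members) if len(set(col)) > 1)
        total += differing * len(members)
    return total


def suppress(rows, clusters):
    """Return a copy of the table with disagreeing columns replaced by '*'."""
    out = list(rows)
    for cluster in clusters:
        members = [rows[i] for i in cluster]
        mask = [len(set(col)) > 1 for col in zip(*members)]
        for i in cluster:
            out[i] = tuple("*" if m else v for v, m in zip(rows[i], mask))
    return out


if __name__ == "__main__":
    # Hypothetical binary table with records of length 3.
    table = [(0, 1, 1), (0, 1, 0), (0, 0, 0),
             (1, 1, 1), (1, 1, 1), (1, 0, 1)]
    partition = [[0, 1, 2], [3, 4, 5]]
    print(suppression_cost(table, partition, k=3))
    for row in suppress(table, partition):
        print(row)
```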