47 research outputs found

    Anonymity in Health-oriented Social Networks

    The Web 2.0 philosophy of user-generated content has spread to many domains, including healthcare. A new trend in social networks is to serve specialized, niche-oriented participants with unique interests. We study the advantages, disadvantages, and concerns of deploying a social network in the health domain, to be used for sharing information about treatment outcomes and conditions. Our paper raises the privacy concern, namely anonymization, that arises when different stakeholders use such a platform.

    k-ANONYMITY: A MODEL FOR PROTECTING PRIVACY


    A Clustering K-Anonymity Privacy-Preserving Method for Wearable IoT Devices

    Wearable technology is one of the greatest applications of the Internet of Things. The popularity of wearable devices has led to a massive scale of personal (user-specific) data. Generally, data holders (manufacturers) of wearable devices are willing to share these data with others to obtain benefits. However, significant privacy concerns arise when the data are shared with third parties in an improper manner. In this paper, we first propose a specific threat model for the data-sharing process of wearable devices' data. We then propose a K-anonymity method based on clustering to preserve the privacy of wearable IoT devices' data while guaranteeing the usability of the collected data. Experimental results demonstrate the effectiveness of the proposed method.
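    The clustering idea can be sketched minimally as follows: sort records on a quasi-identifier, group them into clusters of at least k, and generalize each value to its cluster's range. This is an illustrative baseline under simplifying assumptions (a single numeric quasi-identifier, greedy equal-size clustering), not the paper's actual algorithm.

```python
def cluster_k_anonymize(ages, k):
    """Greedy sketch of clustering-based k-anonymity on one numeric
    quasi-identifier: sort, chunk into groups of >= k, and replace
    each value with its group's [min, max] range."""
    order = sorted(range(len(ages)), key=lambda i: ages[i])
    clusters = [order[i:i + k] for i in range(0, len(order), k)]
    # Merge an undersized trailing cluster into its predecessor.
    if len(clusters) > 1 and len(clusters[-1]) < k:
        clusters[-2].extend(clusters.pop())
    out = [None] * len(ages)
    for cluster in clusters:
        lo, hi = ages[cluster[0]], ages[cluster[-1]]  # sorted order
        for i in cluster:
            out[i] = (lo, hi)
    return out
```

    After this transformation every published range is shared by at least k records, so no record is distinguishable from its cluster-mates on this attribute.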

    Anonymity-preserving location data publishing

    Advances in wireless communication and positioning technology have made possible the identification of a user's location and hence the collection of large volumes of personal location data. While such data are useful to many organizations, making them publicly accessible is generally prohibited because location data may imply sensitive private information. This thesis investigates the challenges inherent in publishing location data while preserving the location privacy of data subjects. Since location data itself may lead to subject re-identification, simply removing user identity from location data is not sufficient for anonymity preservation, and other measures must be employed. We provide a literature survey and discuss limitations of related work on this problem. We then propose a novel location depersonalization technique that produces efficient depersonalization of large volumes of location data. The proposed technique is evaluated using simulation. Our study shows that it is possible to guarantee a desired level of anonymity protection while allowing accurate location data to be published.
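    A common baseline for location depersonalization (a sketch of the general idea, not the thesis's specific technique) is spatial cloaking: snap each point to a coarse grid cell and release a cell only when at least k users fall in it.

```python
from collections import Counter

def cloak(points, cell, k):
    """Grid-based spatial cloaking sketch: map each (x, y) point to a
    cell of side `cell`, then release only cells shared by >= k users;
    points in sparser cells are suppressed (returned as None)."""
    cells = [(int(x // cell), int(y // cell)) for x, y in points]
    counts = Counter(cells)
    return [c if counts[c] >= k else None for c in cells]
```

    Coarser cells give stronger anonymity at the cost of location accuracy, which is exactly the privacy/utility tradeoff the abstract describes.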

    Computational disclosure control: a primer on data privacy protection

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001. Includes bibliographical references (leaves 213-216) and index.
    Today's globally networked society places great demand on the dissemination and sharing of person-specific data for many new and exciting uses. When these data are linked together, they provide an electronic shadow of a person or organization that is as identifying and personal as a fingerprint, even when the information contains no explicit identifiers such as name and phone number. Other distinctive data, such as birth date and ZIP code, often combine uniquely and can be linked to publicly available information to re-identify individuals. Producing anonymous data that remain specific enough to be useful is often a very difficult task, and practice today tends either to believe incorrectly that confidentiality is maintained when it is not, or to produce data that are practically useless. The goal of the work presented in this book is to explore computational techniques for releasing useful information in such a way that the identity of any individual or entity contained in the data cannot be recognized while the data remain practically useful. I begin by demonstrating ways to learn information about entities from publicly available information. I then provide a formal framework for reasoning about disclosure control and the ability to infer the identities of entities contained within the data. I formally define and present null-map, k-map, and wrong-map as models of protection. Each model provides protection by ensuring that released information maps to no, k, or incorrect entities, respectively. The book ends by examining four computational systems that attempt to maintain privacy while releasing electronic information.
    These systems are: (1) my Scrub System, which locates personally identifying information in letters between doctors and notes written by clinicians; (2) my Datafly II System, which generalizes and suppresses values in field-structured data sets; (3) Statistics Netherlands' µ-Argus System, which is becoming a European standard for producing public-use data; and (4) my k-Similar algorithm, which finds optimal solutions such that data are minimally distorted while still providing adequate protection. By introducing anonymity and quality metrics, I show that Datafly II can overprotect data, that Scrub and µ-Argus can fail to provide adequate protection, and that k-Similar finds optimal results. By Latanya Sweeney, Ph.D.
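    The generalize-and-suppress strategy the abstract attributes to Datafly II can be illustrated with a minimal single-column sketch (my own simplification, not Sweeney's full algorithm): coarsen ZIP codes digit by digit until every value occurs at least k times, suppressing any value that never reaches k.

```python
from collections import Counter

def generalize_zips(zips, k):
    """Full-domain generalization sketch: replace trailing ZIP digits
    with '*' (one more digit per level) until the column is
    k-anonymous; values still rarer than k after full generalization
    are suppressed (None)."""
    zips = list(zips)
    width = len(zips[0])
    for level in range(width):
        if all(c >= k for c in Counter(zips).values()):
            return zips
        zips = [z[:width - level - 1] + "*" * (level + 1) for z in zips]
    counts = Counter(zips)
    return [z if counts[z] >= k else None for z in zips]
```

    Because every value in the column is generalized to the same level (full-domain generalization), utility can be lost even for records that were already safe, which is the "overprotection" the abstract measures.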

    Utility-Based Privacy Preserving Data Publishing

    Advances in data collection techniques and the need for automation have triggered a proliferation of huge amounts of data. This exponential increase in the collection of personal information has for some time represented a serious threat to privacy. With the advancement of technologies for data storage, data mining, machine learning, social networking, and cloud computing, the problem is further fueled. Privacy is a fundamental right of every human being and needs to be preserved. As a counterbalance to these socio-technical transformations, most nations have both general policies on preserving privacy and specific legislation to control access to and use of data. Privacy-preserving data publishing is the ability to control the dissemination and use of one's personal information. Merely publishing (or sharing) original data in raw form results in identity disclosure through linkage attacks. To overcome linkage attacks, techniques of statistical disclosure control are employed. One such approach is k-anonymity, which reduces data across a set of key variables to a set of classes. In a k-anonymized dataset each record is indistinguishable from at least k-1 others, meaning that an attacker cannot link the data records to population units with certainty, thus reducing the probability of disclosure. Algorithms that have been proposed to enforce k-anonymity include Samarati's algorithm and Sweeney's Datafly algorithm. Both adhere to full-domain generalization with global recoding. These methods involve a tradeoff between utility, computing time, and information loss. A good privacy-preserving technique should ensure a balance of utility and privacy, giving good performance and a low level of uncertainty. In this thesis, we propose an improved greedy heuristic that maintains a balance between utility, privacy, computing time, and information loss. Given a dataset and k, the dataset can be made k-anonymous by the above-mentioned schemes.
    One of the challenges is to find the best value of k for a given dataset; in this thesis, a scheme is proposed to achieve this. The k-anonymity scheme suffers from the homogeneity attack. As a result, the l-diversity scheme was developed; it states that each equivalence class should contain at least l distinct values of the sensitive attribute. The l-diversity scheme in turn suffers from the background-knowledge attack. To address this problem, the t-closeness scheme was proposed. The t-closeness principle states that the distance between the distribution of the sensitive attribute in an equivalence class and its distribution in the whole table should not exceed a threshold t. The drawback of this scheme is that the distance measure commonly deployed in constructing a t-closeness table does not satisfy the properties of a metric. In this thesis, we deploy an alternative distance measure, the Hellinger metric, for constructing a t-closeness table. The t-closeness scheme with this alternative distance metric performed better with respect to the discernibility metric and computing time. The k-anonymity, l-diversity, and t-closeness schemes can be used to anonymize a dataset before publishing (releasing or sharing), generally in a static environment. There are also data that need to be published in a dynamic environment; one such example is a social network. Anonymizing social networks poses great challenges, and solutions suggested to date do not consider the utility of the data while anonymizing. In this thesis, we propose a novel scheme that anonymizes users depending on their importance while taking utility into consideration. The importance of a node is decided by centrality and prestige measures. Hence, the utility and privacy of the users are balanced.
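    The Hellinger distance mentioned above is a true metric on discrete distributions (symmetric and satisfying the triangle inequality), which is the property the thesis relies on. A minimal sketch of a t-closeness check built on it, under the assumption that the sensitive-attribute distributions are given as aligned probability vectors:

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions given as
    aligned probability vectors; ranges from 0 (identical) to 1
    (disjoint support)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2
                         for a, b in zip(p, q))) / math.sqrt(2)

def satisfies_t_closeness(class_dist, table_dist, t):
    """An equivalence class satisfies t-closeness when its
    sensitive-value distribution lies within distance t of the
    whole table's distribution."""
    return hellinger(class_dist, table_dist) <= t
```

    A table satisfies t-closeness when this check holds for every equivalence class; with a genuine metric, distances between classes can also be bounded via the triangle inequality.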