5,331 research outputs found

    Privacy Preserving Data Publishing

    Get PDF
    Recent years have witnessed increasing interest among researchers in protecting individual privacy in the big data era, involving social media, genomics, and Internet of Things. Recent studies have revealed numerous privacy threats and privacy protection methodologies, that vary across a broad range of applications. To date, however, there exists no powerful methodologies in addressing challenges from: high-dimension data, high-correlation data and powerful attackers. In this dissertation, two critical problems will be investigated: the prospects and some challenges for elucidating the attack capabilities of attackers in mining individualsā€™ private information; and methodologies that can be used to protect against such inference attacks, while guaranteeing significant data utility. First, this dissertation has proposed a series of works regarding inference attacks laying emphasis on protecting against powerful adversaries with auxiliary information. In the context of genomic data, data dimensions and computation feasibility is highly challenging in conducting data analysis. This dissertation proved that the proposed attack can effectively infer the values of the unknown SNPs and traits in linear complexity, which dramatically improve the computation cost compared with traditional methods with exponential computation cost. Second, putting differential privacy guarantee into high-dimension and high-correlation data remains a challenging problem, due to high-sensitivity, output scalability and signal-to-noise ratio. Consider there are tens-of-millions of genomes in a human DNA, it is infeasible for traditional methods to introduce noise to sanitize genomic data. This dissertation has proposed a series of works and demonstrated that the proposed differentially private method satisfies differential privacy; moreover, data utility is improved compared with the states of the arts by largely lowering data sensitivity. Third, putting privacy guarantee into social data publishing remains a challenging problem, due to tradeoff requirements between data privacy and utility. This dissertation has proposed a series of works and demonstrated that the proposed methods can effectively realize privacy-utility tradeoff in data publishing. Finally, two future research topics are proposed. The first topic is about Privacy Preserving Data Collection and Processing for Internet of Things. The second topic is to study Privacy Preserving Big Data Aggregation. They are motivated by the newly proposed data mining, artificial intelligence and cybersecurity methods

    Improved Generalization for Secure Data Publishing

    Get PDF
    In data publishing, privacy and utility are essential for data owners and users respectively, which cannot coexist well. This incompatibility puts the data privacy researchers under an obligation to find newer and reliable privacy preserving tradeoff-techniques. Data providers like many public and private organizations (e.g. hospitals and banks) publish microdata of individuals for various research purposes. Publishing microdata may compromise the privacy of individuals. To prevent the privacy of individuals, data must be published after removing personal identifiers like name and social security numbers. Removal of the personal identifiers appears as not enough to protect the privacy of individuals. K-anonymity model is used to publish microdata by preserving the individual's privacy through generalization. There exist many state-of-the-arts generalization-based techniques, which deal with pre-defined attacks like background knowledge attack, similarity attack, probability attack and so on. However, existing generalization-based techniques compromise the data utility while ensuring privacy. It is an open question to find an efficient technique that is able to set a trade-off between privacy and utility. In this paper, we discussed existing generalization hierarchies and their limitations in detail. We have also proposed three new generalization techniques including conventional generalization hierarchies, divisors based generalization hierarchies and cardinality-based generalization hierarchies. Extensive experiments on the real-world dataset acknowledge that our technique outperforms among the existing techniques in terms of better utility

    A Utility-Theoretic Approach to Privacy in Online Services

    Get PDF
    Online offerings such as web search, news portals, and e-commerce applications face the challenge of providing high-quality service to a large, heterogeneous user base. Recent efforts have highlighted the potential to improve performance by introducing methods to personalize services based on special knowledge about users and their context. For example, a user's demographics, location, and past search and browsing may be useful in enhancing the results offered in response to web search queries. However, reasonable concerns about privacy by both users, providers, and government agencies acting on behalf of citizens, may limit access by services to such information. We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service. We focus on the example of web search and formulate realistic objective functions for search efficacy and privacy. We demonstrate how we can find a provably near-optimal optimization of the utility-privacy tradeoff in an efficient manner. We evaluate our methodology on data drawn from a log of the search activity of volunteer participants. We separately assess usersā€™ preferences about privacy and utility via a large-scale survey, aimed at eliciting preferences about peoplesā€™ willingness to trade the sharing of personal data in returns for gains in search efficiency. We show that a significant level of personalization can be achieved using a relatively small amount of information about users
    • ā€¦
    corecore