289 research outputs found

    On the anonymity risk of time-varying user profiles.

    Websites and applications use personalisation services to profile their users, collect their patterns and activities and eventually use this data to provide tailored suggestions. User preferences and social interactions are therefore aggregated and analysed. Every time a user publishes a new post or creates a link with another entity, either another user or some online resource, new information is added to the user profile. Exposing private data does not only reveal information about single users’ preferences, increasing their privacy risk, but can expose more about their network than single actors intended. This mechanism is self-evident in social networks where users receive suggestions based on their friends’ activities. We propose an information-theoretic approach to measure the differential update of the anonymity risk of time-varying user profiles. This expresses how privacy is affected when new content is posted and how much third-party services get to know about the users when a new activity is shared. We use actual Facebook data to show how our model can be applied to a real-world scenario. Peer Reviewed. Postprint (published version).

    Privacy protection of user profiles in personalized information systems

    In recent times we are witnessing the emergence of a wide variety of information systems that tailor the information-exchange functionality to meet the specific interests of their users. Most of these personalized information systems capitalize on, or lend themselves to, the construction of profiles, either directly declared by a user or inferred from past activity. The ability of these systems to profile users is therefore what enables such intelligent functionality, but at the same time, it is the source of serious privacy concerns. Although there exists a broad range of privacy-enhancing technologies aimed at mitigating many of those concerns, their use is far from widespread. The main reason is that there is a certain ambiguity about these technologies and their effectiveness in terms of privacy protection. Besides, since these technologies normally come at the expense of system functionality and utility, it is challenging to assess whether the gain in privacy compensates for the costs in utility. Assessing the privacy provided by a privacy-enhancing technology is thus crucial to determine its overall benefit, to compare its effectiveness with other technologies, and ultimately to optimize it in terms of the privacy-utility trade-off posed. Considerable effort has consequently been devoted to investigating both privacy and utility metrics. However, most of these metrics are specific to concrete systems and adversary models, and hence are difficult to generalize or translate to other contexts. Moreover, in applications involving user profiles, there are few proposals for the evaluation of privacy, and those that exist are not appropriately justified. The first part of this thesis approaches the fundamental problem of quantifying user privacy.
Firstly, we present a theoretical framework for privacy-preserving systems, endowed with a unifying view of privacy in terms of the estimation error incurred by an attacker who aims to disclose the private information that the system is designed to conceal. Our theoretical analysis shows that numerous privacy metrics emerging from a broad spectrum of applications are bijectively related to this estimation error, which permits interpreting and comparing these metrics under a common perspective. Secondly, we tackle the issue of measuring privacy in the enthralling application of personalized information systems. Specifically, we propose two information-theoretic quantities as measures of the privacy of user profiles, and justify these metrics by building on Jaynes' rationale behind entropy-maximization methods and fundamental results from the method of types and hypothesis testing. Equipped with quantifiable measures of privacy and utility, the second part of this thesis investigates privacy-enhancing, data-perturbative mechanisms and architectures for two important classes of personalized information systems. In particular, we study the elimination of tags in semantic-Web applications, and the combination of the forgery and the suppression of ratings in personalized recommendation systems. We design such mechanisms to achieve the optimal privacy-utility trade-off, in the sense of maximizing privacy for a desired utility, or vice versa. We proceed in a systematic fashion by drawing upon the methodology of multiobjective optimization. Our theoretical analysis finds a closed-form solution to the problem of optimal tag suppression, and to the problem of optimal forgery and suppression of ratings. In addition, we provide an extensive theoretical characterization of the trade-off between the contrasting aspects of privacy and utility. 
Experimental results in real-world applications show the effectiveness of our mechanisms in terms of privacy protection, system functionality and data utility.

    A privacy-protecting architecture for collaborative filtering via forgery and suppression of ratings

    Recommendation systems are information-filtering systems that help users deal with information overload. Unfortunately, current recommendation systems prompt serious privacy concerns. In this work, we propose an architecture that protects user privacy in such collaborative-filtering systems, in which users are profiled on the basis of their ratings. Our approach capitalizes on the combination of two perturbative techniques, namely the forgery and the suppression of ratings. In our scenario, users rate those items they have an opinion on. However, in order to avoid privacy risks, they may want to refrain from rating some of those items, and/or rate some items that do not reflect their actual preferences. On the other hand, forgery and suppression may degrade the quality of the recommendation system. Motivated by this, we describe the implementation details of the proposed architecture and present a formulation of the optimal trade-off among privacy, forgery rate and suppression rate. Finally, we provide a numerical example that illustrates our formulation. Peer Reviewed. Postprint (published version).

    Evaluation of Anonymized ONS Queries

    Electronic Product Code (EPC) is the basis of a pervasive infrastructure for the automatic identification of objects in supply chain applications (e.g., pharmaceutical or military applications). This infrastructure relies on the use of (1) Radio Frequency Identification (RFID) technology to tag objects in motion and (2) distributed services providing information about objects via the Internet. A lookup service, called the Object Name Service (ONS) and based on the use of the Domain Name System (DNS), can be publicly accessed by EPC applications looking for information associated with tagged objects. Privacy issues may affect corporate infrastructures based on EPC technologies if their lookup service is not properly protected. A possible solution to mitigate these issues is the use of online anonymity. We present an evaluation experiment that assesses the use of Tor (The second generation Onion Router) on a global ONS/DNS setup, with respect to benefits, limitations, and latency. Comment: 14 pages.

    A privacy-protecting architecture for recommendation systems via the suppression of ratings

    Recommendation systems are information-filtering systems that help users deal with information overload. Unfortunately, current recommendation systems prompt serious privacy concerns. In this work, we propose an architecture that enables users to enhance their privacy in those systems that profile users on the basis of the items rated. Our approach capitalizes on a conceptually simple perturbative technique, namely the suppression of ratings. In our scenario, users rate those items they have an opinion on. However, in order to avoid being accurately profiled, they may want to refrain from rating certain items. Consequently, this technique protects user privacy to a certain extent, but at the cost of a degradation in the accuracy of the recommendation. We measure privacy risk as the Kullback-Leibler divergence between the user's and the population's rating distribution, a privacy criterion that we proposed in previous work. The justification of such a criterion is our second contribution. Concretely, we thoroughly interpret it by elaborating on the intimate connection between the celebrated method of entropy maximization and the use of entropies and divergences as measures of privacy. The ultimate purpose of this justification is to attempt to bridge the gap between the privacy and the information-theoretic communities by substantially adapting some technicalities of our original work to reach a wider audience, not intimately familiar with information theory and the method of types. Lastly, we present a formulation of the optimal trade-off between privacy and suppression rate, which allows us to formally specify one of the functional blocks of the proposed architecture. Peer Reviewed. Preprint.
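
The privacy criterion named in this abstract, the Kullback-Leibler divergence between a user's rating distribution and the population's, can be illustrated with a minimal sketch. The category counts, the population distribution and the helper names below are hypothetical and not taken from the paper; the sketch also assumes every population probability is strictly positive.

```python
import math


def normalize(counts):
    """Turn raw rating counts per category into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(q, p):
    """D(q || p) in bits; terms with q_i == 0 contribute nothing.

    Assumes p_i > 0 wherever q_i > 0 (hypothetical data below satisfies this).
    """
    return sum(qi * math.log2(qi / pi) for qi, pi in zip(q, p) if qi > 0)


# Hypothetical data: three item categories.
population = [0.4, 0.3, 0.3]   # assumed population rating distribution
user_counts = [8, 1, 1]        # the user rates category 0 far more than average

risk_before = kl_divergence(normalize(user_counts), population)

# Suppression: the user withholds 5 of the 8 ratings in category 0,
# so the apparent profile moves closer to the population's.
suppressed_counts = [3, 1, 1]
risk_after = kl_divergence(normalize(suppressed_counts), population)

print(f"risk before suppression: {risk_before:.3f} bits")
print(f"risk after suppression:  {risk_after:.3f} bits")
```

In this toy case the divergence drops after suppression, at the price of five fewer ratings reaching the recommender, which is the privacy versus suppression-rate trade-off the abstract refers to.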

    Optimal forgery and suppression of ratings for privacy enhancement in recommendation systems

    Recommendation systems are information-filtering systems that tailor information to users on the basis of knowledge about their preferences. The ability of these systems to profile users is what enables such intelligent functionality, but at the same time, it is the source of serious privacy concerns. In this paper we investigate a privacy-enhancing technology that aims at hindering an attacker in its efforts to accurately profile users based on the items they rate. Our approach capitalizes on the combination of two perturbative mechanisms: the forgery and the suppression of ratings. While this technique enhances user privacy to a certain extent, it inevitably comes at the cost of a loss in data utility, namely a degradation of the recommendation’s accuracy. In short, it poses a trade-off between privacy and utility. The theoretical analysis of this trade-off is the object of this work. We measure privacy as the Kullback-Leibler divergence between the user’s and the population’s item distributions, and quantify utility as the proportion of ratings users consent to forge and eliminate. Equipped with these quantitative measures, we find a closed-form solution to the problem of optimal forgery and suppression of ratings, an optimization problem that includes, as a particular case, the maximization of the entropy of the perturbed profile. We characterize the optimal trade-off surface among privacy, forgery rate and suppression rate, and experimentally evaluate how our approach could contribute to privacy protection in a real-world recommendation system. Peer Reviewed. Postprint (published version).
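
One way to write down the optimization problem described in this abstract is sketched below. The notation is an assumption made for illustration and may differ from the paper's: $q$ is the user's item distribution, $p$ the population's, $r$ and $s$ are nonnegative forgery and suppression vectors, and $\rho$, $\sigma$ are the forgery and suppression rates.

```latex
% Perturbed (apparent) profile after forging r and suppressing s (assumed notation)
t = \frac{q + r - s}{1 + \rho - \sigma},
\qquad \sum_i r_i = \rho, \quad \sum_i s_i = \sigma, \quad r_i \ge 0, \quad 0 \le s_i \le q_i + r_i .

% Privacy risk as a function of the two rates: the best achievable divergence
\mathcal{R}(\rho, \sigma) = \min_{r,\, s} \; D(t \,\|\, p),
\qquad D(t \,\|\, p) = \sum_i t_i \log \frac{t_i}{p_i} .

% Special case: uniform population p = u over n categories
D(t \,\|\, u) = \sum_i t_i \log (n\, t_i) = \log n - H(t) .
```

The last line shows why entropy maximization appears as a particular case: when the population distribution is uniform, minimizing the divergence is the same as maximizing the entropy $H(t)$ of the perturbed profile.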

    p-probabilistic k-anonymous microaggregation for the anonymization of surveys with uncertain participation

    We develop a probabilistic variant of k-anonymous microaggregation, which we term p-probabilistic, resorting to a statistical model of respondent participation in order to aggregate quasi-identifiers in such a manner that k-anonymity is concordantly enforced with a parametric probabilistic guarantee. Succinctly, owing to the possibility that some respondents may not finally participate, sufficiently larger cells are created, striving to satisfy k-anonymity with probability at least p. The microaggregation function is designed before the respondents submit their confidential data. More precisely, a specification of the function is sent to them, which they may verify and apply to their quasi-identifying demographic variables prior to submitting the microaggregated data along with the confidential attributes to an authorized repository. We propose a number of metrics to assess the performance of our probabilistic approach in terms of anonymity and distortion, which we proceed to investigate theoretically in depth and empirically with synthetic and standardized data. We stress that in addition to constituting a functional extension of traditional microaggregation, thereby broadening its applicability to the anonymization of statistical databases in a wide variety of contexts, the relaxation of trust assumptions is arguably expected to have a considerable impact on user acceptance and ultimately on data utility through mere availability. Peer Reviewed. Postprint (author's final draft).
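
The idea of enlarging cells so that k-anonymity holds with probability at least p can be sketched under a deliberately simple assumption that is not stated in the abstract: each invited respondent participates independently with the same probability. The function names and the participation probability are hypothetical; the paper's statistical model may be richer.

```python
from math import comb


def prob_at_least_k(n, k, participation):
    """P(at least k of n invited respondents participate), i.i.d. Bernoulli model."""
    return sum(
        comb(n, j) * participation**j * (1 - participation) ** (n - j)
        for j in range(k, n + 1)
    )


def min_cell_size(k, participation, p):
    """Smallest cell size n whose realized size is >= k with probability >= p."""
    if not 0 < participation <= 1:
        raise ValueError("participation must be in (0, 1]")
    if not 0 < p < 1:
        raise ValueError("p must be in (0, 1)")
    n = k  # a cell can never be smaller than k
    while prob_at_least_k(n, k, participation) < p:
        n += 1
    return n


# Hypothetical example: 3-anonymity, 50% participation, 90% guarantee.
n = min_cell_size(k=3, participation=0.5, p=0.9)
print(n, prob_at_least_k(n, 3, 0.5))
```

With certain participation the required cell size collapses to k, which recovers ordinary k-anonymous microaggregation; lower participation rates force larger cells and therefore more distortion.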

    A Privacy-Preserving Architecture for the Semantic Web Based on Tag Suppression

    We propose an architecture that preserves user privacy in the semantic Web via tag suppression. In tag suppression, users may wish to tag some resources and refrain from tagging some others in order to hinder privacy attackers in their efforts to profile users’ interests. Following this strategy, our architecture helps users decide which tags should be suppressed. We describe the implementation details of the proposed architecture and provide further insight into the modeling of profiles. In addition, we present a mathematical formulation of the optimal tradeoff between privacy and tag suppression rate. Peer Reviewed. Postprint (published version).