
    From the Information Bottleneck to the Privacy Funnel

    We focus on the privacy-utility trade-off encountered by users who wish to disclose information that is correlated with their private data to an analyst, in the hope of receiving some utility. We rely on a general statistical inference framework for privacy, under which data is transformed, according to a probabilistic privacy mapping, before it is disclosed. We show that when the log-loss is used in this framework as both the privacy metric and the distortion metric, the privacy leakage and the utility constraint reduce to the mutual information between private data and disclosed data, and between non-private data and disclosed data, respectively. We justify the relevance and generality of the log-loss privacy metric by proving that the inference threat under any bounded cost function can be upper-bounded by an explicit function of the mutual information between private data and disclosed data. We then show that the privacy-utility trade-off under the log-loss can be cast as the non-convex Privacy Funnel optimization, and we leverage its connection to the Information Bottleneck to provide a greedy algorithm that is locally optimal. We evaluate its performance on the US Census dataset.
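A minimal sketch of a greedy pass over the Privacy Funnel, assuming a discrete joint distribution p(X, Y) with X private and Y the data to be disclosed, and a deterministic coarsening of Y as the privacy mapping. The function names (`mutual_information`, `greedy_privacy_funnel`) and the pairwise-merge heuristic are illustrative assumptions in the spirit of the abstract, not the paper's exact algorithm:

```python
import numpy as np

def mutual_information(joint):
    """I(A;B) in nats, from a joint probability table p(a, b)."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float((joint[mask] * np.log(joint[mask] / (pa @ pb)[mask])).sum())

def greedy_privacy_funnel(p_xy, utility_floor):
    """Greedy pairwise-merge sketch of the Privacy Funnel.

    p_xy[i, j] = Pr(X=i, Y=j); X is private, Y is the data to disclose.
    Start from the identity mapping W = Y and repeatedly merge the pair of
    W-values that yields the lowest leakage I(X; W) while keeping the
    utility I(Y; W) >= utility_floor. Merging can only shrink both mutual
    informations, so each step weakly improves privacy at some utility cost.
    Returns the final partition of Y's values (one list per W-value).
    """
    cells = [[j] for j in range(p_xy.shape[1])]  # W starts as a copy of Y

    def tables(cs):
        # p(x, w): sum the Y-columns belonging to each cell of W.
        p_xw = np.stack([p_xy[:, c].sum(axis=1) for c in cs], axis=1)
        # p(y, w): W is a deterministic function of Y.
        p_yw = np.zeros((p_xy.shape[1], len(cs)))
        py = p_xy.sum(axis=0)
        for k, c in enumerate(cs):
            p_yw[c, k] = py[c]
        return p_xw, p_yw

    while len(cells) > 1:
        best = None
        for a in range(len(cells)):
            for b in range(a + 1, len(cells)):
                merged = [c for i, c in enumerate(cells) if i not in (a, b)]
                merged.append(cells[a] + cells[b])
                p_xw, p_yw = tables(merged)
                if mutual_information(p_yw) < utility_floor:
                    continue  # this merge would violate the utility constraint
                leak = mutual_information(p_xw)
                if best is None or leak < best[0]:
                    best = (leak, merged)
        if best is None:
            break  # no admissible merge remains: a local optimum
        cells = best[1]
    return cells
```

The stopping rule mirrors the locally optimal character claimed in the abstract: the loop halts as soon as every remaining merge would push I(Y; W) below the utility floor, with no guarantee of reaching the global optimum of the non-convex problem.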

    Data Privacy and Dignitary Privacy: Google Spain, the Right To Be Forgotten, and the Construction of the Public Sphere

    The 2014 decision of the European Court of Justice in Google Spain controversially held that the fair information practices set forth in European Union (EU) Directive 95/46/EC (Directive) require that Google remove from search results links to websites that contain true information. Google Spain held that the Directive gives persons a "right to be forgotten." At stake in Google Spain are values that involve both privacy and freedom of expression. Google Spain badly analyzes both.

    With regard to the latter, Google Spain fails to recognize that the circulation of texts of common interest among strangers makes possible the emergence of a "public" capable of forming the "public opinion" that is essential for democratic self-governance. As the rise of American newspapers in the nineteenth and twentieth centuries demonstrates, the press underwrites the public sphere by creating a structure of communication both responsive to public curiosity and independent of the content of any particular news story. Google, even though it is not itself an author, sustains the contemporary virtual public sphere by creating an analogous structure of communication.

    With regard to privacy values, EU law, like the laws of many nations, recognizes two distinct forms of privacy. The first is data privacy, which is protected by the fair information practices contained in the Directive. These practices regulate the processing of personal information to ensure (among other things) that such information is used only for the specified purposes for which it has been legally gathered. Data privacy operates according to an instrumental logic, and it seeks to endow persons with "control" over their personal data. Data subjects need not demonstrate harm in order to establish violations of data privacy. The second form of privacy recognized by EU law is dignitary privacy. Article 7 of the Charter of Fundamental Rights of the European Union protects the dignity of persons by regulating inappropriate communications that threaten to degrade, humiliate, or mortify them. Dignitary privacy follows a normative logic designed to prevent harm to personality caused by the violation of civility rules. These are the same privacy values as those safeguarded by the American tort of public disclosure of private facts. Throughout the world, courts protect dignitary privacy by balancing the harm that a communication may cause to personality against legitimate public interests in the communication.

    The instrumental logic of data privacy is inapplicable to public discourse, which is why the Directive contains derogations for journalistic activities. The communicative action characteristic of the public sphere is made up of intersubjective dialogue, which is antithetical both to the instrumental rationality of data privacy and to its aspiration to ensure individual control of personal information. Because the Google search engine underwrites the public sphere in which public discourse takes place, Google Spain should not have applied fair information practices to Google searches. But the Google Spain opinion also invokes Article 7, and in the end the decision creates doctrinal rules that roughly approximate those used to protect dignitary privacy. The Google Spain opinion is thus deeply confused about the kind of privacy it wishes to protect. It is impossible to ascertain whether the decision seeks to protect data privacy or dignitary privacy.

    Google Spain is ultimately pushed in the direction of dignitary privacy because data privacy is incompatible with public discourse, whereas dignitary privacy may be reconciled with the requirements of public discourse. Insofar as freedom of expression is valued because it fosters democratic self-government, public discourse cannot serve as an effective instrument of self-determination without a modicum of civility. Yet the Google Spain decision recognizes dignitary privacy only in a rudimentary and unsatisfactory way. If it had more clearly focused on the requirements of dignitary privacy, Google Spain would not so sharply have distinguished Google links from the underlying websites to which they refer. Google Spain would not have blithely outsourced the enforcement of the right to be forgotten to a private corporation like Google.

    Privacy-enhancing Aggregation of Internet of Things Data via Sensors Grouping

    Big data collection practices using Internet of Things (IoT) pervasive technologies are often privacy-intrusive and result in surveillance, profiling, and discriminatory actions against citizens, which in turn undermine the participation of citizens in the development of sustainable smart cities. Nevertheless, real-time data analytics and aggregate information from IoT devices open up tremendous opportunities for managing smart city infrastructures. The privacy-enhancing aggregation of distributed sensor data, such as residential energy consumption or traffic information, is the research focus of this paper. Citizens have the option to choose their privacy level by reducing the quality of the shared data, at the cost of lower accuracy in data analytics services. A baseline scenario is considered in which IoT sensor data are shared directly with an untrustworthy central aggregator. A grouping mechanism is introduced that improves privacy by first aggregating data at the group level, as opposed to sharing data directly with the central aggregator. Group-level aggregation obfuscates the sensor data of individuals, in a fashion similar to differential privacy and homomorphic encryption schemes, so that inferring privacy-sensitive information from single sensors becomes computationally harder than in the baseline scenario. The proposed system is evaluated using real-world data from two smart city pilot projects. Privacy under grouping increases while the accuracy of the baseline scenario is preserved. Intra-group influences of one group member's privacy on the others are measured, and privacy fairness is found to be maximized among group members with similar privacy choices. Several grouping strategies are compared; grouping by proximity of privacy choices provides the highest privacy gains. The implications of this strategy for the design of incentive mechanisms are discussed.
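A minimal sketch of the group-level aggregation idea, assuming simple summation as the aggregate function. The function name `aggregate_with_grouping` and the random grouping strategy are illustrative assumptions, not the paper's implementation (the paper compares several strategies and finds grouping by proximity of privacy choices best):

```python
import random

def aggregate_with_grouping(readings, group_size):
    """Sketch of privacy-enhancing aggregation via sensor grouping.

    readings: dict mapping sensor id -> shared value (e.g. residential
    energy consumption). Sensors are partitioned into groups; each group
    sums its members' values locally and forwards only the group total,
    so the untrusted central aggregator never observes an individual
    sensor's data, yet the grand total it computes remains exact.
    """
    sensor_ids = list(readings)
    random.shuffle(sensor_ids)  # random grouping, for illustration only
    groups = [sensor_ids[i:i + group_size]
              for i in range(0, len(sensor_ids), group_size)]
    group_totals = [sum(readings[s] for s in g) for g in groups]
    return sum(group_totals), group_totals
```

Because only group totals leave the group, recovering a single sensor's reading requires inverting a sum over `group_size` members, which is what makes inference harder here than in the baseline of sharing each reading directly with the aggregator.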