
    You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information

    Metadata are associated with most of the information we produce in our daily interactions and communication in the digital world. Yet, surprisingly, metadata are often still categorized as non-sensitive. Indeed, in the past, researchers and practitioners have mainly focused on the problem of identifying a user from the content of a message. In this paper, we use Twitter as a case study to quantify the uniqueness of the association between metadata and user identity and to understand the effectiveness of potential obfuscation strategies. More specifically, we analyze atomic fields in the metadata and systematically combine them in an effort to classify new tweets as belonging to an account, using different machine learning algorithms of increasing complexity. We demonstrate that, through the application of a supervised learning algorithm, we are able to identify any user in a group of 10,000 with approximately 96.7% accuracy. Moreover, if we broaden the scope of our search and consider the 10 most likely candidates, the accuracy of the model rises to 99.22%. We also found that data obfuscation is hard and ineffective for this type of data: even after perturbing 60% of the training data, it is still possible to classify users with an accuracy higher than 95%. These results have strong implications for the design of metadata obfuscation strategies, for example for data set releases, not only for Twitter but, more generally, for most social media platforms.
    Comment: 11 pages, 13 figures. Published in the Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM 2018), June 2018, Stanford, CA, US.
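
    As a rough illustration of the identification pipeline described above, the sketch below trains a classifier on synthetic per-tweet metadata and reports both top-1 and top-10 accuracy. The field set, the random-forest choice, and the data are illustrative assumptions, not the paper's exact setup.

```python
# Sketch: identify accounts from tweet metadata with a supervised
# classifier. The synthetic metadata fields stand in for values such
# as follower/friend/status counts; the paper evaluates several
# algorithms and many more field combinations.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_users, tweets_per_user = 100, 20

# One metadata "profile" per user, perturbed slightly per tweet to
# mimic natural drift between posts.
profiles = rng.lognormal(mean=5, sigma=1.5, size=(n_users, 5))
X = np.repeat(profiles, tweets_per_user, axis=0)
X *= rng.normal(1.0, 0.02, size=X.shape)  # small per-tweet noise
y = np.repeat(np.arange(n_users), tweets_per_user)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Top-1 accuracy: the predicted account is exactly right.
top1 = clf.score(X_te, y_te)
# Top-10 accuracy: the true account is among the 10 most likely candidates.
proba = clf.predict_proba(X_te)
top10 = np.mean([label in clf.classes_[np.argsort(p)[-10:]]
                 for label, p in zip(y_te, proba)])
print(f"top-1: {top1:.3f}  top-10: {top10:.3f}")
```

    Even this toy version tends to reach near-perfect accuracy, which mirrors the paper's point: stable metadata profiles act as a fingerprint.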

    DiffProtect: Generate Adversarial Examples with Diffusion Models for Facial Privacy Protection

    Increasingly pervasive facial recognition (FR) systems raise serious concerns about personal privacy, especially for the billions of users who have publicly shared their photos on social media. Several attempts have been made to protect individuals from being identified by unauthorized FR systems by using adversarial attacks to generate encrypted face images. However, existing methods suffer from poor visual quality or low attack success rates, which limit their utility. Recently, diffusion models have achieved tremendous success in image generation. In this work, we ask: can diffusion models be used to generate adversarial examples that improve both visual quality and attack performance? We propose DiffProtect, which utilizes a diffusion autoencoder to generate semantically meaningful perturbations against FR systems. Extensive experiments demonstrate that DiffProtect produces more natural-looking encrypted images than state-of-the-art methods while achieving significantly higher attack success rates, e.g., 24.5% and 25.1% absolute improvements on the CelebA-HQ and FFHQ datasets.
    Comment: Code will be available at https://github.com/joellliu/DiffProtect
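
    The sketch below illustrates the general idea of latent-space adversarial protection: optimize a small perturbation of a semantic latent code so that a face-recognition embedding no longer matches the original identity. The encoder, decoder, and FR model here are untrained stand-ins for the diffusion autoencoder and a real FR network; the shapes, loss weights, and step counts are assumptions for illustration, not DiffProtect's actual method.

```python
# Conceptual sketch of latent-space adversarial protection: perturb a
# semantic latent code so the face-recognition (FR) embedding of the
# decoded image drifts away from the original identity, while keeping
# the perturbation small to preserve visual quality.
import torch
import torch.nn.functional as F

# Stand-ins for a diffusion autoencoder (encoder/decoder) and FR model.
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 128))
decoder = torch.nn.Sequential(torch.nn.Linear(128, 3 * 64 * 64),
                              torch.nn.Unflatten(1, (3, 64, 64)))
fr_model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 512))

image = torch.rand(1, 3, 64, 64)
target_emb = fr_model(image).detach()   # identity embedding to move away from

z = encoder(image).detach()             # semantic latent code of the image
delta = torch.zeros_like(z, requires_grad=True)
opt = torch.optim.Adam([delta], lr=1e-2)

for _ in range(100):
    protected = decoder(z + delta)
    # Untargeted attack: minimize similarity to the original identity,
    # regularizing the latent change for visual fidelity.
    sim = F.cosine_similarity(fr_model(protected), target_emb).mean()
    loss = sim + 0.1 * delta.norm()
    opt.zero_grad()
    loss.backward()
    opt.step()

print("final cosine similarity to original identity:", sim.item())
```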

    Privacy Intelligence: A Survey on Image Sharing on Online Social Networks

    Image sharing on online social networks (OSNs) has become an indispensable part of daily social activities, but it has also led to an increased risk of privacy invasion. The recent image leaks from popular OSN services and the abuse of personal photos by advanced algorithms (e.g., DeepFake) have prompted the public to rethink individual privacy needs when sharing images on OSNs. However, OSN image sharing itself is relatively complicated, and the systems currently in place to manage privacy in practice are labor-intensive yet fail to provide personalized, accurate and flexible privacy protection. As a result, a more intelligent environment for privacy-friendly OSN image sharing is in demand. To fill this gap, we contribute a systematic survey of 'privacy intelligence' solutions that target modern privacy issues related to OSN image sharing. Specifically, we present a high-level analysis framework based on the entire lifecycle of OSN image sharing to address the various privacy issues and solutions facing this interdisciplinary field. The framework is divided into three main stages: local management, online management and social experience. At each stage, we identify typical sharing-related user behaviors and the privacy issues they generate, and we review representative intelligent solutions. The resulting analysis describes an intelligent privacy-enhancing chain for closed-loop privacy management. We also discuss the challenges and future directions at each stage, as well as in publicly available datasets.
    Comment: 32 pages, 9 figures. Under review.

    The (Co)-Location Sharing Game

    Most popular location-based social networks, such as Facebook and Foursquare, let their (mobile) users post location and co-location (involving other users) information. Such posts bring social benefits to the users who post them, but also to their friends who view them. Yet they also represent a severe threat to the users' privacy, as co-location information introduces interdependences between users. We propose the first game-theoretic framework for analyzing the strategic information-sharing behaviors of users of online social networks (OSNs). To design parametric utility functions that are representative of the users' actual preferences, we also conduct a survey of 250 Facebook users and use conjoint analysis to quantify the users' benefits of sharing vs. viewing (co-)location information and their preference for privacy vs. benefits. Our survey findings reveal a large variation among users in terms of these preferences. We extensively evaluate our framework through data-driven numerical simulations. We study how users' individual preferences influence each other's decisions, we identify several factors that significantly affect these decisions (among which, the mobility data of the users), and we determine situations where dangerous patterns can emerge (e.g., a vicious circle of sharing, or an incentive to over-share), even when the users share similar preferences.
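
    To make the interdependence concrete, here is a toy two-player version of such a sharing game: each user's privacy cost depends on whether either of them shares a co-location, and pure-strategy Nash equilibria are found by enumeration. The utility weights are illustrative placeholders, not the conjoint-analysis estimates from the paper's survey.

```python
# Toy (co-)location sharing game with interdependent privacy costs.
# Strategies are binary (share / don't share); equilibria are found
# by checking best responses over all pure-strategy profiles.
import itertools

def utility(me_share, friend_share,
            benefit=1.0, view_benefit=0.5, privacy_cost=0.8):
    # Sharing a co-location leaks information about *both* users, so
    # my privacy cost is incurred if either of us shares.
    u = benefit * me_share + view_benefit * friend_share
    u -= privacy_cost * max(me_share, friend_share)
    return u

for a, b in itertools.product([0, 1], repeat=2):
    u_a, u_b = utility(a, b), utility(b, a)
    # A profile is a Nash equilibrium if neither player gains by
    # unilaterally flipping their decision.
    best_a = u_a >= utility(1 - a, b)
    best_b = u_b >= utility(1 - b, a)
    tag = "  <- Nash equilibrium" if best_a and best_b else ""
    print(f"share_A={a} share_B={b}  u_A={u_a:.2f} u_B={u_b:.2f}{tag}")
```

    With these weights, mutual sharing is the unique equilibrium even though each player bears a privacy cost either way, a simple instance of the over-sharing incentive the paper identifies.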

    Interdependent and Multi-Subject Privacy: Threats, Analysis and Protection

    In his generally accepted definition of privacy, Alan Westin describes it as an individual's right 'to control, edit, manage, and delete information about them[selves] and decide when, how, and to what extent information is communicated to others.' Privacy, therefore, is an individual and independent human right. Yet the great Mahatma Gandhi once said that 'interdependence is and ought to be as much the ideal of man as self-sufficiency. Man is a social being.' Ensuring this independent right for inherently social beings is difficult, if not impossible. This is especially true as today's world is highly interconnected, technology evolves rapidly, data sharing is increasingly abundant, and regulations do not provide sufficient guidance in the realm of interdependency. In this thesis, we explore the topic of interdependent privacy from an adversarial point of view, by exposing threats, as well as from an end-user point of view, by exploring awareness, preferences and privacy-protection needs.

    First, we quantify the effect of co-locations on location privacy, considering an adversary, such as a social-network operator, that has access to this information: a user can be localized not only through her own reported locations and mobility patterns, but also through those of her friends (and the friends of her friends, and so on). We formalize this problem and propose effective inference algorithms that substantially reduce the complexity of localization attacks that make use of co-locations. Our results show that an adversary can effectively incorporate co-locations into attacks to substantially reduce users' location privacy; this exposes a real and severe threat.

    Second, we investigate the interplay between the privacy risks and the social benefits users face when sharing (co-)locations on OSNs. We propose a game-theoretic framework for analyzing users' strategic behaviors. We conduct a survey of Facebook users and quantify their benefits of sharing vs. viewing information and their preference for privacy vs. benefits. Our survey exposes deficits in users' awareness of privacy risks in OSNs. Our results further show how users' individual preferences influence, sometimes negatively, each other's decisions.

    Third, we consider various types of interdependent and multi-subject data (photos, co-locations, genomes, etc.) that often have privacy implications for data subjects other than the uploader, yet can be shared without their consent or awareness. We propose a system for sharing such data in a consensual and privacy-preserving manner. We implement it for photos, relying on image-processing and cryptographic techniques, as well as on a two-tier architecture. We conduct a survey of Facebook users; it indicates that there is interest in such a system, and that users have increasing privacy concerns due to prejudice or discrimination that they have been, or could easily be, exposed to.

    In conclusion, this thesis provides new insights into users' privacy in the context of interdependence and constitutes a step towards the design of novel privacy-protection mechanisms. It should be seen as a warning message for service providers and regulatory institutions: unless the interdependent aspects of privacy are considered, this fundamental human right can never be guaranteed.
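
    As a toy illustration of why co-locations are so damaging to location privacy, the sketch below shows an adversary intersecting a target's candidate regions with those of a co-located friend. The users, regions, and set-based model are illustrative simplifications of the probabilistic inference algorithms developed in the thesis.

```python
# Toy co-location inference: the adversary narrows a target's possible
# locations by intersecting her own candidate regions with those of
# friends reported as co-located at the same time.
candidate_regions = {
    "alice": {"campus", "cafe", "gym", "home"},  # from Alice's mobility model
    "bob": {"cafe", "office"},                   # from Bob's reported locations
}

def localize(user, colocated_with, candidates):
    # Without co-locations, the adversary only has the user's own candidates.
    posterior = set(candidates[user])
    # A co-location forces both users into a common region at that time.
    for friend in colocated_with:
        posterior &= candidates[friend]
    return posterior

print(localize("alice", [], candidate_regions))       # all four candidates
print(localize("alice", ["bob"], candidate_regions))  # only {'cafe'} remains
```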

    Data-Driven, Personalized Usable Privacy

    We live in the "inverse-privacy" world, where service providers derive insights from users' data that the users themselves do not even know about. This has been fueled by advancements in machine learning technologies, which have allowed providers to go beyond the superficial analysis of users' transactions to the deep inspection of users' content. Users have been facing several problems in coping with this widening information discrepancy. Although the interfaces of apps and websites are generally equipped with privacy indicators (e.g., permissions, policies), these have not been enough to create the counter-effect. We identify three gaps in particular that have hindered the effectiveness and usability of privacy indicators:
    - Scale Adaptation: The scale at which service providers collect data has been growing on multiple fronts, while users have limited time, effort, and technological resources to cope with it.
    - Risk Communication: Although providers utilize privacy indicators to announce what and (less often) why they need particular pieces of information, they rarely relay what can potentially be inferred from this data. Without this knowledge, users are less equipped to make informed decisions when they sign in to a site or install an application.
    - Language Complexity: The information practices of service providers are buried in complex, long privacy policies. Users generally do not have the time, and sometimes lack the skills, to decipher such policies, even when they are interested in particular parts of them.
    In this thesis, we approach usable privacy from a data perspective. Instead of static privacy interfaces that are obscure, recurring, or unreadable, we develop techniques that bridge the understanding gap between users and service providers. Towards that end, we make the following contributions:
    - Crowdsourced, data-driven privacy decision-making: To combat the growing scale of data exposure, we consider the context of files uploaded to cloud services. We propose C3P, a framework for automatically assessing the sensitivity of files, thus enabling real-time, fine-grained policy enforcement on top of unstructured data.
    - Data-driven app privacy indicators: We introduce PrivySeal, a new paradigm of dynamic, personalized app privacy indicators that bridge the risk-understanding gap between users and providers. Through PrivySeal's online platform, we also study the emerging problem of interdependent privacy in the context of cloud apps and provide a usable privacy indicator to mitigate it.
    - Automated question answering about privacy practices: We introduce PriBot, the first automated question-answering system for privacy policies, which allows users to pose questions about the privacy practices of any company in their own words. Through a user study, we show its effectiveness in achieving high accuracy and relevance for users, thus narrowing the complexity gap in navigating privacy policies. A sketch of this retrieval-style approach is given after this list.
    A core aim of this thesis is to pave the way for a future where privacy indicators are not bound to a specific medium or pre-scripted wording. We design and develop techniques that enable privacy to be communicated effectively through interfaces that are approachable to the user. To this end, we go beyond textual interfaces to enable dynamic, visual, and hands-free privacy interfaces that fit the variety of emerging technologies.
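
    As a rough sketch of PriBot-style question answering, the snippet below ranks privacy-policy segments by similarity to a user's free-form question. TF-IDF is a deliberately simple stand-in for the system's actual semantic representations, and the segments and question are made up for illustration.

```python
# Sketch: answer a free-form question about a privacy policy by
# retrieving the most similar policy segment. A bag-of-words TF-IDF
# model is used here purely for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

segments = [
    "We collect your email address and device identifiers when you register.",
    "Location data may be shared with advertising partners.",
    "You can request deletion of your account data at any time.",
]
question = "Do you share my location with third parties?"

# Fit the vocabulary on segments plus the question, then rank segments
# by cosine similarity to the question.
vec = TfidfVectorizer().fit(segments + [question])
sims = cosine_similarity(vec.transform([question]), vec.transform(segments))[0]
best = sims.argmax()
print(f"best match (score {sims[best]:.2f}): {segments[best]}")
```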