
    Evaluating Privacy-Friendly Mobility Analytics on Aggregate Location Data

    Information about people's movements and the locations they visit enables a wide range of mobility analytics applications, e.g., real-time traffic maps or urban planning, aiming to improve quality of life in modern smart cities. Alas, the availability of users' fine-grained location data reveals sensitive information about them, such as their home and work places, lifestyle, and political or religious inclinations. In an attempt to mitigate this, aggregation is often employed as a strategy that allows analytics and machine learning tasks while protecting the privacy of individual users' location traces. In this thesis, we perform an end-to-end evaluation of crowdsourced privacy-friendly location aggregation, aiming to understand its usefulness for analytics as well as its privacy implications for the users who contribute their data. First, we present a time-series methodology which, along with privacy-friendly crowdsourcing of aggregate locations, supports mobility analytics such as traffic forecasting and mobility anomaly detection. Next, we design quantification frameworks and methodologies that let us reason about the privacy loss stemming from the collection or release of aggregate location information against knowledgeable adversaries that aim to infer users' profiles, locations, or membership. We then use these frameworks to evaluate defenses that can be employed to prevent inferences on aggregate location statistics, ranging from generalization and hiding to differential privacy, in terms of both privacy protection and utility loss for analytics tasks. Our results highlight that, while location aggregation is useful for mobility analytics, it is a weak privacy protection mechanism in this setting, and that additional defenses can only protect privacy if some statistical utility is sacrificed. Overall, the tools presented in this thesis can be used by providers who wish to assess the quality of privacy protection before data release, and our results have several implications for current location data practices and applications.
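    As a concrete illustration of the differential-privacy defense evaluated in this thesis, the sketch below perturbs per-region aggregate visit counts with the Laplace mechanism. This is a minimal sketch, not the thesis's actual pipeline: the region counts, the sensitivity of 1 (each user contributes at most one visit per region per time slot), and the epsilon value are assumptions chosen for illustration.

        import numpy as np

        def laplace_perturb(counts, epsilon, sensitivity=1.0):
            # Add Laplace noise to aggregate visit counts; with sensitivity 1 this gives
            # epsilon-DP when each user contributes at most one visit per region per slot.
            scale = sensitivity / epsilon
            noisy = counts + np.random.laplace(loc=0.0, scale=scale, size=counts.shape)
            return np.clip(np.rint(noisy), 0, None)  # keep counts as non-negative integers

        # Hypothetical aggregate: users observed in each of five regions during one time slot.
        true_counts = np.array([120, 43, 7, 260, 18], dtype=float)
        print(laplace_perturb(true_counts, epsilon=0.5))

    Lower epsilon values add more noise, trading statistical utility (e.g., traffic-forecasting accuracy) for stronger protection, which is exactly the trade-off the thesis quantifies.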

    Quantifying Sample Anonymity in Score-Based Generative Models with Adversarial Fingerprinting

    Recent advances in score-based generative models have led to a surge in downstream applications of generative models, ranging from data augmentation and image and video generation to anomaly detection. Although trained models are publicly available, their potential for privacy-preserving data sharing has not yet been fully explored. Training diffusion models on private data and disseminating the models and weights, rather than the raw dataset, paves the way for innovative large-scale data-sharing strategies, particularly in healthcare, where safeguarding patients' personal health information is paramount. However, publishing such models without the individual consent of, e.g., the patients from whom the data was acquired necessitates guarantees that identifiable training samples will never be reproduced, thus protecting personal health data and satisfying the requirements of policymakers and regulatory bodies. This paper introduces a method for estimating the upper bound of the probability of reproducing identifiable training images during the sampling process. This is achieved by designing an adversarial approach that searches for anatomic fingerprints, such as medical devices or dermal art, which could potentially be employed to re-identify training images. Our method harnesses the learned score-based model to estimate the probability of the entire subspace of the score function that may be utilized for one-to-one reproduction of training samples. To validate our estimates, we generate anomalies containing a fingerprint and investigate whether generated samples from trained generative models can be uniquely mapped to the original training samples. Overall, our results show that privacy-breaching images are reproduced at sampling time if the models were trained without care.
    Comment: 10 pages, 6 figures
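    The paper's estimator works on the learned score function itself, but the validation step it describes, checking whether generated samples can be uniquely mapped back to fingerprinted training images, can be sketched as a simple nearest-neighbour duplicate check. The array shapes, the mean-squared-error metric, and the threshold below are illustrative assumptions, not the authors' implementation.

        import numpy as np

        def flag_memorized(generated, training, threshold=0.05):
            # generated: (G, H, W) samples drawn from the trained diffusion model
            # training:  (T, H, W) fingerprinted training images
            # threshold: assumed MSE cutoff below which a sample counts as a reproduction
            gen = generated.reshape(len(generated), -1)
            train = training.reshape(len(training), -1)
            matches = []
            for i, g in enumerate(gen):
                mse = np.mean((train - g) ** 2, axis=1)    # distance to every training image
                j = int(np.argmin(mse))
                if mse[j] < threshold:
                    matches.append((i, j, float(mse[j])))  # generated i reproduces training j
            return matches

    A non-empty result would indicate that identifiable training content leaks at sampling time.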

    The role of Signal Processing in Meeting Privacy Challenges [an overview]

    With the increasing growth and sophistication of information technology, personal information is easily accessible electronically. This flood of released personal data raises important privacy concerns. However, electronic data sources exist to be used and have tremendous value (utility) to their users and collectors, leading to a tension between privacy and utility. This article aims to quantify that tension by means of an information-theoretic framework and to motivate signal processing approaches to privacy problems. The framework is applied to a number of case studies to illustrate concretely how signal processing can be harnessed to provide data privacy.
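    In such information-theoretic treatments, privacy leakage is commonly measured as the mutual information between a private attribute and the released (distorted) data. The sketch below computes that quantity for a discrete toy distribution; the joint probabilities are made up for illustration and are not taken from the article's case studies.

        import numpy as np

        def mutual_information(p_xy):
            # I(X;Y) in bits for a discrete joint distribution p_xy[i, j] = P(X=i, Y=j).
            p_x = p_xy.sum(axis=1, keepdims=True)
            p_y = p_xy.sum(axis=0, keepdims=True)
            mask = p_xy > 0
            return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x @ p_y)[mask])))

        # Hypothetical example: X is a private attribute, Y is the released data.
        joint = np.array([[0.35, 0.15],
                          [0.10, 0.40]])
        print(mutual_information(joint))  # leakage in bits; 0 means the release reveals nothing about X

    A privacy-aware release mechanism then aims to keep this leakage low while bounding a separate utility measure, such as the distortion between the released and original data.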

    Adversarial Robustness in Unsupervised Machine Learning: A Systematic Review

    As the adoption of machine learning models increases, ensuring that models are robust against adversarial attacks is increasingly important. With unsupervised machine learning gaining more attention, ensuring it is robust against attacks is vital. This paper conducts a systematic literature review on the robustness of unsupervised learning, collecting 86 papers. Our results show that most research focuses on privacy attacks, for which effective defenses exist; however, many attacks still lack effective and general defensive measures. Based on these results, we formulate a model of the properties of attacks on unsupervised learning, contributing a model that future research can build on.
    Comment: 38 pages, 11 figures
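    To give a flavour of the privacy attacks that dominate the reviewed literature, the sketch below mounts a toy membership-inference attack on a clustering model by thresholding each point's distance to its nearest centroid. The data, the threshold calibration, and the choice of KMeans are illustrative assumptions and are not drawn from any specific paper in the review.

        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        members = rng.normal(size=(200, 5))      # points used to fit the model
        non_members = rng.normal(size=(200, 5))  # held-out points

        km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(members)

        def min_centroid_distance(model, x):
            # Distance from each point to its nearest cluster centre.
            return model.transform(x).min(axis=1)

        # Thresholding attack: points unusually close to a centroid are guessed to be members.
        threshold = np.median(min_centroid_distance(km, non_members))  # assumed calibration
        guessed = min_centroid_distance(km, members) < threshold
        print("fraction of true members flagged:", guessed.mean())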