
    Predicting and Explaining Privacy Risk Exposure in Mobility Data

    Mobility data are a proxy for different social dynamics, and their analysis enables a wide range of user services. Unfortunately, mobility data are very sensitive because sharing people’s whereabouts may raise serious privacy concerns. Existing frameworks for privacy risk assessment provide tools to identify and measure privacy risks, but they often (i) have high computational complexity; and (ii) are not able to provide users with a justification of the reported risks. In this paper, we propose expert, a new framework for the prediction and explanation of privacy risk on mobility data. We empirically evaluate privacy risk on real data, simulating a privacy attack with a state-of-the-art privacy risk assessment framework. We then extract individual mobility profiles from the data for predicting their risk. We compare the performance of several machine learning algorithms in order to identify the best approach for our task. Finally, we show how it is possible to explain privacy risk prediction on real data, using two algorithms: Shap, a feature importance-based method, and Lore, a rule-based method. Overall, expert is able to provide a user with the privacy risk and an explanation of the risk itself. The experiments show excellent performance for the prediction task.

    Big Data Research in Italy: A Perspective

    The aim of this article is to briefly describe the research projects that a selection of Italian universities is undertaking in the context of big data. Far from being exhaustive, this article offers a sample of distinct applications that address the issue of managing huge amounts of data in Italy, collected in relation to diverse domains.

    Privacy by Design in Distributed Mobility Data

    Movement data are sensitive, because people’s whereabouts may allow re-identification of individuals in a de-identified database and thus can potentially reveal intimate personal traits, such as religious or sexual preferences. In this thesis, we focus on a distributed setting in which movement data from individual vehicles are collected and aggregated by a centralized station. We propose a novel approach to privacy-preserving analytical processing within such a distributed setting, and tackle the problem of obtaining aggregated traffic information while preventing privacy leakage from data collection and aggregation. We study and analyze three different solutions based on the differential privacy model and on sketching techniques for efficient data compression. Each solution achieves a different trade-off between privacy protection and utility of the transformed data. Using real-life data, we demonstrate the effectiveness of our approaches in terms of data utility preserved by the data transformation, thus bringing empirical evidence to the fact that the privacy-by-design paradigm in big data analysis has the potential of delivering high data protection combined with high quality even in massively distributed techno-social systems.

    Privacy protection in location based services

    This thesis takes a multidisciplinary approach to understanding the characteristics of Location Based Services (LBS) and the protection of location information in these transactions. This thesis reviews the state of the art and theoretical approaches in Regulations, Geographic Information Science, and Computer Science. Motivated by the importance of location privacy in the current age of mobile devices, this thesis argues that failure to ensure privacy protection in this context is a violation of human rights and is detrimental to the freedom of users as individuals. Since location information has unique characteristics, existing methods for protecting other types of information are not suitable for geographical transactions. This thesis demonstrates methods that safeguard location information in location based services and that enable geospatial analysis. Through a taxonomy, the characteristics of LBS and privacy techniques are examined and contrasted. Moreover, mechanisms for privacy protection in LBS are presented and the resulting data is tested with different geospatial analysis tools to verify the possibility of conducting these analyses even with protected location information. By discussing the results and conclusions of these studies, this thesis provides an agenda for understanding the usability of obfuscated geospatial data and the feasibility of implementing the proposed mechanisms in privacy concerning LBS, as well as for releasing crowdsourced geographic information to third parties.

    Building and evaluating privacy-preserving data processing systems

    Large-scale data processing raises a number of important challenges, including guaranteeing that collected or published data is not misused, preventing disclosure of sensitive information, and deploying privacy protection frameworks that support usable and scalable services. In this dissertation, we study and build systems geared for privacy-friendly data processing, enabling computational scenarios and applications in which potentially sensitive data can be used to extract useful knowledge, and which would be impossible without such strong privacy guarantees. For instance, we show how to privately and efficiently aggregate data from many sources and large streams, and how to use the aggregates to extract useful statistics and train simple machine learning models. We also present a novel technique for privately releasing generative machine learning models and entire high-dimensional datasets produced by these models. Finally, we demonstrate that the data used by participants in training generative and collaborative learning models may be vulnerable to inference attacks and discuss possible mitigation strategies.

    Privacy-preserving distributed movement data aggregation

    We propose a novel approach to privacy-preserving analytical processing within a distributed setting, and tackle the problem of obtaining aggregated information about vehicle traffic in a city from movement data collected by individual vehicles and shipped to a central server. Movement data are sensitive because people’s whereabouts have the potential to reveal intimate personal traits, such as religious or sexual preferences, and may allow re-identification of individuals in a database. We provide a privacy-preserving framework for movement data aggregation based on trajectory generalization in a distributed environment. The proposed solution, based on the differential privacy model and on sketching techniques for efficient data compression, provides a formal data protection safeguard. Using real-life data, we demonstrate the effectiveness of our approach also in terms of data utility preserved by the data transformation.