1,994 research outputs found

    A Survey on Privacy in Human Mobility

    Get PDF
    In the last years we have witnessed a pervasive use of location-aware technologies such as vehicular GPS-enabled devices, RFID based tools, mobile phones, etc which generate collection and storing of a large amount of human mobility data. The powerful of this data has been recognized by both the scientific community and the industrial worlds. Human mobility data can be used for different scopes such as urban traffic management, urban planning, urban pollution estimation, etc. Unfortunately, data describing human mobility is sensitive, because people’s whereabouts may allow re-identification of individuals in a de-identified database and the access to the places visited by individuals may enable the inference of sensitive information such as religious belief, sexual preferences, health conditions, and so on. The literature reports many approaches aimed at overcoming privacy issues in mobility data, thus in this survey we discuss the advancements on privacy-preserving mobility data publishing. We first describe the adversarial attack and privacy models typically taken into consideration for mobility data, then we present frameworks for the privacy risk assessment and finally, we discuss three main categories of privacy-preserving strategies: methods based on anonymization of mobility data, methods based on the differential privacy models and methods which protect privacy by exploiting generative models for synthetic trajectory generation

    A survey on privacy in human mobility

    Get PDF
    In the last years we have witnessed a pervasive use of location-aware technologies such as vehicular GPS-enabled devices, RFID based tools, mobile phones, etc which generate collection and storing of a large amount of human mobility data. The powerful of this data has been recognized by both the scientific community and the industrial worlds. Human mobility data can be used for different scopes such as urban traffic management, urban planning, urban pollution estimation, etc. Unfortunately, data describing human mobility is sensitive, because people's whereabouts may allow re-identification of individuals in a de-identified database and the access to the places visited by indi-viduals may enable the inference of sensitive information such as religious belief, sexual preferences, health conditions, and so on. The literature reports many approaches aimed at overcoming privacy issues in mobility data, thus in this survey we discuss the advancements on privacy-preserving mo-bility data publishing. We first describe the adversarial attack and privacy models typically taken into consideration for mobility data, then we present frameworks for the privacy risk assessment and finally, we discuss three main categories of privacy-preserving strategies: methods based on anonymization of mobility data, methods based on the differential privacy models and methods which protect privacy by exploiting generative models for synthetic trajectory generation

    Privacy-preserving data sharing infrastructures for medical research: systematization and comparison

    Get PDF
    Background: Data sharing is considered a crucial part of modern medical research. Unfortunately, despite its advantages, it often faces obstacles, especially data privacy challenges. As a result, various approaches and infrastructures have been developed that aim to ensure that patients and research participants remain anonymous when data is shared. However, privacy protection typically comes at a cost, e.g. restrictions regarding the types of analyses that can be performed on shared data. What is lacking is a systematization making the trade-offs taken by different approaches transparent. The aim of the work described in this paper was to develop a systematization for the degree of privacy protection provided and the trade-offs taken by different data sharing methods. Based on this contribution, we categorized popular data sharing approaches and identified research gaps by analyzing combinations of promising properties and features that are not yet supported by existing approaches. Methods: The systematization consists of different axes. Three axes relate to privacy protection aspects and were adopted from the popular Five Safes Framework: (1) safe data, addressing privacy at the input level, (2) safe settings, addressing privacy during shared processing, and (3) safe outputs, addressing privacy protection of analysis results. Three additional axes address the usefulness of approaches: (4) support for de-duplication, to enable the reconciliation of data belonging to the same individuals, (5) flexibility, to be able to adapt to different data analysis requirements, and (6) scalability, to maintain performance with increasing complexity of shared data or common analysis processes. Results: Using the systematization, we identified three different categories of approaches: distributed data analyses, which exchange anonymous aggregated data, secure multi-party computation protocols, which exchange encrypted data, and data enclaves, which store pooled individual-level data in secure environments for access for analysis purposes. We identified important research gaps, including a lack of approaches enabling the de-duplication of horizontally distributed data or providing a high degree of flexibility. Conclusions: There are fundamental differences between different data sharing approaches and several gaps in their functionality that may be interesting to investigate in future work. Our systematization can make the properties of privacy-preserving data sharing infrastructures more transparent and support decision makers and regulatory authorities with a better understanding of the trade-offs taken

    User's Privacy in Recommendation Systems Applying Online Social Network Data, A Survey and Taxonomy

    Full text link
    Recommender systems have become an integral part of many social networks and extract knowledge from a user's personal and sensitive data both explicitly, with the user's knowledge, and implicitly. This trend has created major privacy concerns as users are mostly unaware of what data and how much data is being used and how securely it is used. In this context, several works have been done to address privacy concerns for usage in online social network data and by recommender systems. This paper surveys the main privacy concerns, measurements and privacy-preserving techniques used in large-scale online social networks and recommender systems. It is based on historical works on security, privacy-preserving, statistical modeling, and datasets to provide an overview of the technical difficulties and problems associated with privacy preserving in online social networks.Comment: 26 pages, IET book chapter on big data recommender system

    Innovative Verfahren für die standortübergreifende Datennutzung in der medizinischen Forschung

    Get PDF
    Implementing modern data-driven medical research approaches ("Artificial intelligence", "Data Science") requires access to large amounts of data ("Big Data"). Typically, this can only be achieved through cross-institutional data use and exchange ("Data Sharing"). In this process, the protection of the privacy of patients and probands affected is a central challenge. Various methods can be used to meet this challenge, such as anonymization or federation. However, data sharing is currently put into practice only to a limited extent, although it is demanded and promoted from many sides. One reason for this is the lack of clarity about the advantages and disadvantages of different data sharing approaches. The first goal of this thesis was to develop an instrument that makes these advantages and disadvantages more transparent. The instrument systematizes approaches based on two dimensions - utility and protection - where each dimension is further differentiated with three axes describing different aspects of the dimensions, such as the degree of privacy protection provided by the results of performed analyses or the flexibility of a platform regarding the types of analyses that can be performed. The instrument was used for evaluation purposes to analyze the status quo and to identify gaps and potentials for innovative approaches. Next, and as a second goal, an innovative tool for the practical use of cryptographic data sharing methods has been designed and implemented. So far, such approaches are only rarely used in practice due to two main obstacles: (1) the technical complexity of setting up a cryptography-based data sharing infrastructure and (2) a lack of user-friendliness of cryptographic data sharing methods, especially for medical researchers. The tool EasySMPC, which was developed as part of this work, is characterized by the fact that it allows cryptographically secure computation of sums (e.g., frequencies of diagnoses) across institutional boundaries based on an easy-to-use graphical user interface. Neither technical expertise nor the deployment of specific infrastructure components is necessary for its practical use. The practicability of EasySMPC was analyzed experimentally in a detailed performance evaluation.Moderne datengetriebene medizinische Forschungsansätze („Künstliche Intelligenz“, „Data Science“) benötigen große Datenmengen („Big Data“). Dies kann im Regelfall nur durch eine institutionsübergreifende Datennutzung erreicht werden („Data Sharing“). Datenschutz und der Schutz der Privatsphäre der Betroffenen ist dabei eine zentrale Herausforderung. Um dieser zu begegnen, können verschiedene Methoden, wie etwa Anonymisierungsverfahren oder föderierte Auswertungen, eingesetzt werden. Allerdings findet Data Sharing in der Praxis nur selten statt, obwohl es von vielen Seiten gefordert und gefördert wird. Ein Grund hierfür ist die Unklarheit ¸über Vor- und Nachteile verschiedener Data Sharing-Ansätze. Erstes Ziel dieser Arbeit war es, ein Instrument zu entwickeln, welches diese Vor- und Nachteile transparent macht. Das Instrument bewertet Ansätze anhand von zwei Dimensionen - Nutzen und Schutz - wobei jede Dimension mit drei Achsen weiter differenziert ist. Die Achsen bestehen etwa aus dem Grad des Schutzes der Privatsphäre, der durch die Ergebnisse der durchgeführten Analysen gewährleistet wird oder der Flexibilität einer Plattform hinsichtlich der Arten von Analysen, die durchgeführt werden können. Das Instrument wurde zu Evaluationszwecken für die Analyse des Status Quo sowie zur Identifikation von Lücken und Potenzialen für innovative Verfahren eingesetzt. Als zweites Ziel wurde anschließend ein innovatives Werkzeug für den praktischen Einsatz von kryptographischen Data Sharing-Verfahren entwickelt. Der Einsatz entsprechender Ansätze scheitert bisher vor allem an zwei Barrieren: (1) der technischen Komplexität beim Aufbau einer Kryptographie-basierten Data Sharing-Infrastruktur und (2) der Benutzerfreundlichkeit kryptographischer Data Sharing-Verfahren, insbesondere für medizinische Forschende. Das neue Werkzeug EasySMPC zeichnet sich dadurch aus, dass es eine kryptographisch sichere Berechnung von Summen (beispielsweise Häufigkeiten von Diagnosen) über Institutionsgrenzen hinweg auf Basis einer einfach zu bedienenden graphischen Benutzeroberfläche ermöglicht. Zur Anwendung ist weder technische Expertise noch der Aufbau spezieller Infrastrukturkomponenten notwendig. Die Praxistauglichkeit von EasySMPC wurde in einer ausführlichen Performance-Evaluation experimentell analysiert

    Semantic-Based Policy Composition for Privacy-Demanding Data Linkage

    Get PDF
    Record linkage can be used to support current and future health research across populations however such approaches give rise to many challenges related to patient privacy and confidentiality including inference attacks. To address this, we present a semantic-based policy framework where linkage privacy detects attribute associations that can lead to inference disclosure issues. To illustrate the effectiveness of the approach, we present a case study exploring health data combining spatial, ethnicity and language information from several major on-going projects occurring across Australia. Compared with classic access control models, the results show that our proposal outperforms other approaches with regards to effectiveness, reliability and subsequent data utility

    Spatio-Temporal Linkage over Location-Enhanced Services

    Full text link
    corecore