164 research outputs found

    Injecting Uncertainty in Graphs for Identity Obfuscation

    Data collected nowadays by social-networking applications create fascinating opportunities for building novel services, as well as for expanding our understanding of social structures and their dynamics. Unfortunately, publishing social-network graphs is considered an ill-advised practice due to privacy concerns. To alleviate this problem, several anonymization methods have been proposed, aiming at reducing the risk of a privacy breach on the published data while still allowing the data to be analyzed and relevant conclusions to be drawn. In this paper we introduce a new anonymization approach that is based on injecting uncertainty in social graphs and publishing the resulting uncertain graphs. While existing approaches obfuscate graph data by adding or removing edges entirely, we propose using a finer-grained perturbation that adds or removes edges partially: this way we can achieve the same desired level of obfuscation with smaller changes in the data, thus maintaining higher utility. Our experiments on real-world networks confirm that, at the same level of identity obfuscation, our method provides higher usefulness than existing randomized methods that publish standard graphs. Comment: VLDB201
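To make the idea of an uncertain graph concrete, the following is a minimal sketch and not the paper's algorithm. It assumes an uncertain graph can be represented as a mapping from candidate edges to existence probabilities; the names `uncertain_edges`, `expected_degree`, and `sample_world` and all probability values are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical uncertain graph: every candidate edge carries an existence
# probability instead of being simply present or absent.
uncertain_edges = {
    ("a", "b"): 0.9,  # an original edge, partially "removed"
    ("a", "c"): 0.2,  # a non-edge, partially "added"
    ("b", "c"): 0.7,
}

def expected_degree(edges, node):
    """Expected degree: sum of the probabilities of the edges touching `node`."""
    return sum(p for (u, v), p in edges.items() if node in (u, v))

def sample_world(edges, rng):
    """Draw one ordinary graph by keeping each edge independently with its probability."""
    return {edge for edge, p in edges.items() if rng.random() < p}

rng = random.Random(0)
world = sample_world(uncertain_edges, rng)
print(expected_degree(uncertain_edges, "a"), sorted(world))
```

An analyst who receives such a graph can estimate a statistic either directly from the probabilities, as `expected_degree` does, or by averaging over many sampled worlds.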

    On the Privacy and Utility of Social Networks

    Ph.D. (Doctor of Philosophy)

    Generating and sharing differentially private spatio-temporal data using real-world knowledge

    Privacy-preserving spatio-temporal data sharing is vital for addressing many real-world problems, such as managing disease spread or tailoring public services to a population's travel patterns. Differential privacy has become the de facto privacy standard owing to its strong privacy guarantees, although existing mechanisms make very restrictive assumptions regarding what outside knowledge is known beyond the data itself. This limits the practical utility of the private data, and has prevented the widespread deployment of differentially private algorithms in the real world. This thesis aims to show that incorporating publicly available information, such as the road network or characteristics of places of interest, can enhance the practical utility of the output data without negatively affecting privacy. This thesis focuses on two main problems, both of which are fundamental in enabling location analytics with private data. The first considers the synthesis of spatial point data, and three solutions are proposed. The first solution uses a private adaptation of kernel density estimation to generate data within small private partitions, and the second uses the road network as the basis for data generation. The third solution combines randomised response with generative adversarial networks to develop a generative model that satisfies label local differential privacy, a more practical and realistic privacy setting. The second problem focuses on sharing trajectory data using local differential privacy. The proposed solution uses the exponential mechanism to efficiently perturb overlapping, hierarchically structured n-grams of trajectory data, which helps to preserve the spatio-temporal correlations inherent in trajectory data. This problem, and its solution, is then extended to a setting in which two services wish to privately share event sequence data with each other.
All solutions incorporate publicly available external knowledge by imposing hard constraints on feasible outputs, exploiting the intrinsic hierarchies and underlying structures of real-world data, and using distance functions to ensure that semantically similar values are more likely to be output. Experiments with real data show that including this information helps to produce private data that performs very well in many spatio-temporal analytical tasks, including range, hotspot, and facility location queries. These strong results demonstrate the potential for more widespread use of differential privacy in the real world.
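The role of distance functions in making similar values more likely to be output can be sketched with the standard exponential mechanism. This is a generic illustration under stated assumptions, not the thesis's n-gram mechanism: utility is taken to be the negative distance to the true value, the `sensitivity` bound is supplied by the caller, and the function names, candidate values, and parameter values are hypothetical.

```python
import math
import random

def output_probabilities(true_value, candidates, distance, epsilon, sensitivity):
    """Exponential-mechanism probabilities with utility = -distance.

    Each candidate c gets weight exp(-epsilon * d(true_value, c) / (2 * sensitivity)),
    so candidates closer to the true value are more likely to be reported.
    """
    weights = [
        math.exp(-epsilon * distance(true_value, c) / (2.0 * sensitivity))
        for c in candidates
    ]
    total = sum(weights)
    return [w / total for w in weights]

def perturb(true_value, candidates, distance, epsilon, sensitivity, rng):
    """Sample one candidate according to the probabilities above."""
    probs = output_probabilities(true_value, candidates, distance, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]

# Hypothetical one-dimensional "places" restricted to a public list of
# feasible outputs (the hard constraint); distances are at most 10.
candidates = [0.0, 1.0, 2.0, 10.0]
dist = lambda a, b: abs(a - b)
probs = output_probabilities(1.0, candidates, dist, epsilon=2.0, sensitivity=10.0)
reported = perturb(1.0, candidates, dist, 2.0, 10.0, random.Random(0))
print(probs, reported)
```

Restricting `candidates` to publicly known feasible values plays the part of a hard constraint, while the distance-based weights favour semantically close outputs.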

    Privacy in RFID and mobile objects

    RFID systems allow the rapid and automatic identification of RFID tags over a wireless communication channel. These tags are devices with some computing power and information storage capacity. Objects with an RFID tag attached therefore allow a rich and varied set of data describing and characterising them to be read, for example a unique identification code, the name, the model, or the expiry date. Moreover, this information can be read without line of sight between the reader and the tag, which considerably speeds up inventory, identification, and automatic control processes. For RFID technology to be adopted widely, several objectives should be met: efficiency, security, and privacy protection. However, designing secure, private, and scalable identification protocols is a difficult challenge given the computational constraints of RFID tags and their wireless nature. In this thesis we therefore start from secure and private identification protocols and show how scalability can be achieved through a distributed and collaborative architecture. In this way, security and privacy are achieved by the identification protocol itself, while scalability is achieved through novel collaborative methods that consider the spatial and temporal position of RFID tags. Regardless of advances in wireless identification protocols, there are attacks that can defeat any of these protocols without needing to know or discover valid secret keys or to find vulnerabilities in their cryptographic implementations.
The idea of these attacks, known as relay attacks, is to covertly create a communication bridge between a legitimate tag and a legitimate reader. The adversary thereby uses the rights of the legitimate tag to pass the authentication protocol used by the reader. Note that, given the wireless nature of RFID protocols, this type of attack is a significant threat to security in RFID systems. In this thesis we propose a new protocol that, in addition to authentication, checks the distance between the reader and the tag. Protocols of this type are known as distance-bounding protocols; they do not prevent such attacks, but they can thwart them with high probability. Finally, we address the privacy problems associated with publishing information collected through RFID systems. In particular, we focus on mobility data, which can also be provided by other widely used systems such as the Global Positioning System (GPS) and the Global System for Mobile Communications. Our solution is based on the well-known notion of k-anonymity, achieved through permutations and microaggregation. To this end, we define a novel distance function between trajectories, with which we develop two different trajectory anonymisation methods.
Radio Frequency Identification (RFID) is a technology aimed at efficiently identifying and tracking goods and assets. Such identification may be performed without requiring line-of-sight alignment or physical contact between the RFID tag and the RFID reader, whilst tracking is naturally achieved due to the short interrogation field of RFID readers. That is why the reduction in price of RFID tags has been accompanied by increasing attention to this technology. However, since tags are resource-constrained devices sending identification data wirelessly, designing secure and private RFID identification protocols is a challenging task. This scenario is even more complex when scalability must be met by those protocols. Assuming the existence of a lightweight, secure, private and scalable RFID identification protocol, there exist other concerns surrounding the RFID technology. Some of them arise from the technology itself, such as distance checking, but others are related to the potential of RFID systems to gather huge amounts of tracking data.
Publishing and mining such moving-object data is essential to improve the efficiency of supervisory control, asset management and localisation, transportation, etc. However, obvious privacy threats arise if an individual can be linked with some of those published trajectories. The present dissertation contributes to the design of algorithms and protocols aimed at dealing with the issues explained above. First, we propose a set of protocols and heuristics based on a distributed architecture that improve the efficiency of the identification process without compromising privacy or security. Moreover, we present a novel distance-bounding protocol based on graphs that consumes very few resources. Finally, we present two trajectory anonymisation methods aimed at preserving individuals' privacy when their trajectories are released.
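As a rough illustration of how microaggregation can yield k-anonymous trajectories, the sketch below groups trajectories into clusters of at least k members and releases each cluster's centroid. It is not the dissertation's method: it assumes equal-length trajectories of (x, y) points, uses the mean pointwise Euclidean distance in place of the novel distance function mentioned above, and uses a simple greedy grouping. All function names and data are hypothetical.

```python
import math

def traj_distance(t1, t2):
    """Assumed stand-in distance: mean Euclidean distance between aligned points."""
    return sum(math.dist(p, q) for p, q in zip(t1, t2)) / len(t1)

def centroid(cluster):
    """Pointwise average of equal-length trajectories."""
    n = len(cluster)
    return [
        (sum(t[i][0] for t in cluster) / n, sum(t[i][1] for t in cluster) / n)
        for i in range(len(cluster[0]))
    ]

def microaggregate(trajs, k):
    """Greedy grouping into clusters of size >= k; each member is replaced by its centroid."""
    if len(trajs) < k:
        raise ValueError("need at least k trajectories")
    remaining = list(range(len(trajs)))
    clusters = []
    # Stop early enough that the final leftover group still has at least k members.
    while len(remaining) >= 2 * k:
        seed = remaining[0]
        nearest = sorted(
            remaining[1:], key=lambda j: traj_distance(trajs[seed], trajs[j])
        )[: k - 1]
        group = [seed] + nearest
        clusters.append(group)
        remaining = [j for j in remaining if j not in group]
    clusters.append(remaining)
    released = [None] * len(trajs)
    for group in clusters:
        c = centroid([trajs[j] for j in group])
        for j in group:
            released[j] = c
    return released

trajs = [
    [(0, 0), (1, 0)],
    [(10, 10), (11, 10)],
    [(0, 1), (1, 1)],
    [(10, 11), (11, 11)],
]
released = microaggregate(trajs, k=2)
print(released)
```

Because every released trajectory is shared by at least k individuals, none of them can be singled out from the published set on the basis of the trajectory alone.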

    Secrecy and performance models for query processing on outsourced graph data

    Database outsourcing poses a challenge for data secrecy. Even if an adversary, including the service provider, accesses the data, she should not be able to learn any information from the accessed data. In this paper, we address this problem for graph-structured data. First, we define a secrecy notion for graph-structured data based on the concepts of indistinguishability and searchable encryption. To satisfy this notion, we propose an approach based on bucketization, which additionally makes use of obfuscated indexes and encryption. We show that finding an optimal bucketization tailored to graph-structured data is NP-hard; therefore, we come up with a heuristic. We prove that the proposed bucketization approach fulfills our secrecy notion. In addition, we present a performance model for scale-free networks which consists of (1) a number-of-buckets model that estimates the number of buckets obtained after applying our bucketization approach and (2) a query-cost model. Finally, we demonstrate with a set of experiments the accuracy of our number-of-buckets model and the efficiency of our approach with respect to query processing.

    Asymmetric structure-preserving subgraph query for large graphs

    One fundamental type of query for graph databases is the subgraph isomorphism query (a.k.a. subgraph query). Due to the computational hardness of subgraph queries coupled with the cost of managing massive graph data, outsourcing the query computation to a third-party service provider has been an economical and scalable approach. However, confidentiality is known to be an important attribute of Quality of Service (QoS) in Query as a Service (QaaS). In this paper, we propose the first practical private approach for subgraph query services, asymmetric structure-preserving subgraph query processing, where the data graph is publicly known and the query structure/topology is kept secret. Unlike previous methods for subgraph queries, this paper proposes a series of novel optimizations that only exploit graph structures, not the queries. Further, we propose a robust query encoding and adopt a novel cyclic-group-based encryption so that query processing is transformed into a series of private matrix operations. Our experiments confirm that our techniques are efficient and the optimizations are effective.

    SoK: differentially private publication of trajectory data

    Trajectory analysis holds many promises, from improvements in traffic management to routing advice or infrastructure development. However, learning users' paths is extremely privacy-invasive. Therefore, there is a necessity to protect trajectories such that we preserve the global properties, useful for analysis, while specific and private information of individuals remains inaccessible. Trajectories, however, are difficult to protect, since they are sequential, highly dimensional, correlated, bound to geophysical restrictions, and easily mapped to semantic points of interest. This paper aims to establish a systematic framework on protective masking measures for trajectory databases with differentially private (DP) guarantees, including also utility properties, derived from ideas and limitations of existing proposals. To reach this goal, we systematize the utility metrics used throughout the literature, deeply analyze the DP granularity notions, explore and elaborate on the state of the art on privacy-enhancing mechanisms and their problems, and expose the main limitations of DP notions in the context of trajectories. We would like to thank the reviewers and shepherd for their useful comments and suggestions in the improvement of this paper. Javier Parra-Arnau is the recipient of a "Ramón y Cajal" fellowship funded by the Spanish Ministry of Science and Innovation. This work also received support from "la Caixa" Foundation (fellowship code LCF/BQ/PR20/11770009), the European Union's H2020 program (Marie Skłodowska-Curie grant agreement № 847648), from the Government of Spain under the project "COMPROMISE" (PID2020-113795RB-C31/AEI/10.13039/501100011033), and from the BMBF project "PROPOLIS" (16KIS1393K). The authors at KIT are supported by KASTEL Security Research Labs (Topic 46.23 of the Helmholtz Association) and Germany's Excellence Strategy (EXC 2050/1 'CeTI'; ID 390696704). Peer Reviewed. Postprint (published version)

    SoK: Differentially Private Publication of Trajectory Data

    Trajectory analysis holds many promises, from improvements in traffic management to routing advice or infrastructure development. However, learning users' paths is extremely privacy-invasive. Therefore, there is a necessity to protect trajectories such that we preserve the global properties, useful for analysis, while specific and private information of individuals remains inaccessible. Trajectories, however, are difficult to protect, since they are sequential, highly dimensional, correlated, bound to geophysical restrictions, and easily mapped to semantic points of interest. This paper aims to establish a systematic framework on protective masking and synthetic-generation measures for trajectory databases with syntactic and differentially private (DP) guarantees, including also utility properties, derived from ideas and limitations of existing proposals. To reach this goal, we systematize the utility metrics used throughout the literature, deeply analyze the DP granularity notions, explore and elaborate on the state of the art on privacy-enhancing mechanisms and their problems, and expose the main limitations of DP notions in the context of trajectories.