    Quantifying Privacy Loss of Human Mobility Graph Topology

    Human mobility is often represented as a mobility network, or graph, with nodes representing places of significance which an individual visits, such as their home, work, places of social amenity, etc., and edge weights corresponding to probability estimates of movements between these places. Previous research has shown that individuals can be identified by a small number of geolocated nodes in their mobility network, rendering mobility trace anonymization a hard task. In this paper we build on prior work and demonstrate that even when all location and timestamp information is removed from nodes, the graph topology of an individual mobility network itself is often uniquely identifying. Further, we observe that a mobility network is often unique, even when only a small number of the most popular nodes and edges are considered. We evaluate our approach using a large dataset of cell-tower location traces from 1,500 smartphone handsets with a mean duration of 430 days. We process the data to derive the top-N places visited by the device in the trace, and find that 93% of traces have a unique top-10 mobility network, and all traces are unique when considering top-15 mobility networks. Since mobility patterns, and therefore mobility networks for an individual, vary over time, we use graph kernel distance functions to determine whether two mobility networks, taken at different points in time, represent the same individual. We then show that our distance metrics, while imperfect predictors, perform significantly better than a random strategy and therefore our approach represents a significant loss in privacy.
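
The construction of a top-N mobility network described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a trace has already been reduced to an ordered list of place labels, and the function name `top_n_mobility_network` and the toy trace are hypothetical. Edge weights are estimated as relative transition frequencies between consecutive visits.

```python
from collections import Counter, defaultdict

def top_n_mobility_network(trace, n):
    """Return {src: {dst: probability}} restricted to the n most-visited places.

    `trace` is an ordered list of place labels (an assumed input format).
    """
    top = {place for place, _ in Counter(trace).most_common(n)}

    # Keep only visits to top-n places and collapse consecutive repeats,
    # so that every remaining step is a movement between two distinct places.
    visits = []
    for place in trace:
        if place in top and (not visits or visits[-1] != place):
            visits.append(place)

    # Count observed transitions between consecutive visits.
    counts = defaultdict(Counter)
    for src, dst in zip(visits, visits[1:]):
        counts[src][dst] += 1

    # Normalise outgoing counts into transition probability estimates.
    return {
        src: {dst: c / sum(dsts.values()) for dst, c in dsts.items()}
        for src, dsts in counts.items()
    }

trace = ["home", "work", "home", "gym", "home", "work", "gym", "cafe"]
network = top_n_mobility_network(trace, 3)
```

In this toy trace "cafe" falls outside the top three places and is dropped. To study topology alone, as the abstract does, the place labels would additionally have to be discarded before comparing networks.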

    The Effects of Ant Colony Optimization on Graph Anonymization

    The growing need to address privacy concerns when social network data is released for mining purposes has recently led to considerable interest in various techniques for graph anonymization. These techniques and definitions, although robust, are sometimes difficult to achieve for large social networks. In this paper, we look at applying ant colony optimization (ACO) to two known versions of social network anonymization, namely k-label sequence anonymity, known to be NP-hard for k ≥ 3. We also apply it to the more recent work of [23] and Label Bag Anonymization. Ants of the artificial colony are able to generate successively shorter tours by using information accumulated in the form of pheromone trails deposited on the edges by the colony's ants. Computer simulations have indicated that ACO is capable of generating good solutions for known hard graph problems. The contributions of this paper are twofold: we look to apply ACO to k-label sequence anonymity and k-label bag based anonymization, and attempt to show the power of applying ACO techniques to social network privacy. Furthermore, we look to build a new foundation of study that, although at a preliminary stage, can lead to groundbreaking results down the road.

    Gene flow and genetic structure in the Galician population (NW Spain) according to Alu insertions

    Abstract

Background

The most recent Alu insertions show different degrees of polymorphism in human populations and have a series of characteristics that make them particularly suitable genetic markers for human biology studies. As a result, these polymorphisms have been used to analyse the origin of, and phylogenetic relationships between, contemporary human groups. This study analyses twelve Alu sequences in a sample of 216 individuals from the autochthonous population of Galicia (NW Spain), with the aim of studying their genetic structure and phylogenetic position with respect to the populations of Western and Central Europe and North Africa. This research is of special interest for understanding European population dynamics, given the peculiarities of the Galician population arising from its geographical situation in western Europe and its historical vicissitudes.

Results

The insertion frequencies of eleven of the Alu elements analysed were within the variability range of European populations, while that of Yb8NBC125 proved to be the lowest recorded to date in Europe.

Taking the twelve polymorphisms into account, the GD value for the Galician population was 0.268. The comparative analyses carried out using the MDS, NJ and AMOVA methods reveal the existence of spatial heterogeneity, and identify three population groups that correspond to the geographic areas of Western-Central Europe, Eastern Mediterranean Europe and North Africa. Galicia is shown to be included in the Western-Central European cluster, together with other Spanish populations. When only populations from Mediterranean Europe are considered, the Galician population shows a degree of gene flow similar to that of the majority of the populations from this geographic area.

Conclusion

The results of this study reveal that the Galician population, despite its geographic situation on the western edge of the European continent, occupies an intermediate position in relation to other European populations in general, and Iberian populations in particular. This confirms the important role that migratory movements have had in the European gene pool, at least since Neolithic times. In turn, the MDS and NJ analyses place Galicia within the group made up of Western-Central European populations, which can be explained by the influence of Germanic peoples on the Galician population during the Middle Ages. However, it should also be noted that some of the markers analysed show a certain degree of differentiation, possibly due to the region's position as a 'cul-de-sac' in terms of Iberian population dynamics.

    Secure Data Collection and Analysis in Smart Health Monitoring

    Smart health monitoring uses real-time monitored data to support diagnosis, treatment, and health decision-making in modern smart healthcare systems, benefiting our daily life. Accurate health monitoring and prompt transmission of health data are facilitated by ever-evolving on-body sensors, wireless communication technologies, and wireless sensing techniques. Although users have experienced the convenience of smart health monitoring, these benefits are accompanied by severe privacy and security concerns about the valuable and sensitive data collected. Data collection, transmission, and analysis are vulnerable to various attacks, e.g., eavesdropping, due to the open nature of wireless media, the resource constraints of sensing devices, and the lack of security protocols. These deficiencies not only make conventional cryptographic methods inapplicable in smart health monitoring but also put many obstacles in the path of designing privacy protection mechanisms. In this dissertation, we design dedicated schemes to achieve secure data collection and analysis in smart health monitoring. The first two works propose two robust and secure authentication schemes based on the electrocardiogram (ECG), which outperform traditional user identity authentication schemes in health monitoring, to restrict access to the collected data to legitimate users. To improve the practicality of ECG-based authentication, we address the nonuniformity and sensitivity of ECG signals, as well as the noise contamination issue. The next work investigates an extended authentication goal, denoted as wearable-user pair authentication. It simultaneously authenticates the user identity and device identity to provide further protection. We exploit the uniqueness of the interference between different wireless protocols, which is common in health monitoring due to devices' varying sensing and transmission demands, and design a wearable-user pair authentication scheme based on the interference.
However, this interference can also cause significant harm. Thus, in the fourth work, we use wireless human activity recognition in health monitoring as an example and analyze how this interference may jeopardize it. We identify a new attack that can produce false recognition results and discuss potential countermeasures against this attack. In the end, we move to a broader scenario and protect the statistics of distributed data reported in mobile crowd sensing, a common practice used in public health monitoring for data collection. We deploy differential privacy to enable the indistinguishability of workers' locations and sensing data without the help of a trusted entity while meeting the accuracy demands of crowd sensing tasks.
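
The idea of protecting crowd-sensed statistics without a trusted entity can be illustrated with a generic sketch. The abstract does not say which differential privacy mechanism the dissertation uses; the code below substitutes the standard Laplace mechanism, applied locally by each worker before reporting. The function names, the reading bounds, and the value of epsilon are all assumptions made for illustration.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb_reading(value, lo, hi, epsilon, rng):
    """Worker side: clip the reading to [lo, hi], then add Laplace noise.

    With readings bounded in [lo, hi], the sensitivity is (hi - lo), so
    noise of scale (hi - lo) / epsilon gives epsilon-differential privacy
    for each individual report (the assumed privacy model).
    """
    clipped = min(max(value, lo), hi)
    return clipped + laplace_noise((hi - lo) / epsilon, rng)

def estimate_mean(reports):
    """Aggregator side: average the noisy reports; the noise is zero-mean."""
    return sum(reports) / len(reports)

rng = random.Random(0)               # fixed seed so the example is repeatable
true_readings = [70.0] * 20000       # hypothetical heart-rate readings
reports = [perturb_reading(r, 40.0, 120.0, 2.0, rng) for r in true_readings]
estimate = estimate_mean(reports)
```

Each single report is heavily perturbed, but because the noise has zero mean the aggregate estimate approaches the true mean as the number of workers grows, which is the accuracy-versus-privacy trade-off the abstract refers to.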

    Privacy in RFID and mobile objects

    RFID systems allow the fast, automatic identification of RFID tags over a wireless communication channel. These tags are devices with some computing power and information storage capacity. Objects carrying an attached RFID tag can therefore be read to obtain a rich and varied set of data that describe and characterize them, for example a unique identification code, the name, the model, or the expiration date. Moreover, this information can be read without line of sight between the reader and the tag, which considerably speeds up inventory, identification, and automatic control processes. For RFID technology to be adopted widely and successfully, several goals should be met: efficiency, security, and privacy protection. However, designing secure, private, and scalable identification protocols is a difficult challenge given the computational constraints of RFID tags and their wireless nature. For this reason, in this thesis we start from secure and private identification protocols and show how scalability can be achieved through a distributed, collaborative architecture. In this way, security and privacy are provided by the identification protocol itself, while scalability is achieved through novel collaborative methods that take into account the spatial and temporal position of the RFID tags. Regardless of advances in wireless identification protocols, there are attacks that can defeat any of these protocols without knowing or discovering valid secret keys and without finding vulnerabilities in their cryptographic implementations. These attacks, known as relay attacks, work by covertly creating a communication bridge between a legitimate tag and a legitimate reader. The adversary thereby uses the rights of the legitimate tag to pass the authentication protocol run by the reader. Given the wireless nature of RFID protocols, this type of attack is a major threat to the security of RFID systems. In this thesis we propose a new protocol that, in addition to authentication, checks the distance between the reader and the tag. Protocols of this kind are known as distance-bounding protocols; they do not prevent relay attacks, but they can thwart them with high probability. Finally, we address the privacy problems associated with publishing information collected through RFID systems. In particular, we focus on mobility data, which can also be supplied by other widely used systems such as the Global Positioning System (GPS) and the Global System for Mobile Communications (GSM). Our solution is based on the well-known notion of k-anonymity, achieved through permutations and microaggregation. To this end, we define a novel distance function between trajectories, with which we develop two different trajectory anonymization methods.
Radio Frequency Identification (RFID) is a technology aimed at efficiently identifying and tracking goods and assets. Such identification may be performed without requiring line-of-sight alignment or physical contact between the RFID tag and the RFID reader, whilst tracking is naturally achieved due to the short interrogation field of RFID readers. That is why the reduction in price of RFID tags has been accompanied by increasing attention paid to this technology. However, since tags are resource-constrained devices sending identification data wirelessly, designing secure and private RFID identification protocols is a challenging task. This scenario is even more complex when scalability must be met by those protocols. Even assuming the existence of a lightweight, secure, private and scalable RFID identification protocol, there exist other concerns surrounding RFID technology. Some of them arise from the technology itself, such as distance checking, but others are related to the potential of RFID systems to gather huge amounts of tracking data.
Publishing and mining such moving-object data is essential to improve the efficiency of supervisory control, asset management and localisation, transportation, etc. However, obvious privacy threats arise if an individual can be linked with some of those published trajectories. The present dissertation contributes to the design of algorithms and protocols aimed at dealing with the issues explained above. First, we propose a set of protocols and heuristics based on a distributed architecture that improve the efficiency of the identification process without compromising privacy or security. Moreover, we present a novel distance-bounding protocol based on graphs that consumes very few resources. Finally, we present two trajectory anonymisation methods aimed at preserving individuals' privacy when their trajectories are released.

    Privacy Preserved Model Based Approaches for Generating Open Travel Behavioural Data

    Location-aware technologies and smartphones are growing fast in usage and adoption as a medium for requesting and delivering services in daily activities. However, the increasing usage of these technologies has created new challenges that need to be addressed. Privacy protection and the need for disaggregate mobility data for transportation modelling need to be balanced for applications and academic research. This dissertation focuses on developing modern privacy mechanisms that seek to satisfy requirements on privacy and data utility for fine-grained travel behavioural modelling applications using large-scale mobility data. To accomplish this, we review the challenges that must be solved, and the opportunities available, in order to harness the full potential of "Big Transportation Data". We also perform a quantitative evaluation of the degree of privacy provided by popular location anonymization techniques when applied to the sensitive location data (i.e. homes, offices) of a travel survey. As a step towards resolving the trade-off between privacy and utility, we develop a differentially private generative model for simultaneously synthesizing both socio-economic attributes and activity diary sequences. Adversarial attack models are proposed and tested to evaluate the effectiveness of the proposed system against privacy attacks. The results show that datasets from the developed privacy-enhancing system can be used for travel behavioural modelling with satisfactory results while ensuring an acceptable level of privacy.

    A personal route prediction system based on trajectory data mining

    This paper presents a system in which the personal route of a user is predicted using a probabilistic model built from historical trajectory data. Route patterns are extracted from personal trajectory data using a novel mining algorithm, Continuous Route Pattern Mining (CRPM), which can tolerate different kinds of disturbance in trajectory data. Furthermore, a client–server architecture is employed which has the dual purpose of guaranteeing the privacy of personal data and greatly reducing the computational load on mobile devices. An evaluation using a corpus of trajectory data from 17 people demonstrates that CRPM can extract longer route patterns than current methods. Moreover, the average accuracy of one-step prediction of our system is greater than 71%, and the average Levenshtein distance of continuous route prediction of our system is about 30% shorter than that of the Markov-model-based method.
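
The evaluation above scores continuous route prediction by Levenshtein distance. As a rough illustration of that metric, the sketch below computes the edit distance between a predicted and an actual route. It assumes routes are encoded as sequences of road-segment identifiers, which the abstract does not specify, and the segment names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences, with unit cost for each
    insertion, deletion, and substitution (standard dynamic programme)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,              # delete x from a
                curr[j - 1] + 1,          # insert y into a
                prev[j - 1] + (x != y),   # substitute x by y, or match
            ))
        prev = curr
    return prev[-1]

predicted = ["s1", "s2", "s4", "s5"]        # hypothetical predicted route
actual = ["s1", "s2", "s3", "s4", "s5"]     # hypothetical travelled route
distance = levenshtein(predicted, actual)   # one missing segment
```

A smaller distance means the predicted route needs fewer edits to match the travelled one, so lower values indicate better predictions.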

    Active Re-identification Attacks on Periodically Released Dynamic Social Graphs

    Active re-identification attacks pose a serious threat to privacy-preserving social graph publication. Active attackers create fake accounts to build structural patterns in social graphs which can be used to re-identify legitimate users on published anonymised graphs, even without additional background knowledge. So far, this type of attack has only been studied in the scenario where the inherently dynamic social graph is published once. In this paper, we present the first active re-identification attack in the more realistic scenario where a dynamic social graph is periodically published. The new attack leverages tempo-structural patterns to strengthen the adversary. Through a comprehensive set of experiments on real-life and synthetic dynamic social graphs, we show that our new attack substantially outperforms the most effective static active attack in the literature, more than doubling the success probability of re-identification and improving efficiency almost tenfold. Moreover, unlike the static attack, our new attack is able to remain at the same level of effectiveness and efficiency as the publication process advances. We conduct a study on the factors that may thwart our new attack, which can help design graph anonymising methods with a better balance between privacy and utility.