39 research outputs found

    Location Anonymization With Considering Errors and Existence Probability

    Mobile devices that can sense their location using GPS or Wi-Fi have become extremely popular. However, many users hesitate to provide accurate location information to unreliable third parties if doing so means that their identities or sensitive attribute values will be disclosed. Many anonymization approaches, such as k-anonymity, have been proposed to tackle this issue. Existing studies of k-anonymity usually anonymize each user's location so that the anonymized area contains k or more users. These studies, however, do not consider location errors or the probability that each user actually exists in the anonymized area. As a result, a specific user might be identified by untrusted third parties. We propose novel privacy and utility metrics that can treat the location, together with an efficient algorithm to anonymize the information associated with users' locations. This is the first work that anonymizes location while considering location errors and the probability that each user is actually present in the anonymized area. By means of simulations, we show that our proposed method can reduce the risk of the user's attributes being identified while maintaining the utility of the anonymized data.
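
    To make the role of location error concrete, the following is a minimal sketch, not the metric proposed in the paper. It assumes each reported position carries independent Gaussian error on each axis, and it replaces the plain head count of k-anonymity with the expected number of users truly inside a rectangular area. The function names, the Gaussian error model, and the "expected count >= k" rule are all illustrative assumptions.

```python
import math


def normal_cdf(x, mean, sigma):
    """CDF of a 1-D normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def presence_probability(reported, sigma, area):
    """Probability that a user's true position lies inside `area`.

    Assumption: true position = reported (x, y) plus independent Gaussian
    noise with standard deviation `sigma` on each axis.
    `area` is an axis-aligned rectangle (xmin, ymin, xmax, ymax).
    """
    x, y = reported
    xmin, ymin, xmax, ymax = area
    px = normal_cdf(xmax, x, sigma) - normal_cdf(xmin, x, sigma)
    py = normal_cdf(ymax, y, sigma) - normal_cdf(ymin, y, sigma)
    return px * py


def expected_users(users, area):
    """Expected number of users truly inside `area`.

    `users` is a list of ((x, y), sigma) pairs.
    """
    return sum(presence_probability(pos, sigma, area) for pos, sigma in users)


def satisfies_expected_k(users, area, k):
    """Hypothetical rule: accept the area only if the expected count reaches k."""
    return expected_users(users, area) >= k


# Three reported positions fall inside the area, but the third user sits
# near the edge with a large error, so the expected count is below 3.
users = [((5.0, 5.0), 1.0), ((5.5, 4.5), 1.0), ((9.9, 5.0), 3.0)]
area = (0.0, 0.0, 10.0, 10.0)
print(satisfies_expected_k(users, area, 3))
print(satisfies_expected_k(users, area, 2))
```

    In this example a naive count would accept the area for k = 3, while the error-aware count would not, which is the kind of gap the abstract describes.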

    Deliverable Π.1.2: Techniques for information retrieval from non-traditional data sources

    This deliverable (Π.1.2) contains the results of Sub-action ΥΔ1.2, which concerns the development of techniques for retrieving information from non-traditional data sources. The challenges arising in this sub-action include handling data heterogeneity in all its forms (syntactic, structural, semantic) for efficient information retrieval, and importing and integrating the information into the data ecosystem so that efficient and semantically rich search is supported. In Section 1 we present the general context of the problem. In Section 2 we describe mechanisms for enriching trajectory data of moving objects using ontologies and linked data with well-defined and widely accepted semantics that are already available on the web. In Section 3 we develop techniques for retrieving information from multidimensional data sources that preserve the semantics of the data and the correlations among them, while also protecting the privacy of the individuals involved. In Section 4 we develop information retrieval techniques that reveal properties and characteristics in the structure and semantics of medical data and graphs. We summarize our results in Section 5.

    Local Suppression and Splitting Techniques for Privacy Preserving Publication of Trajectories

    postprin

    Apriori-based algorithms for k^m-anonymizing trajectory data

    The proliferation of GPS-enabled devices (e.g., smartphones and tablets) and location-based social networks has resulted in the abundance of trajectory data. The publication of such data opens up new directions in analyzing, studying and understanding human behavior. However, it should be performed in a privacy-preserving way, because the identities of individuals, whose movement is recorded in trajectories, can be disclosed even after removing identifying information. Existing trajectory data anonymization approaches offer privacy but at a high data utility cost, since they either do not produce truthful data (an important requirement of several applications), or are limited in their privacy specification component. In this work, we propose a novel approach that overcomes these shortcomings by adapting k^m-anonymity to trajectory data. To realize our approach, we develop three efficient and effective anonymization algorithms that are based on the apriori principle. These algorithms aim at preserving different data characteristics, including location distance and semantic similarity, as well as user-specified utility requirements, which must be satisfied to ensure that the released data can be meaningfully analyzed. Our extensive experiments using synthetic and real datasets verify that the proposed algorithms are efficient and effective at preserving data utility.
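
    As an illustration of the privacy condition, here is a brute-force sketch of a k^m-anonymity check for trajectories. It assumes the common reading that every order-preserving subtrajectory of at most m locations must be supported by at least k trajectories. It is a checker only, not one of the paper's anonymization algorithms, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def violating_subtrajectories(trajectories, k, m):
    """Return subtrajectories of length <= m supported by fewer than k trajectories.

    A subtrajectory is an order-preserving (not necessarily contiguous)
    subsequence; each trajectory supports a given subtrajectory at most once.
    """
    support = Counter()
    for traj in trajectories:
        seen = set()
        for size in range(1, m + 1):
            # combinations() preserves the original order of the locations
            seen.update(combinations(traj, size))
        support.update(seen)
    return {sub: n for sub, n in support.items() if n < k}


def is_km_anonymous(trajectories, k, m):
    """True when no subtrajectory of length <= m has support below k."""
    return not violating_subtrajectories(trajectories, k, m)


trajectories = [("a", "b", "c"), ("a", "b"), ("a", "c"), ("b", "c")]
print(is_km_anonymous(trajectories, k=2, m=2))
print(violating_subtrajectories(trajectories, k=2, m=3))
```

    The enumeration grows quickly with m. An apriori-style procedure would instead examine subtrajectories in increasing size, relying on the fact that any extension of a low-support subtrajectory also has low support.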

    Privacy preservation in social media environments using big data

    With the pervasive use of mobile devices, social media, home assistants, and smart devices, the idea of individual privacy is fading. More than ever, the public is giving up personal information in order to take advantage of what are now considered everyday conveniences, while ignoring the consequences. Even seemingly harmless information is making headlines for its unauthorized use (18). Among this data is user trajectory data, which can be described as a user's location information over a time period (6). This data is generated whenever users access their devices to record their location, query the location of a point of interest, query directions to get to a location, request services to come to their location, and in many other applications. This data could be used by a malicious adversary to track a user's movements, location, and daily patterns, and to learn details personal to the user. While the best course of action would be to hide this information entirely, this data can be used for many beneficial purposes as well. Emergency vehicles could be more efficiently routed based on trajectory patterns, businesses could make intelligent marketing or building decisions, and users themselves could benefit by taking advantage of more conveniences. There are several challenges to publishing this data while also preserving user privacy. For example, while location data has good utility, users expect their data to be private. For real-world applications, users generate many terabytes of data every day. To process this volume of data for later use and anonymize it in order to hide individual user identities, this thesis presents an efficient algorithm that reduces the processing time for anonymization from days, as seen in (20), to a matter of minutes or hours. We cannot focus just on location data, however. Social media has a great many uses, one of which is the sharing of images. Privacy cannot stop with location, but must extend to other data as well. This thesis also addresses the issue of image privacy, as images can often be even more sensitive than location --Abstract, page iv

    Deliverable Π.1.1: Modeling of non-traditional data

    This Deliverable Π.1.1 contains the results of Sub-action ΥΔ1.1: Modeling of non-traditional data. Section 1 presents the general context of the problem and an overview of the areas studied. Section 2 analyzes the modeling needs of different domains, with emphasis on sensitive data and the anonymization methods that were developed. Section 3 focuses on spatio-temporal data, both for managing uncertainty and for interlinking different sources and enriching them semantically.

    A Survey and Experimental Study on Privacy-Preserving Trajectory Data Publishing

    Trajectory data has become ubiquitous and can benefit various real-world applications such as traffic management and location-based services. However, trajectories may disclose highly sensitive information about an individual, including mobility patterns, personal profiles and gazetteers, social relationships, etc., making it indispensable to consider privacy protection when releasing trajectory data. Ensuring privacy on trajectories demands more than hiding single locations, since trajectories are intrinsically sparse and high-dimensional, and require protecting multi-scale correlations. To this end, extensive research has been conducted to design effective techniques for privacy-preserving trajectory data publishing. Furthermore, protecting privacy requires carefully balancing two metrics: privacy and utility. In other words, it needs to protect as much privacy as possible while guaranteeing the usefulness of the released trajectories for data analysis. In this survey, we provide a comprehensive study and a systematic summarization of existing protection models and of the privacy and utility metrics for trajectories developed in the literature. We also conduct extensive experiments on two real-life public trajectory datasets to evaluate the performance of several representative privacy protection models, demonstrate the trade-off between privacy and utility, and guide the choice of the right privacy model for trajectory publishing given certain privacy and utility desiderata.

    DPT: differentially private trajectory synthesis using hierarchical reference systems

    GPS-enabled devices are now ubiquitous, from airplanes and cars to smartphones and wearable technology. This has resulted in a wealth of data about the movements of individuals and populations, which can be analyzed for useful information to aid in city and traffic planning, disaster preparedness and so on. However, the places that people go can disclose extremely sensitive information about them, and thus their use needs to be filtered through privacy preserving mechanisms. This turns out to be a highly challenging task: raw trajectories are highly detailed, and typically no pair is alike. Previous attempts fail either to provide adequate privacy protection, or to remain sufficiently faithful to the original behavior. This paper presents DPT, a system to synthesize mobility data based on raw GPS trajectories of individuals while ensuring strong privacy protection in the form of ε-differential privacy. DPT makes a number of novel modeling and algorithmic contributions including (i) discretization of raw trajectories using hierarchical reference systems (at multiple resolutions) to capture individual movements at differing speeds, (ii) adaptive mechanisms to select a small set of reference systems and construct prefix tree counts privately, and (iii) use of direction-weighted sampling for improved utility. While there have been prior attempts to solve the subproblems required to generate synthetic trajectories, to the best of our knowledge, ours is the first system that provides an end-to-end solution. We show the efficacy of our synthetic trajectory generation system using an extensive empirical evaluation
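
    The idea of privately constructed prefix tree counts can be pictured with a small sketch of Laplace noise calibration. This is not the DPT system: it assumes trajectories are already discretized into cell labels, uses a single reference system, and adds noise only to prefixes that actually occur. A complete mechanism would also have to account for prefixes with zero count, so the sketch on its own should not be read as giving a formal ε-differential privacy guarantee. All names are hypothetical.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_prefix_counts(trajectories, depth, epsilon, rng):
    """Count prefixes of length 1..depth and perturb each count.

    One trajectory contributes to at most `depth` prefixes (one per level),
    so adding or removing a trajectory changes the counts by at most `depth`
    in total. The Laplace scale is therefore set to depth / epsilon.
    Simplification: only observed prefixes are perturbed (see the lead-in).
    """
    counts = Counter()
    for traj in trajectories:
        for length in range(1, min(depth, len(traj)) + 1):
            counts[tuple(traj[:length])] += 1
    scale = depth / epsilon
    return {prefix: c + laplace_noise(scale, rng) for prefix, c in counts.items()}


rng = random.Random(0)
cells = [("A", "B"), ("A", "C"), ("A", "B")]
for prefix, value in sorted(noisy_prefix_counts(cells, depth=2, epsilon=1.0, rng=rng).items()):
    print(prefix, round(value, 2))
```

    Smaller values of epsilon give a larger noise scale, which is the privacy and utility trade-off in its simplest form.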

    Probabilistic k^m-anonymity: Efficient Anonymization of Large Set-Valued Datasets

    A set-valued dataset contains different types of items/values per individual, for example, visited locations, purchased goods, watched movies, or search queries. As it is relatively easy to re-identify individuals in such datasets, their release poses significant privacy threats. Hence, organizations aiming to share such datasets must adhere to personal data regulations. To be exempt from these regulations and also to benefit from sharing, these datasets should be anonymized before their release. In this paper, we revisit the problem of anonymizing set-valued data. We argue that anonymization techniques targeting the traditional k^m-anonymity model, which limits the adversarial background knowledge to at most m items per individual, are impractical for large real-world datasets. Hence, we propose a probabilistic relaxation of k^m-anonymity and present an anonymization technique to achieve it. This relaxation also improves the utility of the anonymized data. We also demonstrate the effectiveness of our scalable anonymization technique on a real-world location dataset consisting of more than 4 million subscribers of a large European telecom operator. We believe that our technique can be very appealing for practitioners willing to share such large datasets.
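
    One way to picture a probabilistic relaxation is to stop enumerating every combination of m items and instead sample the adversary's background knowledge. The sketch below is an illustrative reading of the abstract, not the paper's formal definition or algorithm: it estimates how often m items drawn at random from a random record are shared by fewer than k records. The function name and the sampling scheme are assumptions.

```python
import random


def estimate_violation_rate(records, k, m, samples, rng):
    """Estimate the fraction of sampled background knowledge with support < k.

    Each sample picks a random record and up to m of its items, then counts
    how many records contain all of those items.
    """
    violations = 0
    for _ in range(samples):
        record = rng.choice(records)
        size = min(m, len(record))
        # sorted() gives rng.sample a sequence and a stable item order
        known = set(rng.sample(sorted(record), size))
        support = sum(1 for r in records if known <= r)
        if support < k:
            violations += 1
    return violations / samples


rng = random.Random(0)
shared = [{"x", "y"}, {"x", "y"}, {"x", "y"}]
unique = [{"a"}, {"b"}, {"c"}]
print(estimate_violation_rate(shared, k=3, m=2, samples=100, rng=rng))
print(estimate_violation_rate(unique, k=2, m=1, samples=100, rng=rng))
```

    Each sample costs one pass over the records, so the cost is set by the number of samples and no longer by the number of possible item combinations, which hints at why a relaxation of this kind may scale to large datasets.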