463 research outputs found

    Trajectory and Policy Aware Sender Anonymity in Location Based Services

    We consider Location-based Service (LBS) settings, where an LBS provider logs the requests sent by mobile device users over a period of time and later wants to publish/share these logs. Log sharing can be extremely valuable for advertising, data mining research and network management, but it poses a serious threat to the privacy of LBS users. Sender anonymity solutions prevent a malicious attacker from inferring the interests of LBS users by associating them with their service requests after gaining access to the anonymized logs. With the fast-increasing adoption of smartphones and the concern that historic user trajectories are becoming more accessible, it becomes necessary for any sender anonymity solution to protect against attackers that are trajectory-aware (i.e., have access to historic user trajectories) as well as policy-aware (i.e., they know the log anonymization policy). We call such attackers TP-aware. This paper introduces a first privacy guarantee against TP-aware attackers, called TP-aware sender k-anonymity. It turns out that there are many possible TP-aware anonymizations for the same LBS log, each with a different utility to the consumer of the anonymized log. The problem of finding the optimal TP-aware anonymization is investigated. We show that trajectory-awareness renders the problem computationally harder than the trajectory-unaware variants found in the literature (NP-complete in the size of the log, versus PTIME). We describe a PTIME l-approximation algorithm for trajectories of length l and empirically show that it scales to large LBS logs (up to 2 million users).

    Generating realistic scaled complex networks

    Research on generative models is a central project in the emerging field of network science, and it studies how statistical patterns found in real networks could be generated by formal rules. Output from these generative models is then the basis for designing and evaluating computational methods on networks, and for verification and simulation studies. During the last two decades, a variety of models has been proposed with an ultimate goal of achieving comprehensive realism for the generated networks. In this study, we (a) introduce a new generator, termed ReCoN; (b) explore how ReCoN and some existing models can be fitted to an original network to produce a structurally similar replica; (c) use ReCoN to produce networks much larger than the original exemplar; and finally (d) discuss open problems and promising research directions. In a comparative experimental study, we find that ReCoN is often superior to many other state-of-the-art network generation methods. We argue that ReCoN is a scalable and effective tool for modeling a given network while preserving important properties at both micro- and macroscopic scales, and for scaling the exemplar data by orders of magnitude in size.
    Comment: 26 pages, 13 figures, extended version, a preliminary version of the paper was presented at the 5th International Workshop on Complex Networks and their Application

    Towards trajectory anonymization: a generalization-based approach

    Trajectory datasets are becoming popular due to the massive usage of GPS and location-based services. In this paper, we address privacy issues regarding the identification of individuals in static trajectory datasets. We first adapt the notion of k-anonymity to trajectories and propose a novel generalization-based approach for anonymization of trajectories. We further show that releasing anonymized trajectories may still have some privacy leaks. Therefore we propose a randomization-based reconstruction algorithm for releasing anonymized trajectory data and also present how the underlying techniques can be adapted to other anonymity standards. The experimental results on real and synthetic trajectory datasets show the effectiveness of the proposed techniques.
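As a rough illustration of what k-anonymity means for trajectories, the sketch below checks the standard condition on an already generalized dataset: every generalized trajectory must be shared by at least k records. This is not the generalization algorithm from the paper, which the abstract does not detail. The function name `is_k_anonymous` and the representation of a trajectory as a sequence of region labels are assumptions made only for this example.

```python
from collections import Counter


def is_k_anonymous(generalized_trajectories, k):
    """Return True if every generalized trajectory occurs at least k times.

    Assumption: each trajectory is a sequence of hashable region labels
    (e.g. grid-cell ids) produced by some earlier generalization step.
    """
    # Group records that became indistinguishable after generalization.
    groups = Counter(tuple(t) for t in generalized_trajectories)
    # k-anonymity holds when no group is smaller than k.
    return all(size >= k for size in groups.values())


# Hypothetical data: three users, two of whom share a generalized trajectory.
released = [["A", "B"], ["A", "B"], ["C", "D"]]
print(is_k_anonymous(released, 2))
```

In this hypothetical dataset the trajectory through regions C and D belongs to a single record, so the check fails for k = 2 until that record is generalized further or suppressed.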

    A data recipient centered de-identification method to retain statistical attributes

    Privacy has always been a great concern of patients and medical service providers. As a result of the recent advances in information technology and the government’s push for the use of Electronic Health Record (EHR) systems, a large amount of medical data is collected and stored electronically. This data needs to be made available for analysis but at the same time patient privacy has to be protected through de-identification. Although biomedical researchers often describe their research plans when they request anonymized data, most existing anonymization methods do not use this information when de-identifying the data. As a result, the anonymized data may not be useful for the planned research project. This paper proposes a data recipient centered approach to tailor the de-identification method based on input from the recipient of the data. We demonstrate our approach through an anonymization project for biomedical researchers with specific goals to improve the utility of the anonymized data for statistical models used for their research project. The selected algorithm improves a privacy protection method called Condensation by Aggarwal et al. Our methods were tested and validated on real cancer surveillance data provided by the Kentucky Cancer Registry.

    You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information

    Metadata are associated with most of the information we produce in our daily interactions and communication in the digital world. Yet, surprisingly, metadata are often still categorized as non-sensitive. Indeed, in the past, researchers and practitioners have mainly focused on the problem of the identification of a user from the content of a message. In this paper, we use Twitter as a case study to quantify the uniqueness of the association between metadata and user identity and to understand the effectiveness of potential obfuscation strategies. More specifically, we analyze atomic fields in the metadata and systematically combine them in an effort to classify new tweets as belonging to an account using different machine learning algorithms of increasing complexity. We demonstrate that through the application of a supervised learning algorithm, we are able to identify any user in a group of 10,000 with approximately 96.7% accuracy. Moreover, if we broaden the scope of our search and consider the 10 most likely candidates we increase the accuracy of the model to 99.22%. We also found that data obfuscation is hard and ineffective for this type of data: even after perturbing 60% of the training data, it is still possible to classify users with an accuracy higher than 95%. These results have strong implications in terms of the design of metadata obfuscation strategies, for example for data set release, not only for Twitter, but, more generally, for most social media platforms.
    Comment: 11 pages, 13 figures. Published in the Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM 2018). June 2018. Stanford, CA, US

    Location cloaking for location privacy protection and location safety protection

    Many applications today rely on location information, yet disclosing such information can present heightened privacy and safety risks. A person's whereabouts, for example, may reveal sensitive private information such as health condition and lifestyle. Location information also has the potential to allow an adversary to physically locate and destroy a subject, which is a particular concern in digital battlefields. This research investigates two problems. The first one is location privacy protection in location-based services. Our goal is to provide a desired level of guarantee that the location data collected by the service providers cannot be correlated with restricted spaces such as home and office to derive who's where at what time. We propose 1) leveraging historical location samples for location depersonalization and 2) allowing a user to express her location privacy requirement by identifying a spatial region. With these two ideas in place, we develop a suite of techniques for location-privacy aware uses of location-based services, which can be either sporadic or continuous. An experimental system has been implemented with these techniques. The second problem investigated in this research is location safety protection in ad hoc networks. Unlike location privacy intrusion, the adversary here is not interested in finding the individual identities of the nodes in a spatial region, but simply wants to locate and destroy them. We define the safety level of a spatial region as the inverse of its node density and develop a suite of techniques for location safety-aware cloaking and routing. These schemes allow nodes to disclose their location as accurately as possible, while preventing such information from being used to identify any region with a safety level lower than a required threshold. The performance of the proposed techniques is evaluated through analysis and simulation.
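The safety-level definition in this abstract (the inverse of a region's node density) can be made concrete with a small sketch. It assumes a uniform square grid as the set of spatial regions, which the abstract does not specify; the function names `safety_levels` and `unsafe_cells` and the parameters `cell_size` and `threshold` are hypothetical and are not taken from the work itself.

```python
from collections import Counter


def safety_levels(node_positions, cell_size):
    """Map each occupied grid cell to its safety level.

    Safety level = 1 / density = cell area / number of nodes in the cell.
    Assumption: regions are square cells of side `cell_size`.
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in node_positions
    )
    area = cell_size * cell_size
    return {cell: area / n for cell, n in counts.items()}


def unsafe_cells(node_positions, cell_size, threshold):
    """Return the cells whose safety level falls below the required threshold."""
    levels = safety_levels(node_positions, cell_size)
    return sorted(cell for cell, level in levels.items() if level < threshold)


# Hypothetical node positions: three nodes crowd one cell, one node sits alone.
nodes = [(1, 1), (2, 2), (3, 3), (15, 15)]
print(unsafe_cells(nodes, cell_size=10, threshold=50))
```

A crowded cell has high density and therefore a low safety level, so exact location disclosure there is what a safety-aware scheme would need to coarsen.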

    In-Memory Trajectory Indexing for On-The-Fly Travel-Time Estimation