
    Adaptive Voronoi Masking: A Method to Protect Confidential Discrete Spatial Data

    Where you go is who you are -- A study on machine learning based semantic privacy attacks

    Concerns about data privacy are omnipresent, given the increasing use of digital applications and their underlying business models, which include selling user data. Location data are particularly sensitive, since they allow us to infer users' activity patterns and interests, e.g., by categorizing visited locations based on nearby points of interest (POIs). On top of that, machine learning provides powerful new tools to interpret big data. In light of these considerations, we raise the following question: what is the actual risk that realistic, machine-learning-based privacy attacks can obtain meaningful semantic information from raw location data, subject to inaccuracies in the data? In response, we present a systematic analysis of two attack scenarios, namely location categorization and user profiling. Experiments on the Foursquare dataset and tracking data demonstrate the potential for abuse of high-quality spatial information, leading to a significant privacy loss even with location inaccuracy of up to 200 m. With location obfuscation of more than 1 km, spatial information hardly adds any value, but a high privacy risk from temporal information alone remains. The availability of public context data such as POIs plays a key role in inference based on spatial information. Our findings point out the risks of ever-growing databases of tracking data and spatial context data, which policymakers should consider for privacy regulations, and which could guide individuals in their personal location protection measures.
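    The location-categorization attack discussed above can be illustrated with a minimal sketch. This is a simplified stand-in, not the paper's method: the study uses machine-learning classifiers, whereas this sketch uses a crude nearest-POI rule, and the POI list and coordinates below are hypothetical.

```python
import math
import random

def obfuscate(point, radius_m):
    """Displace an (x, y) metre coordinate uniformly within a disc of radius_m."""
    x, y = point
    r = radius_m * math.sqrt(random.random())   # sqrt => uniform over the disc area
    theta = random.uniform(0, 2 * math.pi)
    return (x + r * math.cos(theta), y + r * math.sin(theta))

def nearest_poi_category(point, pois):
    """A crude semantic attack: label a visit with the category of the nearest POI."""
    return min(pois, key=lambda p: math.dist(point, p[0]))[1]

# Hypothetical POI database: ((x, y) in metres, category)
pois = [((0, 0), "bar"), ((500, 0), "gym"), ((0, 500), "church")]

random.seed(42)
true_visit = (10, 20)  # the user was actually near the bar
weak = nearest_poi_category(obfuscate(true_visit, 200), pois)     # mild obfuscation
strong = nearest_poi_category(obfuscate(true_visit, 1000), pois)  # strong obfuscation
```

    With mild noise the nearest-POI label tends to survive, while displacement on the order of a kilometre makes the inferred category essentially arbitrary, mirroring the abstract's 200 m vs. 1 km finding.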

    Computers, Environment and Urban Systems / Adaptive areal elimination (AAE): a transparent way of disclosing protected spatial datasets

    Geographical masking is the conventional solution for protecting the privacy of individuals involved in confidential spatial point datasets. The masking process displaces confidential locations to protect individual privacy while maintaining a fine level of spatial resolution. The adaptive form of this process aims to further minimize the displacement error by taking the underlying population density into account. We describe an alternative adaptive geomasking method, referred to as Adaptive Areal Elimination (AAE). AAE creates areas of a minimum K-anonymity, within which original points are either randomly perturbed or aggregated to the areas' median centers. In addition to the masked points, the K-anonymized areas can be safely disclosed as well without increasing the risk of re-identification. Using a burglary dataset from Vienna, AAE is compared with an existing adaptive geographical mask, the donut mask. The masking methods are evaluated on preserving a predefined K-anonymity and the spatial characteristics of the original points. The spatial characteristics are assessed with four measures of spatial error: displaced distance, correlation coefficient of density surfaces, hotspots' divergence, and clusters' specificity. Masked points from the point aggregation of AAE have the highest spatial error in all the measures but the displaced distance. In contrast, masked points from the donut mask are displaced the least, preserve the original spatial clusters better, and have the highest clusters' specificity and correlation coefficient of density surfaces. However, when the donut mask is adapted to achieve an actual K-anonymity, the random perturbation of AAE introduces less spatial error than the donut mask on all the measures of spatial error.
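    The donut mask compared in this abstract can be sketched as follows. This is a simplified, fixed-radius illustration: the adaptive versions discussed in the paper derive the displacement radii from the underlying population density to guarantee K-anonymity, which is not modelled here.

```python
import math
import random

def donut_mask(point, r_min, r_max):
    """Displace a point to a uniformly random location in the annulus
    (the 'donut') between r_min and r_max around the original point.
    The inner radius enforces a minimum displacement, unlike a plain circular mask."""
    x, y = point
    # Sampling r^2 uniformly in [r_min^2, r_max^2] gives a uniform density over the annulus area
    r = math.sqrt(random.uniform(r_min ** 2, r_max ** 2))
    theta = random.uniform(0, 2 * math.pi)
    return (x + r * math.cos(theta), y + r * math.sin(theta))

masked = donut_mask((100.0, 200.0), r_min=50, r_max=150)
```

    The guaranteed minimum displacement is what distinguishes the donut mask; AAE's random perturbation instead constrains the displaced point to stay inside a precomputed K-anonymous area.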

    Privacy Threats and Protection Recommendations for the Use of Geosocial Network Data in Research

    Inference attacks and protection measures are two sides of the same coin. Although the former aim to reveal information while the latter aim to hide it, they both increase awareness regarding the risks and threats from social media apps. On the one hand, inference attack studies explore the types of personal information that can be revealed and the methods used to extract it. An additional risk is that geosocial media data are collected massively for research purposes, and the processing or publication of these data may further compromise individual privacy. On the other hand, consistent and increasing research on location protection measures promises solutions that mitigate disclosure risks. In this paper, we examine recent research efforts on the spectrum of privacy issues related to geosocial network data and identify the contributions and limitations of these research efforts. Furthermore, we provide protection recommendations to researchers who share, anonymise, and store social media data or publish scientific results.

    ISPRS International Journal of Geo-Information / Defining a Threshold Value for Maximum Spatial Information Loss of Masked Geo-Data

    Geographical masks are a group of location protection methods for the dissemination and publication of confidential and sensitive information, such as health- and crime-related geo-referenced data. The use of such masks ensures that privacy is protected for the individuals involved in the datasets. Nevertheless, the protection process introduces spatial error to the masked dataset. This study quantifies the spatial error of masked datasets using two approaches. First, a perceptual survey was employed in which participants ranked the similarity of a diverse sample of masked and original maps. Second, a spatial statistical analysis was performed that provided quantitative results for the same pairs of maps. Spatial statistical similarity is calculated with three divergence indices that employ different spatial clustering methods. All indices are significantly correlated with the perceptual similarity. Finally, the results of the spatial analysis are used as the explanatory variable to estimate the perceptual similarity. Three prediction models are created that indicate upper boundaries for the spatial statistical results, beyond which the masked data are perceived as different from the original data. The results of the study aim to help potential “maskers” to quantify and evaluate the error of confidential masked visualizations.
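    The final step described above, fitting a prediction model that maps a spatial-statistical divergence score to perceived similarity and reading off a threshold, can be sketched with a plain least-squares fit. All numbers below are hypothetical illustrative values, not the study's survey results, and the study's three actual prediction models are not reproduced here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def threshold_for(a, b, min_similarity):
    """Largest divergence value whose predicted similarity still meets the target:
    the 'maximum spatial information loss' threshold idea (assumes b < 0)."""
    return (min_similarity - a) / b

# Hypothetical data: divergence score vs. mean perceived similarity (1-5 scale)
div = [0.1, 0.2, 0.4, 0.6, 0.8]
sim = [4.8, 4.5, 3.9, 3.2, 2.6]
a, b = fit_line(div, sim)
cutoff = threshold_for(a, b, min_similarity=4.0)  # masked maps above this diverge too much
```

    Any divergence score below `cutoff` would be predicted to leave the masked map perceptually similar to the original, which is the sense in which the paper's upper boundaries are meant to be used.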

    Towards geoprivacy guidelines for spatial data

    This paper proposes an approach towards practical privacy guidelines for the different stages of a research effort that collects and/or uses “sensitive” spatial data. Specifically, we focus on: a) initial tasks prior to starting a survey; b) storing, anonymising, and assessing datasets; and c) actions to eliminate disclosure from published data and deliverables, or when datasets are shared with third parties.

    Transactions in GIS / Spatial information divergence : Using Global and Local Indices to compare geographical masks applied to crime data

    Advances in Geographic Information Science (GISc) and the increasing availability of location data have facilitated the dissemination of crime data and the abundance of crime mapping websites. However, data holders acknowledge that when releasing sensitive crime data there is a risk of compromising the victims' privacy. Hence, protection methodologies are primarily applied to the data to ensure that individual privacy is not violated. This article addresses one group of location protection methodologies, namely geographical masks that are applicable to crime data representations. The purpose is to identify which mask is the most appropriate for crime incident visualizations. A global divergence index (GDi) and a local divergence index (LDi) are developed to compare the effects that these masks have on the original crime point pattern. The indices calculate how dissimilar the spatial information of the masked data is from the spatial information of the original data with regard to the information obtained via spatial crime analysis. The results of the analysis show that the variable radius mask and the donut geomask should be primarily used for crime representations, as they produce less spatial information divergence from the original crime point pattern than the alternative local random rotation mask and circular mask.
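    A global divergence index of the kind described can be illustrated with a minimal sketch. The paper's actual GDi and LDi definitions are not reproduced in the abstract, so this uses a generic stand-in: total variation distance between the gridded density distributions of the original and masked point patterns (0 = identical, 1 = completely disjoint).

```python
from collections import Counter

def grid_counts(points, cell_size):
    """Aggregate (x, y) points into square grid cells."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)

def global_divergence(original, masked, cell_size=100.0):
    """Total variation distance between the normalized cell-count
    distributions of two point patterns."""
    a, b = grid_counts(original, cell_size), grid_counts(masked, cell_size)
    na, nb = sum(a.values()), sum(b.values())
    return 0.5 * sum(abs(a[c] / na - b[c] / nb) for c in set(a) | set(b))

pts = [(10, 10), (20, 15), (210, 220)]
same = global_divergence(pts, pts)                            # identical patterns -> 0
shifted = global_divergence(pts, [(x + 300, y) for x, y in pts])  # disjoint cells -> 1
```

    A mask that keeps this score low preserves the hotspot structure that spatial crime analysis depends on, which is the intuition behind preferring the variable radius and donut masks.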

    Geosocial Media Data as Predictors in a GWR Application to Forecast Crime Hotspots (Short Paper)

    In this paper we forecast hotspots of street crime in Portland, Oregon. Our approach uses geosocial media posts, which define the predictors in geographically weighted regression (GWR) models. We use two predictors, both derived from Twitter data: the population at risk of becoming a victim of street crime, and crime-related tweets. These predictors were used in GWR to create models that depict future street crime hotspots. The predicted hotspots enclosed more than 23% of future street crimes in 1% of the study area and also outperformed the prediction efficiency of a baseline approach. Future work will focus on optimizing the prediction parameters and testing the applicability of this approach to other mobile crime types.
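    The core of GWR is fitting a separate, distance-weighted regression at each target location. The sketch below is a generic one-predictor GWR with a Gaussian kernel, not the authors' implementation, and the toy coordinates and values are hypothetical.

```python
import math

def gwr_predict(train, target_loc, target_x, bandwidth):
    """Locally weighted least-squares fit of y = b0 + b1*x around target_loc.
    Observations closer to target_loc get larger Gaussian kernel weights,
    so the fitted coefficients vary over space, which is the point of GWR.
    train: list of ((easting, northing), predictor_value, response_value)."""
    w = [math.exp(-0.5 * (math.dist(loc, target_loc) / bandwidth) ** 2)
         for loc, _, _ in train]
    sw = sum(w)
    mx = sum(wi * x for wi, (_, x, _) in zip(w, train)) / sw
    my = sum(wi * y for wi, (_, _, y) in zip(w, train)) / sw
    b1 = (sum(wi * (x - mx) * (y - my) for wi, (_, x, y) in zip(w, train))
          / sum(wi * (x - mx) ** 2 for wi, (_, x, _) in zip(w, train)))
    b0 = my - b1 * mx
    return b0 + b1 * target_x

# Toy grid where the response is exactly 2 * predictor + 1 everywhere
train = [((0, 0), 1.0, 3.0), ((1, 0), 2.0, 5.0),
         ((0, 1), 3.0, 7.0), ((1, 1), 4.0, 9.0)]
pred = gwr_predict(train, target_loc=(0.5, 0.5), target_x=2.5, bandwidth=1.0)
```

    In the paper's setting, the predictor values would be the Twitter-derived population at risk and crime-related tweet counts per cell, and the response the observed street crime count.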