2,539 research outputs found

    PRUDEnce: A system for assessing privacy risk vs utility in data sharing ecosystems

    Get PDF
    Data describing human activities are an important source of knowledge useful for understanding individual and collective behavior and for developing a wide range of user services. Unfortunately, this kind of data is sensitive, because people’s whereabouts may allow re-identification of individuals in a de-identified database. Therefore, Data Providers, before sharing those data, must apply any sort of anonymization to lower the privacy risks, but they must be aware and capable of controlling also the data quality, since these two factors are often a trade-off. In this paper we propose PRUDEnce (Privacy Risk versus Utility in Data sharing Ecosystems), a system enabling a privacy-aware ecosystem for sharing personal data. It is based on a methodology for assessing both the empirical (not theoretical) privacy risk associated to users represented in the data, and the data quality guaranteed only with users not at risk. Our proposal is able to support the Data Provider in the exploration of a repertoire of possible data transformations with the aim of selecting one specific transformation that yields an adequate trade-off between data quality and privacy risk. We study the practical effectiveness of our proposal over three data formats underlying many services, defined on real mobility data, i.e., presence data, trajectory data and road segment data

    Privacy risks of spatio-temporal data transformations

    Get PDF
    In recent years, we witness a great leap in data collection thanks to increasing number of mobile devices. Millions of mobile devices including smart phones, tablets and even wearable gadgets embedded with GPS hardware enable tagging data with location. New generation applications rely heavily on location information for innovative business intelligence which may require data to be shared with third parties for analytics. However, location data is considered to be highly sensitive and its processing is regulated especially in Europe where strong data protection practices are enforced. To preserve privacy of individuals, first precaution is to remove personal identifiers such as name and social security number which was shown to be problematic due to possible linking with public data sources. In fact, location itself may be an identifier, for example the locations in the evening may hint the home address which may be linked to the individual. Since location cannot be shared as it is, data transformation techniques have been developed with the aim of preventing user re-identification. Data transformation techniques transform data points from their initial domain into a new domain while preserving certain statistical properties of data. In this thesis, we show that distance-preserving data transformations may not fully preserve privacy in the sense that location information may be estimated from the transformed data when the attacker utilizes information such as public domain knowledge and known samples. We present attack techniques based on adversaries with various background information. We first focus on spatio-temporal trajectories and propose an attack that can reconstruct a target trajectory using a few known samples from the dataset. We show that it is possible to create many similar trajectories that mimic the target trajectory according to the knowledge (i.e. number of known samples). The attack can identify locations visited or not visited by the trajectory with high confidence. Next, we consider relation-preserving transformations and develop a novel attack technique on transformation of sole location points even when only approximate or noisy distances are present. We experimentally demonstrate that an attacker with a limited background information from the dataset is still able to identify small regions that include the target location points

    SoK: differentially private publication of trajectory data

    Get PDF
    Trajectory analysis holds many promises, from improvements in traffic management to routing advice or infrastructure development. However, learning users’ paths is extremely privacy-invasive. Therefore, there is a necessity to protect trajectories such that we preserve the global properties, useful for analysis, while specific and private information of individuals remains inaccessible. Trajectories, however, are difficult to protect, since they are sequential, highly dimensional, correlated, bound to geophysical restrictions, and easily mapped to semantic points of interest. This paper aims to establish a systematic framework on protective masking measures for trajectory databases with differentially private (DP) guarantees, including also utility properties, derived from ideas and limitations of existing proposals. To reach this goal, we systematize the utility metrics used throughout the literature, deeply analyze the DP granularity notions, explore and elaborate on the state of the art on privacy-enhancing mechanisms and their problems, and expose the main limitations of DP notions in the context of trajectories.We would like to thank the reviewers and shepherd for their useful comments and suggestions in the improvement of this paper. Javier Parra-Arnau is the recipient of a “Ramón y Cajal” fellowship funded by the Spanish Ministry of Science and Innovation. This work also received support from “la Caixa” Foundation (fellowship code LCF/BQ/PR20/11770009), the European Union’s H2020 program (Marie SkƂodowska-Curie grant agreement № 847648) from the Government of Spain under the project “COMPROMISE” (PID2020-113795RB-C31/AEI/10.13039/501100011033), and from the BMBF project “PROPOLIS” (16KIS1393K). The authors at KIT are supported by KASTEL Security Research Labs (Topic 46.23 of the Helmholtz Association) and Germany’s Excellence Strategy (EXC 2050/1 ‘CeTI’; ID 390696704).Peer ReviewedPostprint (published version

    SoK: Differentially Private Publication of Trajectory Data

    Get PDF
    Trajectory analysis holds many promises, from improvements in traffic management to routing advice or infrastructure development. However, learning users\u27 paths is extremely privacy-invasive. Therefore, there is a necessity to protect trajectories such that we preserve the global properties, useful for analysis, while specific and private information of individuals remains inaccessible. Trajectories, however, are difficult to protect, since they are sequential, highly dimensional, correlated, bound to geophysical restrictions, and easily mapped to semantic points of interest. This paper aims to establish a systematic framework on protective masking and synthetic-generation measures for trajectory databases with syntactic and differentially private (DP) guarantees, including also utility properties, derived from ideas and limitations of existing proposals. To reach this goal, we systematize the utility metrics used throughout the literature, deeply analyze the DP granularity notions, explore and elaborate on the state of the art on privacy-enhancing mechanisms and their problems, and expose the main limitations of DP notions in the context of trajectories

    Privacy in trajectory micro-data publishing : a survey

    Get PDF
    We survey the literature on the privacy of trajectory micro-data, i.e., spatiotemporal information about the mobility of individuals, whose collection is becoming increasingly simple and frequent thanks to emerging information and communication technologies. The focus of our review is on privacy-preserving data publishing (PPDP), i.e., the publication of databases of trajectory micro-data that preserve the privacy of the monitored individuals. We classify and present the literature of attacks against trajectory micro-data, as well as solutions proposed to date for protecting databases from such attacks. This paper serves as an introductory reading on a critical subject in an era of growing awareness about privacy risks connected to digital services, and provides insights into open problems and future directions for research.Comment: Accepted for publication at Transactions for Data Privac

    Essays on exploitation and exploration in software development

    Get PDF
    Software development includes two types of activities: software improvement activities by correcting faults and software enhancement activities by adding new features. Based on organizational theory, we propose that these activities can be classified as implementation-oriented (exploitation) and innovation-oriented (exploration). In the context of open source software (OSS) development, developing a patch would be an example of an exploitation activity. Requesting a new software feature would be an example of an exploration activity. This dissertation consists of three essays which examine exploitation and exploration in software development. The first essay analyzes software patch development (exploitation) in the context of software vulnerabilities which could be exploited by hackers. There is a need for software vendors to make software patches available in a timely manner for vulnerabilities in their products. We develop a survival analysis model of the patch release behavior of software vendors based on a cost-based framework of software vendors. We test this model using a data set compiled from the National Vulnerability Database (NVD), United States Computer Emergency Readiness Team (US-CERT), and vendor web sites. Our results indicate that vulnerabilities with high confidentiality impact or high integrity impact are patched faster than vulnerabilities with high availability impact. Interesting differences in the patch release behavior of software vendors based on software type (new release vs. update) and type of vendor (open source vs. proprietary) are found. The second essay studies exploitation and exploration in the content of OSS development. We empirically examine the differences between exploitation (patch development) and exploration (feature request) networks of developers in OSS projects in terms of their social network structure, using a data set collected from the SourceForge database. We identify a new category of developers (ambidextrous developers) in OSS projects who contribute to patch development as well as feature request activities. Our results indicate that a patch development network has greater internal cohesion and network centrality than a feature request network. In contrast, a feature request network has greater external connectivity than a patch development network. The third essay explores ambidexterity and ambidextrous developers in the context of OSS project performance. Recent research on OSS development has studied the social network structure of software developers as a determinant of project success. However, this stream of research has focused on the project level, and has not recognized the fact that software projects could consist of different types of activities, each of which could require different types of expertise and network structures. We develop a theoretical construct for ambidexterity based on the concept of ambidextrous developers. We empirically illustrate the effects of ambidexterity and network characteristics on OSS project performance. Our results indicate that a moderate level of ambidexterity, external cohesion, and technological diversity are desirable for project success. Project success is also positively related to internal cohesion and network centrality. We illustrate the roles of ambidextrous developers on project performance and their differences compared to other developers
    • 

    corecore