
    DPT : differentially private trajectory synthesis using hierarchical reference systems

    GPS-enabled devices are now ubiquitous, from airplanes and cars to smartphones and wearable technology. This has resulted in a wealth of data about the movements of individuals and populations, which can be analyzed for useful information to aid in city and traffic planning, disaster preparedness and so on. However, the places that people go can disclose extremely sensitive information about them, and thus their use needs to be filtered through privacy preserving mechanisms. This turns out to be a highly challenging task: raw trajectories are highly detailed, and typically no pair is alike. Previous attempts fail either to provide adequate privacy protection, or to remain sufficiently faithful to the original behavior. This paper presents DPT, a system to synthesize mobility data based on raw GPS trajectories of individuals while ensuring strong privacy protection in the form of ε-differential privacy. DPT makes a number of novel modeling and algorithmic contributions including (i) discretization of raw trajectories using hierarchical reference systems (at multiple resolutions) to capture individual movements at differing speeds, (ii) adaptive mechanisms to select a small set of reference systems and construct prefix tree counts privately, and (iii) use of direction-weighted sampling for improved utility. While there have been prior attempts to solve the subproblems required to generate synthetic trajectories, to the best of our knowledge, ours is the first system that provides an end-to-end solution. We show the efficacy of our synthetic trajectory generation system using an extensive empirical evaluation

    Privacy, Space and Time: a Survey on Privacy-Preserving Continuous Data Publishing

    Sensors, portable devices, and location-based services generate massive amounts of geo-tagged and/or location- and user-related data on a daily basis. The manipulation of such data is useful in numerous application domains, e.g., healthcare, intelligent buildings, and traffic monitoring, to name a few. A high percentage of these data carry information about users' activities and other personal details, and thus their manipulation and sharing raise concerns about the privacy of the individuals involved. To enable data sharing that is secure from the perspective of users' privacy, researchers have already proposed various seminal techniques for the protection of users' privacy. However, the continuous fashion in which data are generated nowadays, and the high availability of external sources of information, pose more threats and add extra challenges to the problem. In this survey, we review the work done on data privacy for continuous data publishing, and report on the proposed solutions, with a special focus on solutions concerning location or geo-referenced data.

    Privacy in trajectory micro-data publishing : a survey

    We survey the literature on the privacy of trajectory micro-data, i.e., spatiotemporal information about the mobility of individuals, whose collection is becoming increasingly simple and frequent thanks to emerging information and communication technologies. The focus of our review is on privacy-preserving data publishing (PPDP), i.e., the publication of databases of trajectory micro-data that preserve the privacy of the monitored individuals. We classify and present the literature of attacks against trajectory micro-data, as well as solutions proposed to date for protecting databases from such attacks. This paper serves as an introductory reading on a critical subject in an era of growing awareness about privacy risks connected to digital services, and provides insights into open problems and future directions for research. Comment: Accepted for publication at Transactions for Data Privacy

    A survey on privacy in human mobility

    In recent years we have witnessed a pervasive use of location-aware technologies such as vehicular GPS-enabled devices, RFID-based tools, mobile phones, etc., which lead to the collection and storage of a large amount of human mobility data. The power of this data has been recognized by both the scientific community and industry. Human mobility data can be used for different purposes such as urban traffic management, urban planning, urban pollution estimation, etc. Unfortunately, data describing human mobility is sensitive, because people's whereabouts may allow re-identification of individuals in a de-identified database, and access to the places visited by individuals may enable the inference of sensitive information such as religious belief, sexual preferences, health conditions, and so on. The literature reports many approaches aimed at overcoming privacy issues in mobility data; in this survey we therefore discuss the advancements in privacy-preserving mobility data publishing. We first describe the adversarial attack and privacy models typically taken into consideration for mobility data, then we present frameworks for privacy risk assessment and, finally, we discuss three main categories of privacy-preserving strategies: methods based on anonymization of mobility data, methods based on differential privacy models, and methods which protect privacy by exploiting generative models for synthetic trajectory generation.

    Privacy-Preserving Data Collection and Sharing in Modern Mobile Internet Systems

    With the ubiquity and widespread use of mobile devices such as laptops, smartphones, smartwatches, and IoT devices, large volumes of user data are generated and recorded. While there is great value in collecting, analyzing and sharing this data for improving products and services, data privacy poses a major concern. This dissertation research addresses the problem of privacy-preserving data collection and sharing in the context of both mobile trajectory data and mobile Internet access data. The first contribution of this dissertation research is the design and development of a system for utility-aware synthesis of differentially private and attack-resilient location traces, called AdaTrace. Given a set of real location traces, AdaTrace executes a four-phase process consisting of feature extraction, synopsis construction, noise injection, and generation of synthetic location traces. Compared to representative prior approaches, the location traces generated by AdaTrace offer up to 3-fold improvement in utility, measured using a variety of utility metrics and datasets, while preserving both differential privacy and attack resilience. The second contribution of this dissertation research is the design and development of locally private protocols for privacy-sensitive collection of mobile and Web user data. Motivated by the excessive utility loss of existing Local Differential Privacy (LDP) protocols under small user populations, this dissertation introduces the notion of Condensed Local Differential Privacy (CLDP) and a suite of protocols satisfying CLDP to enable the collection of various types of user data, ranging from ordinal data types in finite metric spaces (malware infection statistics), to non-ordinal items (OS versions and transaction categories), and to sequences of ordinal or non-ordinal items. 
Using cybersecurity data and case studies from Symantec, a major cybersecurity vendor, we show that the proposed CLDP protocols are practical for key tasks including malware outbreak detection, OS vulnerability analysis, and inspecting suspicious activities on infected machines. The third contribution of this dissertation research is the development of a framework and a prototype system for evaluating privacy-utility tradeoffs of different LDP protocols, called LDPLens. LDPLens introduces metrics to evaluate protocol tradeoffs based on factors such as the utility metric, the data collection scenario, and the user-specified adversary metric. We develop a common Bayesian adversary model to analyze LDP protocols, and we formally and experimentally analyze Adversarial Success Rate (ASR) under each protocol. Motivated by the findings that numerous factors impact the ASR and utility behaviors of LDP protocols, we develop LDPLens to provide effective recommendations for finding the most suitable protocol in a given setting. Our three case studies with real-world datasets demonstrate that using the protocol recommended by LDPLens can offer a substantial reduction in utility loss or in ASR, compared to using a randomly chosen protocol. Ph.D.

    Privacy-Preserving Release of Spatio-temporal Density

    In today’s digital society, increasing amounts of contextually rich spatio-temporal information are collected and used, e.g., for knowledge-based decision making, research purposes, optimizing operational phases of city management, planning infrastructure networks, or developing timetables for public transportation with an increasingly autonomous vehicle fleet. At the same time, however, publishing or sharing spatio-temporal data, even in aggregated form, is not always viable owing to the danger of violating individuals’ privacy, along with the related legal and ethical repercussions. In this chapter, we review some fundamental approaches for anonymizing and releasing spatio-temporal density, i.e., the number of individuals visiting a given set of locations as a function of time. These approaches follow different privacy models, providing different privacy guarantees as well as different accuracy of the released anonymized data. We demonstrate some sanitization (anonymization) techniques with provable privacy guarantees by releasing the spatio-temporal density of Paris, France. We conclude that, in order to achieve meaningful accuracy, the sanitization process has to be carefully customized to the application and public characteristics of the spatio-temporal data.
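
    As an illustration of what releasing a spatio-temporal density with a provable guarantee can look like, the following is a minimal sketch and not the chapter's own method. It assumes the Laplace mechanism under ε-differential privacy, with each user's contribution capped so that the L1 sensitivity of the count table is bounded. The function names, the `(user, cell, slot)` record layout, and the `max_visits_per_user` parameter are all hypothetical.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_density(visits, cells, slots, epsilon, max_visits_per_user, rng=None):
    """Return {(cell, slot): noisy count} for every cell and time slot.

    visits: iterable of (user, cell, slot) records (hypothetical layout).
    Each user contributes at most `max_visits_per_user` distinct
    (cell, slot) pairs, so adding or removing one user changes the
    count table by at most that much in L1 norm.
    """
    rng = rng or random.Random()

    # Cap each user's contribution to bound the sensitivity.
    per_user = {}
    for user, cell, slot in visits:
        kept = per_user.setdefault(user, set())
        if (cell, slot) in kept or len(kept) < max_visits_per_user:
            kept.add((cell, slot))

    # True density: number of distinct users per (cell, slot).
    counts = Counter(pair for kept in per_user.values() for pair in kept)

    # Laplace mechanism: noise scale = sensitivity / epsilon.
    scale = max_visits_per_user / epsilon
    return {
        (c, t): counts.get((c, t), 0) + laplace_noise(scale, rng)
        for c in cells
        for t in slots
    }
```

    Noise is added to every cell and slot, including empty ones, because publishing only the non-empty cells would itself reveal where people were. A smaller cap lowers the noise scale but discards more of each user's visits, which is one concrete form of the accuracy trade-off the abstract refers to.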

    AutoLog: A Log Sequence Synthesis Framework for Anomaly Detection

    The rapid progress of modern computing systems has led to a growing interest in informative run-time logs. Various log-based anomaly detection techniques have been proposed to ensure software reliability. However, their implementation in the industry has been limited due to the lack of high-quality public log resources as training datasets. While some log datasets are available for anomaly detection, they suffer from limitations in (1) comprehensiveness of log events; (2) scalability over diverse systems; and (3) flexibility of log utility. To address these limitations, we propose AutoLog, the first automated log generation methodology for anomaly detection. AutoLog uses program analysis to generate run-time log sequences without actually running the system. AutoLog starts with probing comprehensive logging statements associated with the call graphs of an application. Then, it constructs execution graphs for each method after pruning the call graphs to find log-related execution paths in a scalable manner. Finally, AutoLog propagates the anomaly label to each acquired execution path based on human knowledge. It generates flexible log sequences by walking along the log execution paths with controllable parameters. Experiments on 50 popular Java projects show that AutoLog acquires significantly more (9x-58x) log events than existing log datasets from the same system, and generates log messages much faster (15x) with a single machine than existing passive data collection approaches. We hope AutoLog can facilitate the benchmarking and adoption of automated log analysis techniques. Comment: The paper has been accepted by ASE 2023 (Research Track)

    Molecular assessment of muscle health and function : The effect of age, nutrition and physical activity on the human muscle transcriptome and metabolome

    Prolonged lifespan and decreased fertility will lead to an increased proportion of older adults in the world population (population aging). An important strategy to deal with population aging has been to promote healthy aging; not only to prevent mounting health care costs, but also to maintain independence and quality of life of older populations for as long as possible. Nearly the opposite of healthy aging is frailty. A major component of (physical) frailty is sarcopenia: age-related loss of muscle mass. Decreased muscle size and strength have been associated with a wide variety of negative health outcomes, including increased risk of hospitalization, physical disability and even death. Therefore, maintaining muscle size and strength is very important for healthy aging. Nutrition and physical activity are possible strategies to maintain or even improve muscle function with age. The effect of nutrition, age, frailty and physical activity on the function of skeletal muscle is complex. A better understanding of the molecular mechanisms involved can provide new insights into potential strategies to maintain muscle function over the life course. This thesis aims to investigate these mechanisms and processes that underlie the effects of age, frailty and physical activity by leveraging the sensitivity and comprehensiveness of transcriptomics and metabolomics. Chapters 2 and 3 describe the effects of age, frailty and resistance-type exercise training on the skeletal muscle transcriptome and metabolome. Both the transcriptome and metabolome show significant differences between frail and healthy older adults. These differences are similar to the differences between healthy young men and healthy older adults, suggesting that frailty presents itself as a more pronounced form of aging, somewhat independent of chronological age.
These age- and frailty-related differences in the transcriptome are partially reversed by resistance-type exercise training, in accordance with the observed improvement in muscle strength. Regression analysis revealed that the protocadherin gamma gene cluster may be important to skeletal muscle function. Protocadherin gamma is involved in axon guidance and may be upregulated due to the denervation-reinnervation cycles observed in skeletal muscle of older individuals. The metabolome suggested that resistance-type exercise training led to a decrease in branched-chain amino acid oxidation, as shown by a decrease in amino acid-derived carnitines. Lastly, the blood metabolome showed little agreement with the metabolome in skeletal muscle, indicating that blood is a poor read-out of muscle metabolism. We assessed the effect of knee immobilization with creatine supplementation or placebo on the skeletal muscle transcriptome and metabolome in chapter 4. Knee immobilization caused muscle mass loss and strength loss in all participants, with no differences between creatine and placebo groups. Knee immobilization appeared to induce the HDAC4-myogenin axis, which is primarily associated with denervation and motor neuron diseases. The metabolome showed changes consistent with the decreased expression of energy metabolism genes. While acyl-carnitine levels tended to decrease with knee immobilization, one branched-chain amino acid-derived acyl-carnitine was increased after knee immobilization, suggesting increased amino acid oxidation. Vitamin D deficiency is common among older adults and has been linked to muscle weakness. Vitamin D supplementation has been proposed as a strategy to improve muscle function among older populations. In chapter 5, supplementation with vitamin D (calcifediol, 25(OH)D) is investigated as a nutritional strategy to improve muscle function among frail older adults. However, we observed no effect of vitamin D on the muscle transcriptome.
These findings indicate that the effects of vitamin D supplementation on skeletal muscle may be either absent, weak, or limited to a small subset of muscle cells. Transcriptomic changes due to different forms of muscle disuse are compared in chapter 6 (primarily knee immobilization and bed rest). The goal was to determine the similarities and differences among various causes of muscle atrophy in humans (primarily muscle disuse). Both knee immobilization and bed rest led to significant changes in the muscle transcriptome. However, the overlap in significantly changed genes was relatively small. Knee immobilization was characterized by ubiquitin-mediated proteolysis and induction of the HDAC4/Myogenin axis, whereas bed rest revealed increased expression of genes of the immune system and increased expression of lysosomal genes. Knee immobilization showed the highest similarity with age- and frailty-related transcriptomic changes. This finding suggests that knee immobilization may be the most suitable form of disuse atrophy to assess the effectiveness of strategies to prevent age-related muscle loss in humans. The transcriptome and metabolome are incredibly useful tools in describing the wide array of biological systems within skeletal muscle. These systems can be modulated using physical activity (or lack thereof) as well as nutrition. This thesis describes some of these processes and highlights several unexplored genes and metabolites that may be important for maintaining or even optimizing muscle function. In the future, it may be possible to optimize both exercise and nutrition for each individual using these techniques; or even better, cheaper and less invasive alternatives.

    Differentially Private Event Stream Filtering with an Application to Traffic Estimation

    Many large-scale systems such as intelligent transportation systems, smart grids or smart buildings require individuals to contribute their private data streams in order to amass, store, manipulate and analyze information for signal processing and decision-making purposes. In a typical scenario, swarms of sensors produce discrete-valued input signals that describe the occurrence of events involving these users, and several statistics of interest need to be continuously published in real time.
This can, however, cause a privacy loss for the users in exchange for the utility provided by the application. This thesis considers the problem of providing differential privacy guarantees for such multi-input multi-output systems operating continuously. In particular, we consider the privacy issues in a system-theoretic context, and address the problem of releasing filtered signals that respect the privacy of users who activate the sensors. As a result of this thesis, we present a new architecture for privacy-preserving estimation of traffic flows. We also introduce differentially private monitoring and forecasting of occupancy in a building equipped with a dense network of motion detection sensors, which is useful, for example, to control its HVAC system.