1,945 research outputs found

    Concealment Conserving the Data Mining of Groups & Individual

    We present an overview of privacy preserving data mining, one of the most popular directions in the data mining research community. In the first part of the chapter, we present approaches that have been proposed to protect either the sensitive data itself during data mining or the sensitive data mining results, in the context of traditional (relational) datasets. In the second part, we focus on one of the most recent and prominent directions in privacy preserving data mining: the mining of user mobility data. Although still in its infancy, privacy preserving mining of mobility data has attracted considerable research attention and already counts a number of methodologies for both sensitive data protection and sensitive knowledge hiding. Finally, at the end of the chapter, we provide a roadmap for the field of privacy preserving mobility data mining as well as for privacy preserving data mining at large.

    When and Where: Predicting Human Movements Based on Social Spatial-Temporal Events

    Predicting both the time and the location of human movements is valuable for a variety of applications but challenging. To address this problem, we propose an approach that considers both the periodicity and the sociality of human movements. We first define a new concept, the Social Spatial-Temporal Event (SSTE), to represent social interactions among people. For time prediction, we characterise the temporal dynamics of SSTEs with an ARMA (AutoRegressive Moving Average) model. To capture the SSTE kinetics dynamically, we propose a Kalman Filter based learning algorithm that learns the ARMA model and incrementally updates it as each new observation becomes available. For location prediction, we propose a ranking model in which the periodicity and the sociality of human movements are considered simultaneously to improve prediction accuracy. Extensive experiments conducted on real data sets validate our proposed approach.

    Video summarization by group scoring

    In this paper, a new model for user-centered video summarization is presented. The main use case for this algorithm is the involvement of more than one expert in generating the final video summary. The approach consists of three major steps. First, the video frames are scored by a group of operators. Next, the assigned scores are averaged to produce a single value for each frame. Lastly, the highest-scoring video frames, together with the corresponding audio and textual content, are extracted and inserted into the summary. The effectiveness of this approach has been evaluated by comparing the video summaries generated by this system against the results from a number of automatic summarization tools that use different modalities for abstraction.
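    The three steps described in this abstract (score, average, extract) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `summarize_by_group_scoring`, the `top_k` parameter, and the list-of-lists score layout are assumptions, and the extraction of the matching audio and text is omitted.

```python
from statistics import mean


def summarize_by_group_scoring(scores_by_operator, top_k):
    """Return indices of the top_k frames by average operator score.

    scores_by_operator: one list of per-frame scores per operator
    (hypothetical layout; every operator scores every frame).
    """
    if not scores_by_operator:
        return []
    n_frames = len(scores_by_operator[0])
    if any(len(scores) != n_frames for scores in scores_by_operator):
        raise ValueError("every operator must score every frame")

    # Step 2: average the operators' scores into a single value per frame.
    averaged = [mean(frame_scores) for frame_scores in zip(*scores_by_operator)]

    # Step 3: keep the highest-scoring frames, restored to playback order.
    ranked = sorted(range(n_frames), key=lambda i: averaged[i], reverse=True)
    return sorted(ranked[:top_k])


if __name__ == "__main__":
    operator_scores = [
        [1, 5, 3, 4],  # operator A
        [2, 4, 3, 5],  # operator B
    ]
    print(summarize_by_group_scoring(operator_scores, top_k=2))
```

    Returning the selected indices in playback order keeps the summary temporally coherent; a weighted mean could replace `mean` if some operators should count more, though the abstract describes only plain averaging.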

    Location Privacy in Spatial Crowdsourcing

    Spatial crowdsourcing (SC) is a new platform that engages individuals in collecting and analyzing environmental, social and other spatiotemporal information. With SC, requesters outsource their spatiotemporal tasks to a set of workers, who perform the tasks by physically traveling to the tasks' locations. This chapter identifies privacy threats toward both workers and requesters during the two main phases of spatial crowdsourcing, tasking and reporting. Tasking is the process of identifying which tasks should be assigned to which workers. This process is handled by a spatial crowdsourcing server (SC-server). The second phase is reporting, in which workers travel to the tasks' locations, complete the tasks and upload their reports to the SC-server. The challenge is to enable effective and efficient tasking as well as reporting in SC without disclosing the actual locations of workers (at least until they agree to perform a task) and the tasks themselves (at least to workers who are not assigned to those tasks). This chapter aims to provide an overview of the state of the art in protecting users' location privacy in spatial crowdsourcing. We provide a comparative study of a diverse set of solutions in terms of task publishing modes (push vs. pull), problem focuses (tasking and reporting), threats (server, requester and worker), and underlying technical approaches (from pseudonymity, cloaking, and perturbation to exchange-based and encryption-based techniques). The strengths and drawbacks of the techniques are highlighted, leading to a discussion of open problems and future work.

    Trajectory-Based Spatiotemporal Entity Linking

    Trajectory-based spatiotemporal entity linking aims to match the same moving object across different datasets based on its movement traces. It is a fundamental step in supporting spatiotemporal data integration and analysis. In this paper, we study the problem of spatiotemporal entity linking using effective and concise signatures extracted from the objects' trajectories. The linking problem is formalized as a k-nearest neighbor (k-NN) query on the signatures. Four representation strategies (sequential, temporal, spatial, and spatiotemporal) and two quantitative criteria (commonality and unicity) are investigated for signature construction. A simple yet effective dimension reduction strategy is developed together with a novel indexing structure called the WR-tree to speed up the search. A number of optimization methods are proposed to improve the accuracy and robustness of the linking. Our extensive experiments on real-world datasets verify the superiority of our approach over the state-of-the-art solutions in terms of both accuracy and efficiency. Comment: 15 pages, 3 figures, 15 tables
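    To illustrate the k-NN formulation in this abstract, the sketch below links a query object to its nearest candidates by brute force. It assumes each signature is a fixed-length numeric vector compared by Euclidean distance; the paper's actual signature construction, dimension reduction, and WR-tree index are not reproduced here, and `knn_link` is a hypothetical name.

```python
import heapq
import math


def knn_link(query_signature, candidate_signatures, k):
    """Return the ids of the k candidates whose signatures are closest to the query.

    candidate_signatures: dict mapping object id -> signature vector
    (assumed layout; all vectors must have the query's length).
    """
    # Pair each candidate id with its Euclidean distance to the query.
    distances = (
        (math.dist(query_signature, signature), object_id)
        for object_id, signature in candidate_signatures.items()
    )
    # Linear scan over all candidates; an index would avoid this full pass.
    return [object_id for _, object_id in heapq.nsmallest(k, distances)]


if __name__ == "__main__":
    candidates = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
    print(knn_link([0.9, 1.0], candidates, k=2))
```

    The linear scan costs one distance computation per candidate, which is the kind of cost an index structure is meant to reduce on large datasets.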

    Combinatorial Algorithms for String Sanitization

    String data are often disseminated to support applications such as location-based service provision or DNA sequence analysis. This dissemination, however, may expose sensitive patterns that model confidential knowledge. In this paper, we consider the problem of sanitizing a string by concealing the occurrences of sensitive patterns, while maintaining data utility, in two settings that are relevant to many common string processing tasks. In the first setting, we aim to generate the minimal-length string that preserves the order of appearance and frequency of all non-sensitive patterns. Such a string allows accurately performing tasks based on the sequential nature and pattern frequencies of the string. To construct such a string, we propose a time-optimal algorithm, TFS-ALGO. We also propose another time-optimal algorithm, PFS-ALGO, which preserves a partial order of appearance of non-sensitive patterns but produces a much shorter string that can be analyzed more efficiently. The strings produced by either of these algorithms are constructed by concatenating non-sensitive parts of the input string. However, it is possible to detect the sensitive patterns by "reversing" the concatenation operations. In response, we propose a heuristic, MCSR-ALGO, which replaces letters in the strings output by the algorithms with carefully selected letters, so that sensitive patterns are not reinstated, implausible patterns are not introduced, and occurrences of spurious patterns are prevented. In the second setting, we aim to generate a string that is at minimal edit distance from the original string, in addition to preserving the order of appearance and frequency of all non-sensitive patterns. To construct such a string, we propose an algorithm, ETFS-ALGO, based on solving specific instances of approximate regular expression matching. Comment: Extended version of a paper accepted to ECML/PKDD 201

    Local Suppression and Splitting Techniques for Privacy Preserving Publication of Trajectories

    postprint

    Chronic infection: punctuated interpenetration and pathogen virulence

    We apply an information dynamics formalism to the Levens and Lewontin vision of biological interpenetration between a 'cognitive condensation' including immune function embedded in social and cultural structure on the one hand, and an established, highly adaptive, parasite population on the other. We iterate the argument, beginning with direct interaction between cognitive condensation and pathogen, then extend the analysis to second order 'mutator' mechanisms inherent both to immune function and to certain forms of rapid pathogen antigenic variability. The methodology, based on the Large Deviations Program of applied probability, produces synergistic cognitive/adaptive 'learning plateaus' that represent stages of chronic infection, and, for human populations, is able to encompass the fundamental biological reality of culture omitted by other approaches. We conclude that, for 'evolution machine' pathogens like HIV and malaria, simplistic magic bullet 'medical' drug, vaccine, or behavior modification interventions which do not address the critical context of overall living and working conditions may constitute selection pressures triggering adaptations in life history strategy resulting in marked increase of pathogen virulence.