
    Privacy Preservation by Disassociation

    In this work, we focus on protection against identity disclosure in the publication of sparse multidimensional data. Existing multidimensional anonymization techniques (a) protect the privacy of users either by altering the set of quasi-identifiers of the original data (e.g., by generalization or suppression) or by adding noise (e.g., using differential privacy), and/or (b) assume a clear distinction between sensitive and non-sensitive information and sever the possible linkage. In many real-world applications the above techniques are not applicable. For instance, consider web search query logs. Suppressing or generalizing anonymization methods would remove the most valuable information in the dataset: the original query terms. Additionally, web search query logs contain millions of query terms which cannot be categorized as sensitive or non-sensitive, since a term may be sensitive for one user and non-sensitive for another. Motivated by this observation, we propose an anonymization technique termed disassociation that preserves the original terms but hides the fact that two or more different terms appear in the same record. We protect the users' privacy by disassociating record terms that participate in identifying combinations. This way, the adversary cannot associate with high probability a record with a rare combination of terms. To the best of our knowledge, our proposal is the first to employ such a technique to provide protection against identity disclosure. We propose an anonymization algorithm based on our approach and evaluate its performance on real and synthetic datasets, comparing it against other state-of-the-art methods based on generalization and differential privacy.
    Comment: VLDB201
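
    The core idea above can be illustrated with a toy sketch. This is not the paper's algorithm (which clusters records and guarantees k^m-anonymity); it only shows, for pairs of terms, how a record can be split into chunks so that a rare, identifying combination never appears in the same published chunk. The function name `disassociate` and the threshold parameter `k` are illustrative assumptions.

```python
from itertools import combinations
from collections import Counter

def disassociate(records, k=2):
    """Toy disassociation sketch (not the published algorithm): split each
    record's terms into chunks so that no chunk contains a pair of terms
    whose joint support in the dataset is below k, i.e., a rare combination
    that could identify the record's owner."""
    # count how often each pair of terms co-occurs across all records
    pair_support = Counter()
    for rec in records:
        for pair in combinations(sorted(rec), 2):
            pair_support[pair] += 1

    published = []
    for rec in records:
        chunks = []
        for term in sorted(rec):
            # greedily place the term into the first chunk where it forms
            # no rare (support < k) pair with an already-placed term
            for chunk in chunks:
                if all(pair_support[tuple(sorted((term, t)))] >= k
                       for t in chunk):
                    chunk.add(term)
                    break
            else:
                chunks.append({term})
        published.append(chunks)
    return published
```

    On a dataset where "flu" and "aspirin" co-occur frequently but "flu" and a rare term co-occur only once, the frequent pair stays together in one chunk while the rare pair is split across two chunks, so an adversary can no longer link the rare combination to a single record.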

    Supporting agricultural communities with workflows on heterogeneous computing resources


    Features of recording practices and communication during nursing handover: a cluster analysis

    Objective: To record and identify the characteristics of nursing handovers in a tertiary hospital. Method: Observational study. Twenty-two nurses participated in 11 nursing handovers in 2015/16, using an audio recording system and an unstructured observation form. Hierarchical cluster analysis was performed. Results: Thirty characteristics were identified. The nursing handovers were based on the clinical status of patients, and all nurses obtained specialized scientific knowledge specific to the clinical environment. The information used was not based on nursing diagnoses and not in accordance with best nursing clinical practice. The following four clusters emerged among the 30 characteristics: 1) the use of evidence-based nursing practice, 2) the nonuse of evidence-based nursing practice and its correlation with a strained psychological environment, 3) patient management and the clinical skills/knowledge of nurses, and 4) handover content, quality of information transferred and specialization. Conclusion: Multiple characteristics were observed. The majority of characteristics were grouped based on common features, and four main clusters emerged. The investigation and understanding of structural relations between these characteristics and their respective clusters may lead to an improvement in the quality of nursing health care services.

    In-Memory Interval Joins

    The interval join is a popular operation in temporal, spatial, and uncertain databases. The majority of interval join algorithms assume that input data reside on disk, and so their focus is to minimize I/O accesses. Recently, an in-memory approach based on plane sweep (PS) for modern hardware was proposed, which greatly outperforms previous work. However, this approach relies on a complex data structure and its parallelization has not been adequately studied. In this article, we investigate in-memory interval joins in two directions. First, we explore the applicability of a largely ignored forward scan (FS)-based plane sweep algorithm for single-threaded join evaluation. We propose four optimizations for FS that greatly reduce its cost, making it competitive or even faster than the state-of-the-art. Second, we study in depth the parallel computation of interval joins. We design a non-partitioning-based approach that determines independent tasks of the join algorithm to run in parallel. Then, we address the drawbacks of the previously proposed hash-based partitioning and suggest a domain-based partitioning approach that does not produce duplicate results. Within our approach, we propose a novel breakdown of the partition-joins into mini-joins to be scheduled in the available CPU threads, and propose an adaptive domain partitioning aiming at load balancing. We also investigate how the partitioning phase can benefit from modern parallel hardware. Our thorough experimental analysis demonstrates the advantage of our novel partitioning-based approach for parallel computation.
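
    The classic forward-scan (FS) plane sweep the abstract revisits can be sketched in a few lines. This is a minimal single-threaded version without the paper's four optimizations: both inputs are sorted by start point, and whichever interval opens first scans forward in the other list until the scanned start exceeds its end. Each result pair is reported exactly once.

```python
def fs_interval_join(R, S):
    """Forward-scan plane sweep interval join (minimal sketch).
    Intervals are (start, end) tuples with start <= end; overlap is
    inclusive. Each overlapping pair is reported exactly once: a pair
    (r, s) is produced during r's scan if r starts no later than s,
    and during s's scan otherwise."""
    R, S = sorted(R), sorted(S)
    results, i, j = [], 0, 0
    while i < len(R) and j < len(S):
        if R[i][0] <= S[j][0]:
            # R[i] opens first: scan S forward while it overlaps R[i]
            k = j
            while k < len(S) and S[k][0] <= R[i][1]:
                results.append((R[i], S[k]))
                k += 1
            i += 1
        else:
            # S[j] opens first: scan R forward while it overlaps S[j]
            k = i
            while k < len(R) and R[k][0] <= S[j][1]:
                results.append((R[k], S[j]))
                k += 1
            j += 1
    return results
```

    For example, joining R = [(1, 5), (6, 8)] with S = [(2, 3), (7, 9)] yields the two pairs ((1, 5), (2, 3)) and ((6, 8), (7, 9)), with no duplicates.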

    Parallel In-Memory Evaluation of Spatial Joins

    The spatial join is a popular operation in spatial database systems and its evaluation is a well-studied problem. As main memories become bigger and faster and commodity hardware supports parallel processing, there is a need to revamp classic join algorithms which have been designed for I/O-bound processing. In view of this, we study the in-memory and parallel evaluation of spatial joins, by re-designing a classic partitioning-based algorithm to consider alternative approaches for space partitioning. Our study shows that, compared to a straightforward implementation of the algorithm, our tuning can improve performance significantly. We also show how to select appropriate partitioning parameters based on data statistics, in order to tune the algorithm for the given join inputs. Our parallel implementation scales gracefully with the number of threads, reducing the cost of the join to at most one second even for join inputs with tens of millions of rectangles.
    Comment: Extended version of the SIGSPATIAL'19 paper under the same title
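
    A classic partitioning-based spatial join of the kind re-designed above can be sketched as follows. This toy version (function names are illustrative, not the paper's) replicates each rectangle into every uniform grid tile it overlaps, joins tiles independently with nested loops, and uses the well-known reference-point test so that each result pair is reported in exactly one tile.

```python
def overlaps(a, b):
    # rectangles as (x1, y1, x2, y2); inclusive intersection test
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def grid_spatial_join(R, S, nx, ny, bounds):
    """Toy partitioning-based spatial join over an nx-by-ny uniform grid.
    A pair is reported only in the tile that contains the bottom-left
    corner of the pair's intersection (reference point), which prevents
    duplicate results across tiles."""
    x0, y0, x1, y1 = bounds
    tw, th = (x1 - x0) / nx, (y1 - y0) / ny

    def tile_range(lo, hi, orig, step, n):
        a = max(0, min(n - 1, int((lo - orig) // step)))
        b = max(0, min(n - 1, int((hi - orig) // step)))
        return range(a, b + 1)

    # replicate each rectangle into every tile it overlaps
    part_R, part_S = {}, {}
    for part, rects in ((part_R, R), (part_S, S)):
        for r in rects:
            for ix in tile_range(r[0], r[2], x0, tw, nx):
                for iy in tile_range(r[1], r[3], y0, th, ny):
                    part.setdefault((ix, iy), []).append(r)

    results = []
    for (ix, iy), rs in part_R.items():
        for r in rs:
            for s in part_S.get((ix, iy), []):
                if overlaps(r, s):
                    # reference-point test: report only in the tile that
                    # holds the bottom-left corner of the intersection
                    px, py = max(r[0], s[0]), max(r[1], s[1])
                    rx = max(0, min(nx - 1, int((px - x0) // tw)))
                    ry = max(0, min(ny - 1, int((py - y0) // th)))
                    if (rx, ry) == (ix, iy):
                        results.append((r, s))
    return results
```

    Without the reference-point test, a pair of rectangles that straddles all four tiles of a 2x2 grid would be reported four times; with it, the pair is reported once, and each tile's join can run on a separate thread without a deduplication step.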

    41P. Practical Lessons Learned while Developing Web 2.0 Collaboration Services for Communities of Practice

    Although a plethora of Web 2.0 applications exist today, there is little literature reporting on experiences, concrete recommendations or best practices when developing such applications. The scarcity of such records makes it difficult for developers to determine how best to support the practices of communities with the use of Web 2.0 technologies. In this paper, we report on eight practical lessons learned while developing Web 2.0 collaboration services for Communities of Practice in the framework of a three-year European research project on Technology Enhanced Learning. The main objective of this project was to investigate how Web 2.0 technologies could impact the communication and collaboration needs of Communities of Practice interacting online and, conversely, how new interaction needs could impact Web 2.0 technology. The above lessons are presented in a way that could aid people engaged in various phases of the development of Web-based collaboration support services.

    Web Observatories: Concepts, State Of The Art & Beyond

    Web Observatories are becoming a common Internet practice. They are web sites targeting a community of practitioners, scientists or, more generally, individuals within the context of a focused organization. Their goal is to inform, educate, facilitate the interaction and boost the collaboration of community members. Various existing technologies can be deployed for this purpose. Still, their integration into a coherent informational and collaborative environment remains largely ad hoc. In this paper we attempt to elucidate the concept of a Web Observatory and identify its characteristics and practices.

    Two-layer Space-oriented Partitioning for Non-point Data

    Non-point spatial objects (e.g., polygons, linestrings, etc.) are ubiquitous. We study the problem of indexing non-point objects in memory for range queries and spatial intersection joins. We propose a secondary partitioning technique for space-oriented partitioning indices (e.g., grids), which improves their performance significantly, by avoiding the generation and elimination of duplicate results. Our approach is easy to implement and can be used by any space-partitioning index to significantly reduce the cost of range queries and intersection joins. In addition, the secondary partitions can be processed independently, which makes our method appropriate for distributed and parallel indexing. Experiments on real datasets confirm the advantage of our approach against alternative duplicate elimination techniques and data-oriented state-of-the-art spatial indices. We also show that our partitioning technique, paired with optimized partition-to-partition join algorithms, typically reduces the cost of spatial joins by around 50%.
    Comment: To appear in the IEEE Transactions on Knowledge and Data Engineering
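
    The secondary-partitioning idea can be sketched for a uniform grid. In this simplified version (the exact class names and probing rules below are my reading of the general technique, not the paper's code), each replicated rectangle is placed, inside its tile, into one of four classes depending on whether it begins in that tile on each axis; a range query then probes only the classes that can produce a first-time result in each tile, so duplicates are avoided without a post-processing step.

```python
def overlaps(a, b):
    # rectangles as (x1, y1, x2, y2); inclusive intersection test
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def build_two_layer(rects, nx, ny, bounds):
    """Two-layer grid sketch: class 0 holds rectangles that begin in the
    tile on both axes, class 1 those spilling in from the left, class 2
    those spilling in from below, class 3 those spilling in on both axes."""
    x0, y0, x1, y1 = bounds
    tw, th = (x1 - x0) / nx, (y1 - y0) / ny

    def span(lo, hi, orig, step, n):
        a = max(0, min(n - 1, int((lo - orig) // step)))
        b = max(0, min(n - 1, int((hi - orig) // step)))
        return a, b

    tiles = {}
    for r in rects:
        ax0, ax1 = span(r[0], r[2], x0, tw, nx)
        ay0, ay1 = span(r[1], r[3], y0, th, ny)
        for ix in range(ax0, ax1 + 1):
            for iy in range(ay0, ay1 + 1):
                cls = (ix > ax0) + 2 * (iy > ay0)
                tiles.setdefault((ix, iy), ([], [], [], []))[cls].append(r)
    return tiles, (x0, y0, tw, th, nx, ny)

def range_query(index, q):
    """Report each stored rectangle intersecting q exactly once: spilled
    classes are probed only in the query's leftmost column / bottom row,
    where the rectangle is seen for the first time."""
    tiles, (x0, y0, tw, th, nx, ny) = index
    qx0 = max(0, min(nx - 1, int((q[0] - x0) // tw)))
    qx1 = max(0, min(nx - 1, int((q[2] - x0) // tw)))
    qy0 = max(0, min(ny - 1, int((q[1] - y0) // th)))
    qy1 = max(0, min(ny - 1, int((q[3] - y0) // th)))
    out = []
    for ix in range(qx0, qx1 + 1):
        for iy in range(qy0, qy1 + 1):
            classes = tiles.get((ix, iy))
            if not classes:
                continue
            probe = [0] + ([1] if ix == qx0 else []) \
                        + ([2] if iy == qy0 else []) \
                        + ([3] if ix == qx0 and iy == qy0 else [])
            for c in probe:
                out.extend(r for r in classes[c] if overlaps(r, q))
    return out
```

    A rectangle replicated across four tiles is found once whether the query covers all its tiles (via class 0 in its first tile) or only later tiles (via the spilled class in the query's first covered tile), so the partitions can also be queried in parallel with no duplicate elimination.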

    The dicode workbench: A flexible framework for the integration of information and web services

    Aiming to address requirements concerning integration of services in the context of "big data", this paper presents an innovative approach that (i) ensures a flexible, adaptable and scalable information and computation infrastructure, and (ii) exploits the competences of stakeholders and information workers to meaningfully confront information management issues such as information characterization, classification and interpretation, thus incorporating the underlying collective intelligence. Our approach pays much attention to the issues of usability and ease-of-use, not requiring any particular programming expertise from the end users. We report on a series of technical issues concerning the desired flexibility of the proposed integration framework and we provide related recommendations to developers of such solutions. Evaluation results are also discussed.