Privacy Preservation by Disassociation
In this work, we focus on protection against identity disclosure in the
publication of sparse multidimensional data. Existing multidimensional
anonymization techniques (a) protect the privacy of users either by altering the
set of quasi-identifiers of the original data (e.g., by generalization or
suppression) or by adding noise (e.g., using differential privacy) and/or (b)
assume a clear distinction between sensitive and non-sensitive information and
sever the possible linkage between them. In many real-world applications, the above
techniques are not applicable. For instance, consider web search query logs.
Suppressing or generalizing anonymization methods would remove the most
valuable information in the dataset: the original query terms. Additionally,
web search query logs contain millions of query terms which cannot be
categorized as sensitive or non-sensitive since a term may be sensitive for a
user and non-sensitive for another. Motivated by this observation, we propose
an anonymization technique termed disassociation that preserves the original
terms but hides the fact that two or more different terms appear in the same
record. We protect the users' privacy by disassociating record terms that
participate in identifying combinations. This way the adversary cannot
associate with high probability a record with a rare combination of terms. To
the best of our knowledge, our proposal is the first to employ such a technique
to provide protection against identity disclosure. We propose an anonymization
algorithm based on our approach and evaluate its performance on real and
synthetic datasets, comparing it against other state-of-the-art methods based
on generalization and differential privacy.
Comment: VLDB201
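The core idea of disassociation can be illustrated with a toy sketch: split each record's terms into separately published chunks so that no chunk contains a combination of terms that occurs in fewer than k records. The function below is a hypothetical simplification (pairs only, greedy chunk assignment), not the paper's actual algorithm.

```python
from itertools import combinations
from collections import Counter

def disassociate(records, k=2):
    """Toy sketch of disassociation: split each record's terms into
    chunks so that no published chunk contains a pair of terms that
    co-occurs in fewer than k records (a rare, potentially
    identifying combination). Greedy and pairwise only; the actual
    algorithm in the paper is more sophisticated."""
    # Count how often each unordered pair of terms co-occurs.
    pair_count = Counter()
    for rec in records:
        for pair in combinations(sorted(rec), 2):
            pair_count[pair] += 1

    anonymized = []
    for rec in records:
        chunks = []
        for term in sorted(rec):
            # Place the term in the first chunk where every pair it
            # forms is frequent enough; otherwise start a new chunk.
            for chunk in chunks:
                if all(pair_count[tuple(sorted((term, t)))] >= k
                       for t in chunk):
                    chunk.append(term)
                    break
            else:
                chunks.append([term])
        anonymized.append(chunks)
    return anonymized
```

For example, with k=2 the frequent pair ("aspirin", "flu") stays together, while the rare pair ("flu", "rehab") is split across chunks, so an adversary cannot link the two terms to one record.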
Features of recording practices and communication during nursing handover: a cluster analysis
Objective: To record and identify the characteristics of nursing handovers in a tertiary hospital. Method: Observational study. Twenty-two nurses participated in 11 nursing handovers in 2015/16, using a recorded audio system and an unstructured observation form. Hierarchical cluster analysis was performed. Results: Thirty characteristics were identified. The nursing handovers were based on the clinical status of patients, and all nurses possessed specialized scientific knowledge specific to the clinical environment. The information used was not based on nursing diagnoses and was not in accordance with best nursing clinical practice. The following four clusters emerged among the 30 characteristics: 1) the use of evidence-based nursing practice, 2) the non-use of evidence-based nursing practice and its correlation with a strained psychological environment, 3) patient management and the clinical skills/knowledge of nurses, and 4) handover content, quality of information transferred and specialization. Conclusion: Multiple characteristics were observed. The majority of characteristics were grouped based on common features, and four main clusters emerged. The investigation and understanding of structural relations between these characteristics and their respective clusters may lead to an improvement in the quality of nursing health care services.
In-Memory Interval Joins
The interval join is a popular operation in temporal, spatial, and uncertain databases. The majority of interval join algorithms assume that input data reside on disk, so their focus is to minimize I/O accesses. Recently, an in-memory approach based on plane sweep (PS) for modern hardware was proposed, which greatly outperforms previous work. However, this approach relies on a complex data structure, and its parallelization has not been adequately studied. In this article, we investigate in-memory interval joins in two directions. First, we explore the applicability of a largely ignored forward-scan (FS) based plane sweep algorithm for single-threaded join evaluation. We propose four optimizations for FS that greatly reduce its cost, making it competitive with or even faster than the state-of-the-art. Second, we study in depth the parallel computation of interval joins. We design a non-partitioning-based approach that determines independent tasks of the join algorithm to run in parallel. Then, we address the drawbacks of the previously proposed hash-based partitioning and suggest a domain-based partitioning approach that does not produce duplicate results. Within our approach, we propose a novel breakdown of the partition-joins into mini-joins to be scheduled in the available CPU threads, and we propose an adaptive domain partitioning aiming at load balancing. We also investigate how the partitioning phase can benefit from modern parallel hardware. Our thorough experimental analysis demonstrates the advantage of our novel partitioning-based approach for parallel computation.
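The basic forward-scan plane sweep can be sketched as follows: sort both inputs by start point; at each step, take the input whose current interval starts first and scan forward in the other input, reporting every interval that begins before it ends. This is a minimal unoptimized sketch, without the four optimizations proposed in the article.

```python
def fs_interval_join(R, S):
    """Minimal sketch of a forward-scan (FS) plane-sweep interval
    join. R and S are lists of (start, end) tuples with start <= end;
    returns every overlapping pair (r, s) exactly once."""
    R, S = sorted(R), sorted(S)
    result = []
    i = j = 0
    while i < len(R) and j < len(S):
        if R[i][0] <= S[j][0]:
            # R[i] starts first: pair it with every S interval that
            # begins before R[i] ends, then advance in R.
            k = j
            while k < len(S) and S[k][0] <= R[i][1]:
                result.append((R[i], S[k]))
                k += 1
            i += 1
        else:
            # Symmetric case: S[j] starts first, scan forward in R.
            k = i
            while k < len(R) and R[k][0] <= S[j][1]:
                result.append((R[k], S[j]))
                k += 1
            j += 1
    return result
```

Each pair is produced by whichever interval starts first, so no duplicate elimination is needed; the cost of repeatedly re-scanning the forward ranges is exactly what the article's optimizations target.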
Parallel In-Memory Evaluation of Spatial Joins
The spatial join is a popular operation in spatial database systems and its
evaluation is a well-studied problem. As main memories become bigger and faster
and commodity hardware supports parallel processing, there is a need to revamp
classic join algorithms which have been designed for I/O-bound processing. In
view of this, we study the in-memory and parallel evaluation of spatial joins,
by re-designing a classic partitioning-based algorithm to consider alternative
approaches for space partitioning. Our study shows that, compared to a
straightforward implementation of the algorithm, our tuning can improve
performance significantly. We also show how to select appropriate partitioning
parameters based on data statistics, in order to tune the algorithm for the
given join inputs. Our parallel implementation scales gracefully with the
number of threads, reducing the cost of the join to at most one second even for
join inputs with tens of millions of rectangles.
Comment: Extended version of the SIGSPATIAL'19 paper under the same title
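The classic partitioning-based scheme referred to above can be sketched in a few lines: replicate each rectangle to every grid cell it overlaps, join within each cell, and report a pair only in the cell containing the bottom-left corner of the two rectangles' intersection (the reference-point trick), so replication never produces duplicates. All names, the grid resolution, and the nested-loop per-cell join are illustrative simplifications.

```python
def grid_spatial_join(R, S, tiles=4, extent=100.0):
    """Illustrative partitioning-based spatial join. Rectangles are
    (xmin, ymin, xmax, ymax) inside [0, extent)^2; the space is cut
    into tiles x tiles cells, rectangles are replicated to every cell
    they overlap, and a qualifying pair is reported only in the cell
    holding the bottom-left corner of their intersection."""
    cell = extent / tiles

    def cells_of(r):
        x0 = int(r[0] // cell); x1 = int(min(r[2], extent - 1e-9) // cell)
        y0 = int(r[1] // cell); y1 = int(min(r[3], extent - 1e-9) // cell)
        for cx in range(x0, x1 + 1):
            for cy in range(y0, y1 + 1):
                yield (cx, cy)

    # Partition both inputs, replicating multi-cell rectangles.
    parts = {}
    for side, data in (("R", R), ("S", S)):
        for r in data:
            for c in cells_of(r):
                parts.setdefault(c, {"R": [], "S": []})[side].append(r)

    result = []
    for (cx, cy), p in parts.items():
        for r in p["R"]:
            for s in p["S"]:
                # Rectangle intersection test.
                if (r[0] <= s[2] and s[0] <= r[2] and
                        r[1] <= s[3] and s[1] <= r[3]):
                    # Reference point: bottom-left of the intersection.
                    rx, ry = max(r[0], s[0]), max(r[1], s[1])
                    if int(rx // cell) == cx and int(ry // cell) == cy:
                        result.append((r, s))
    return result
```

Each per-cell join is independent, which is what makes the scheme a natural fit for the parallel evaluation studied in the paper.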
Practical Lessons Learned while Developing Web 2.0 Collaboration Services for Communities of Practice
Although a plethora of Web 2.0 applications exist today, there is little literature reporting on experiences, concrete recommendations or best practices when developing such applications. The scarcity of such records makes it difficult for developers to determine how best to support the practices of communities with the use of Web 2.0 technologies. In this paper, we report on eight practical lessons learned while developing Web 2.0 collaboration services for Communities of Practice in the framework of a three-year European research project on Technology Enhanced Learning. The main objective of this project was to investigate how Web 2.0 technologies could impact the communication and collaboration needs of Communities of Practice interacting online and, conversely, how new interaction needs could impact Web 2.0 technology. The above lessons are presented in a way that could aid people engaged in various phases of the development of Web-based collaboration support services.
Web Observatories: Concepts, State Of The Art & Beyond
Web observatories are becoming a common Internet practice. They are web sites targeting a community of practitioners, scientists or, more generally, individuals within the context of a focused organization. Their goal is to inform, educate, facilitate the interaction and boost the collaboration of community members. Various existing technologies can be deployed for this purpose. Still, their integration into a coherent informational and collaborative environment remains largely ad hoc. In this paper, we attempt to elucidate the concept of the web observatory and identify its characteristics and practices.
Two-layer Space-oriented Partitioning for Non-point Data
Non-point spatial objects (e.g., polygons, linestrings, etc.) are ubiquitous.
We study the problem of indexing non-point objects in memory for range queries
and spatial intersection joins. We propose a secondary partitioning technique
for space-oriented partitioning indices (e.g., grids), which improves their
performance significantly, by avoiding the generation and elimination of
duplicate results. Our approach is easy to implement and can be used by any
space-partitioning index to significantly reduce the cost of range queries and
intersection joins. In addition, the secondary partitions can be processed
independently, which makes our method appropriate for distributed and parallel
indexing. Experiments on real datasets confirm the advantage of our approach
against alternative duplicate elimination techniques and data-oriented
state-of-the-art spatial indices. We also show that our partitioning technique,
paired with optimized partition-to-partition join algorithms, typically reduces
the cost of spatial joins by around 50%.
Comment: To appear in the IEEE Transactions on Knowledge and Data Engineering
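The duplicate-free two-layer idea can be conveyed with a simplified 1-D sketch: within each primary partition, objects are kept in two secondary classes, originals (objects that begin in the partition) and replicas (objects extending in from an earlier partition), and only class combinations that can contain each result pair once are joined. The function below is a hypothetical 1-D reduction of the technique, not the paper's 2-D implementation.

```python
def two_layer_interval_join(R, S, parts=4, extent=100.0):
    """Illustrative 1-D sketch of two-layer partitioning. Intervals
    (start, end) in [0, extent) are space-partitioned; each partition
    keeps originals (intervals starting there) and replicas (intervals
    extending in from the left). Joining originals(R) x all(S) plus
    originals(S) x replicas(R) yields every overlapping pair exactly
    once, with no duplicate elimination step."""
    w = extent / parts
    O_R = [[] for _ in range(parts)]; Rep_R = [[] for _ in range(parts)]
    O_S = [[] for _ in range(parts)]; Rep_S = [[] for _ in range(parts)]
    for data, orig, rep in ((R, O_R, Rep_R), (S, O_S, Rep_S)):
        for iv in data:
            first = int(iv[0] // w)
            last = int(min(iv[1], extent - 1e-9) // w)
            orig[first].append(iv)           # original in its first partition
            for p in range(first + 1, last + 1):
                rep[p].append(iv)            # replica everywhere else

    def overlap(a, b):
        return a[0] <= b[1] and b[0] <= a[1]

    result = []
    for p in range(parts):
        # A pair is found only in the partition where the later-starting
        # interval is an original, so replicas(R) x replicas(S) is skipped.
        for r in O_R[p]:
            for s in O_S[p] + Rep_S[p]:
                if overlap(r, s):
                    result.append((r, s))
        for s in O_S[p]:
            for r in Rep_R[p]:
                if overlap(r, s):
                    result.append((r, s))
    return result
```

Because replica-replica combinations are never joined, each pair surfaces in exactly one partition, and the partitions remain fully independent, which is why the scheme suits distributed and parallel indexing.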
The dicode workbench: A flexible framework for the integration of information and web services
Aiming to address requirements concerning the integration of services in the context of "big data", this paper presents an innovative approach that (i) ensures a flexible, adaptable and scalable information and computation infrastructure, and (ii) exploits the competences of stakeholders and information workers to meaningfully confront information management issues such as information characterization, classification and interpretation, thus incorporating the underlying collective intelligence. Our approach pays much attention to the issues of usability and ease of use, not requiring any particular programming expertise from the end users. We report on a series of technical issues concerning the desired flexibility of the proposed integration framework, and we provide related recommendations to developers of such solutions. Evaluation results are also discussed.