8,080 research outputs found
Privacy-Preserving and Outsourced Multi-User k-Means Clustering
Many techniques for privacy-preserving data mining (PPDM) have been
investigated over the past decade. Often, the entities involved in the data
mining process are end-users or organizations with limited computing and
storage resources. As a result, such entities may want to refrain from
participating in the PPDM process. To overcome this issue and to take many
other benefits of cloud computing, outsourcing PPDM tasks to the cloud
environment has recently gained special attention. We consider the scenario
where n entities outsource their databases (in encrypted format) to the cloud
and ask the cloud to perform the clustering task on their combined data in a
privacy-preserving manner. We term such a process as privacy-preserving and
outsourced distributed clustering (PPODC). In this paper, we propose a novel
and efficient solution to the PPODC problem based on k-means clustering
algorithm. The main novelty of our solution lies in avoiding the secure
division operations required in computing cluster centers altogether through an
efficient transformation technique. Our solution builds the clusters securely
in an iterative fashion and returns the final cluster centers to all entities
when a pre-determined termination condition holds. The proposed solution
protects data confidentiality of all the participating entities under the
standard semi-honest model. To the best of our knowledge, ours is the first
work to discuss and propose a comprehensive solution to the PPODC problem that
incurs negligible cost on the participating entities. We theoretically estimate
both the computation and communication costs of the proposed protocol and also
demonstrate its practical value through experiments on a real dataset.Comment: 16 pages, 2 figures, 5 table
Privacy Preserving Utility Mining: A Survey
In big data era, the collected data usually contains rich information and
hidden knowledge. Utility-oriented pattern mining and analytics have shown a
powerful ability to explore these ubiquitous data, which may be collected from
various fields and applications, such as market basket analysis, retail,
click-stream analysis, medical analysis, and bioinformatics. However, analysis
of these data with sensitive private information raises privacy concerns. To
achieve better trade-off between utility maximizing and privacy preserving,
Privacy-Preserving Utility Mining (PPUM) has become a critical issue in recent
years. In this paper, we provide a comprehensive overview of PPUM. We first
present the background of utility mining, privacy-preserving data mining and
PPUM, then introduce the related preliminaries and problem formulation of PPUM,
as well as some key evaluation criteria for PPUM. In particular, we present and
discuss the current state-of-the-art PPUM algorithms, as well as their
advantages and deficiencies in detail. Finally, we highlight and discuss some
technical challenges and open directions for future research on PPUM.Comment: 2018 IEEE International Conference on Big Data, 10 page
Protection of big data privacy
In recent years, big data have become a hot research topic. The increasing amount of big data also increases the chance of breaching the privacy of individuals. Since big data require high computational power and large storage, distributed systems are used. As multiple parties are involved in these systems, the risk of privacy violation is increased. There have been a number of privacy-preserving mechanisms developed for privacy protection at different stages (e.g., data generation, data storage, and data processing) of a big data life cycle. The goal of this paper is to provide a comprehensive overview of the privacy preservation mechanisms in big data and present the challenges for existing mechanisms. In particular, in this paper, we illustrate the infrastructure of big data and the state-of-the-art privacy-preserving mechanisms in each stage of the big data life cycle. Furthermore, we discuss the challenges and future research directions related to privacy preservation in big data
Privacy-enhancing Aggregation of Internet of Things Data via Sensors Grouping
Big data collection practices using Internet of Things (IoT) pervasive
technologies are often privacy-intrusive and result in surveillance, profiling,
and discriminatory actions over citizens that in turn undermine the
participation of citizens to the development of sustainable smart cities.
Nevertheless, real-time data analytics and aggregate information from IoT
devices open up tremendous opportunities for managing smart city
infrastructures. The privacy-enhancing aggregation of distributed sensor data,
such as residential energy consumption or traffic information, is the research
focus of this paper. Citizens have the option to choose their privacy level by
reducing the quality of the shared data at a cost of a lower accuracy in data
analytics services. A baseline scenario is considered in which IoT sensor data
are shared directly with an untrustworthy central aggregator. A grouping
mechanism is introduced that improves privacy by sharing data aggregated first
at a group level compared as opposed to sharing data directly to the central
aggregator. Group-level aggregation obfuscates sensor data of individuals, in a
similar fashion as differential privacy and homomorphic encryption schemes,
thus inference of privacy-sensitive information from single sensors becomes
computationally harder compared to the baseline scenario. The proposed system
is evaluated using real-world data from two smart city pilot projects. Privacy
under grouping increases, while preserving the accuracy of the baseline
scenario. Intra-group influences of privacy by one group member on the other
ones are measured and fairness on privacy is found to be maximized between
group members with similar privacy choices. Several grouping strategies are
compared. Grouping by proximity of privacy choices provides the highest privacy
gains. The implications of the strategy on the design of incentives mechanisms
are discussed
- …