4,880 research outputs found

On Utilizing Association and Interaction Concepts for Enhancing Microaggregation in Secure Statistical Databases

Author: Fayyoumi Ebaa
Oommen B. John
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

This paper presents a possibly pioneering endeavor to tackle microaggregation techniques (MATs) in secure statistical databases by resorting to the principles of associative neural networks (NNs). The prior art has improved the available solutions to the MAT by incorporating proximity information; it does so by recursively reducing the size of the data set, excluding points that are farthest from the centroid and points that are closest to these farthest points. Thus, although the method is extremely effective, arguably, it uses only the proximity information while ignoring the mutual interaction between the records. In this paper, we argue that interrecord relationships can be quantified in terms of the following two entities: 1) their "association" and 2) their "interaction." This means that records that are not necessarily close to each other may still be "grouped," because their mutual interaction, which is quantified by invoking transitive-closure-like operations on the latter entity, could be significant, as suggested by the theoretically sound principles of NNs. By repeatedly invoking the interrecord associations and interactions, the records are grouped into groups of cardinality "k," where k is the security parameter in the algorithm. Our experimental results, which are obtained on artificial data and benchmark real-life data sets, demonstrate that the newly proposed method is superior to the state of the art not only from the information loss (IL) perspective but also with respect to a criterion that combines the IL and the disclosure risk (DR).
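The proximity-only prior art that this abstract describes (repeatedly find the record farthest from the centroid, group it with its nearest records, and remove them) can be sketched as below. This is a simplified illustration in the spirit of MDAV-style fixed-size microaggregation, not the authors' association/interaction method; the function names (`microaggregate`, `anonymize`, `information_loss`) and the SSE/SST definition of information loss are assumptions made for the example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(rows):
    """Component-wise mean of a non-empty list of numeric tuples."""
    n = len(rows)
    return tuple(sum(col) / n for col in zip(*rows))


def microaggregate(records, k):
    """Group record indices into cells of at least k (assumes len(records) >= k).

    Simplified proximity-only heuristic: take the record farthest from the
    centroid of the remaining data, group it with its k-1 nearest neighbours,
    remove the group, and repeat. Leftovers (fewer than 2k) form the last group.
    """
    remaining = list(range(len(records)))
    groups = []
    while len(remaining) >= 2 * k:
        c = centroid([records[i] for i in remaining])
        far = max(remaining, key=lambda i: dist(records[i], c))
        # Sort by distance to the extreme record; it sorts first (distance 0).
        remaining.sort(key=lambda i: dist(records[i], records[far]))
        groups.append(remaining[:k])
        remaining = remaining[k:]
    if remaining:
        groups.append(remaining)
    return groups


def anonymize(records, groups):
    """Replace every record by the centroid of its group."""
    out = list(records)
    for g in groups:
        c = centroid([records[i] for i in g])
        for i in g:
            out[i] = c
    return out


def information_loss(records, released):
    """Assumed IL measure: within-group squared error over total squared error."""
    c = centroid(records)
    sse = sum(dist(r, a) ** 2 for r, a in zip(records, released))
    sst = sum(dist(r, c) ** 2 for r in records)
    return sse / sst if sst else 0.0


records = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
groups = microaggregate(records, k=3)
released = anonymize(records, groups)
il = information_loss(records, released)
```

Because each outer iteration scans all remaining records, this kind of heuristic scales roughly quadratically in the number of records, and it never looks at anything other than distances, which is the limitation the abstract points to.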

Agder University Research Archive

Mathematically optimized, recursive prepartitioning strategies for k-anonymous microaggregation of large-scale datasets

Author: Estrada Jiménez José Antonio
Forné Muñoz Jorge
Mezher Ahmad Mohamad
Pallarès Segarra Esteve
Rebollo-Monedero David
Rodríguez Hoyos Ana Fernanda
Publication venue: 'Elsevier BV'
Publication date: 11/11/2019

© Elsevier. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ The technical contents of this work fall within the statistical disclosure control (SDC) field, which concerns the postprocessing of the demographic portion of the statistical results of surveys containing sensitive personal information, in order to effectively safeguard the anonymity of the participating respondents. A widely known technique to solve the problem of protecting the privacy of the respondents involved beyond the mere suppression of their identifiers is k-anonymous microaggregation. Unfortunately, most microaggregation algorithms that produce competitively low levels of distortion exhibit a superlinear running time, typically scaling with the square of the number of records in the dataset. This work proposes and analyzes an optimized prepartitioning strategy to reduce significantly the running time of the k-anonymous microaggregation algorithm operating on large datasets, with mild loss in data utility with respect to that of MDAV, the underlying method. The optimization strategy is based on prepartitioning a dataset recursively until the desired k-anonymity parameter is achieved. Traditional microaggregation algorithms have quadratic computational complexity in the form T(n^2). By using the proposed method and fixing the number of recurrent prepartitions we obtain subquadratic complexity in the form T(n^(3/2)), T(n^(4/3)), ..., depending on the number of prepartitions. Alternatively, fixing the ratio between the size of the microcell and the macrocell on each prepartition, quasilinear complexity in the form T(n log n) is achieved. Our method is readily applicable to large-scale datasets with numerical demographic attributes. Peer Reviewed. Postprint (author's final draft).
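A back-of-the-envelope calculation suggests where the exponent 3/2 for a single prepartitioning level can come from. It is a sketch under an assumed cost model, not the paper's own derivation: assume a quadratic microaggregation algorithm forming cells of size $s$ from $n$ records costs about $c\,n^2/s$ (roughly $n/s$ iterations, each scanning $O(n)$ records), and let $K$ be a hypothetical macrocell size used for prepartitioning.

```latex
% Direct microaggregation into cells of size k (assumed cost model):
C_{\text{direct}}(n) \approx c\,\frac{n^{2}}{k}

% One prepartition into macrocells of size K, then microaggregation
% with parameter k inside each of the n/K macrocells:
C_{\text{pre}}(n, K) \approx c\,\frac{n^{2}}{K}
  + \frac{n}{K}\cdot c\,\frac{K^{2}}{k}
  = c\left(\frac{n^{2}}{K} + \frac{nK}{k}\right)

% Minimising over K:
\frac{\partial C_{\text{pre}}}{\partial K}
  = c\left(-\frac{n^{2}}{K^{2}} + \frac{n}{k}\right) = 0
  \;\Longrightarrow\; K^{\ast} = \sqrt{nk},
\qquad
C_{\text{pre}}(n, K^{\ast}) \approx \frac{2c}{\sqrt{k}}\,n^{3/2}
```

Under the same assumptions, for fixed $k$ the cost drops from order $n^2$ to order $n^{3/2}$, in line with the first subquadratic figure quoted in the abstract.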

A systematic overview on methods to protect sensitive data provided for various analyses

Author: Sariyar Murat
Templ Matthias
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

In view of the various methodological developments regarding the protection of sensitive data, especially with respect to privacy-preserving computation and federated learning, a conceptual categorization and comparison between various methods stemming from different fields is often desired. More concretely, it is important to provide guidance for practice, which lacks an overview of suitable approaches for certain scenarios, whether it is differential privacy for interactive queries, k-anonymity methods and synthetic data generation for data publishing, or secure federated analysis for multiparty computation without sharing the data itself. Here, we provide an overview based on central criteria describing a context for privacy-preserving data handling, which allows informed decisions in view of the many alternatives. Besides guiding practice, this categorization of concepts and methods is intended as a step towards a comprehensive ontology for anonymization. We emphasize throughout the paper that there is no panacea and that context matters.

Berner Fachhochschule: ARBOR

Data Mining: The Next Generation

Author: Agrawal Rakesh
Bollinger Toni
Clifton Christopher W.
Dzeroski Saso
Freytag Johann-Christoph
Hipp Jochen
Keim Daniel
Kramer Stefan
Kriegel Hans-Peter
Leser Ulf
Liu Bing
Mannila Heikki
Meo Rosa
Morishita Shinichi
Ng Raymond
Pei Jian
Raghavan Prabhakar
Ramakrishnan Raghu
Spiliopoulou Myra
Srivastava Jaideep
Torra Vicenc
Publication venue: Dagstuhl Seminar Proceedings. 04292 - Perspectives Workshop: Data Mining: The Next Generation
Publication date: 01/01/2005

Dagstuhl Research Online Publication Server

Legal approaches of the healthgrid technology

Author: Herveg Jean
Poullet Yves
Publication venue: s.n.
Publication date: 01/01/2004

Privacy preservation in e-health cloud: Taxonomy, privacy requirements, feasibility analysis, and opportunities

Author: Anjum Adeel
Kanwal Tehsin
Khan Abid
Publication venue
Publication date: 01/03/2021

The Healthgrid White Paper

Author: Breton Vincent
Dean K.
Solomonides T.
Publication venue: 'IOS Press'
Publication date: 07/04/2005

HAL Clermont Université

New Fundamental Technologies in Data Mining

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021

The progress of data mining technology and its wide public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to covering each section in depth, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.

Private Graph Data Release: A Survey

Author: Li Yang
Ng Kee Siong
Purcell Michael
Rakotoarivelo Thierry
Ranbaduge Thilina
Smith David
Publication venue
Publication date: 09/07/2021

The application of graph analytics to various domains has yielded tremendous societal and economic benefits in recent years. However, the increasingly widespread adoption of graph analytics comes with a commensurate increase in the need to protect private information in graph databases, especially in light of the many privacy breaches in real-world graph data that was supposed to preserve sensitive information. This paper provides a comprehensive survey of private graph data release algorithms that seek to achieve the fine balance between privacy and utility, with a specific focus on provably private mechanisms. Many of these mechanisms fall under natural extensions of the Differential Privacy framework to graph data, but we also investigate more general privacy formulations like Pufferfish Privacy that can deal with the limitations of Differential Privacy. A wide-ranging survey of the applications of private graph data release mechanisms to social networks, finance, supply chain, health and energy is also provided. This survey paper and the taxonomy it provides should benefit practitioners and researchers alike in the increasingly important area of private graph data release and analysis.

arXiv.org e-Print Archive