2,088 research outputs found

Set-Based Adaptive Distributed Differential Evolution for Anonymity-Driven Database Fragmentation

Author: Cao Jinli
Chen Zhenxiang
Ge Yongfeng
Wang Hua
Zhang Yanchun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/08/2021

By breaking sensitive associations between attributes, database fragmentation can protect the privacy of outsourced data storage. Database fragmentation algorithms need prior knowledge of the sensitive associations in the database being fragmented, and they set this knowledge as the optimization objective. Thus, the effectiveness of these algorithms is limited by prior knowledge. Inspired by the anonymity degree measurement in anonymity techniques such as k-anonymity, an anonymity-driven database fragmentation problem is defined in this paper. For this problem, a set-based adaptive distributed differential evolution (S-ADDE) algorithm is proposed. S-ADDE adopts an island model to maintain population diversity. Two set-based operators, i.e., set-based mutation and set-based crossover, are designed in which the continuous domain in the traditional differential evolution is transferred to the discrete domain in the anonymity-driven database fragmentation problem. Moreover, in the set-based mutation operator, each individual’s mutation strategy is adaptively selected according to its performance. The experimental results demonstrate that the proposed S-ADDE is significantly better than the compared approaches. The effectiveness of the proposed operators is verified.
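To make the idea of an anonymity degree for a fragmentation concrete, here is a minimal sketch. It assumes the degree of a fragment is the size of its smallest group of rows with identical values (the usual k in k-anonymity), and that a fragmentation is scored by its weakest fragment. The abstract does not give the paper's exact objective, so the function names, the scoring rule, and the toy data are illustrative assumptions, not the S-ADDE method.

```python
from collections import Counter


def anonymity_degree(rows, attributes):
    """Size of the smallest group of rows sharing the same values on `attributes`."""
    groups = Counter(tuple(row[a] for a in attributes) for row in rows)
    return min(groups.values())


def fragmentation_anonymity(rows, fragments):
    """Assumed score: the anonymity degree of the weakest fragment."""
    return min(anonymity_degree(rows, frag) for frag in fragments)


# Hypothetical toy table: every full row is unique.
rows = [
    {"age": 30, "zip": "3000", "disease": "flu"},
    {"age": 30, "zip": "3000", "disease": "cold"},
    {"age": 41, "zip": "3011", "disease": "flu"},
    {"age": 41, "zip": "3011", "disease": "cold"},
]

# One fragment holding all attributes versus a vertical split into two fragments.
unfragmented = fragmentation_anonymity(rows, [["age", "zip", "disease"]])
fragmented = fragmentation_anonymity(rows, [["age", "zip"], ["disease"]])
print(unfragmented, fragmented)
```

In this toy table, separating `disease` from `age` and `zip` raises the score without any prior list of sensitive associations, which is the kind of objective an optimizer could search over.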

Victoria University Eprints Repository

Real-world K-Anonymity applications: The KGEN approach and its evaluation in fraudulent transactions

Author: Cascavilla Giuseppe
De Pascale Daniel
Tamburri Damian A.
Van Den Heuvel Willem-Jan
Publication date: 01/01/2023

K-Anonymity is a property for the measurement, management, and governance of data anonymization. Many implementations of k-anonymity have been described in the state of the art, but most of them are not practically usable over a large number of attributes in a “Big” dataset, i.e., a dataset drawing from Big Data. To address this significant shortcoming, we introduce and evaluate KGEN, an approach to K-anonymity featuring meta-heuristics, specifically Genetic Algorithms, to compute a permutation of the dataset which is both K-anonymized and still usable for further processing, e.g., for private-by-design analytics. KGEN promotes such a meta-heuristic approach since it can solve the problem by finding a pseudo-optimal solution in a reasonable time over a considerable load of input. KGEN allows the data manager to guarantee a high anonymity level while preserving the usability and preventing loss of information entropy over the data. Unlike other approaches that provide optimal global solutions compatible with smaller datasets, KGEN also works properly over Big datasets while still providing a good-enough K-anonymized but still processable dataset. Evaluation results show how our approach can still work efficiently on a real-world dataset, provided by the Dutch Tax Authority, with 47 attributes (i.e., the columns of the dataset to be anonymized) and over 1.5K observations (i.e., the rows of that dataset), as well as on a dataset with 97 attributes and over 3942 observations.

Archivio istituzionale della ricerca - Politecnico di Milano

Pure OAI Repository

Should We Learn Probabilistic Models for Model Checking? A New Approach and An Empirical Study

Author: A Bauer
A Bianco
A Itai
A Mizera
C Baier
C de la Higuera
C Kermorvant
C Rohr
D Angluin
D Ron
D Tabakov
EM Clarke
F He
G Norman
HL Younes
HLS Younes
I Shmulevich
JH Holland
K Havelund
K Sen
L Helmink
M Kwiatkowska
MK Reiter
RC Carrasco
T Brázdil
T Herman
Y Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Many automated system analysis techniques (e.g., model checking, model-based testing) rely on first obtaining a model of the system under analysis. System modeling is often done manually, which is often considered a hindrance to adopting model-based system analysis and development techniques. To overcome this problem, researchers have proposed to automatically "learn" models based on sample system executions and shown that the learned models can sometimes be useful. There are, however, many questions to be answered. For instance, how much shall we generalize from the observed samples and how fast would learning converge? Or, would the analysis result based on the learned model be more accurate than the estimation we could have obtained by sampling many system executions within the same amount of time? In this work, we investigate existing algorithms for learning probabilistic models for model checking, propose an evolution-based approach for better controlling the degree of generalization, and conduct an empirical study in order to answer the questions. One of our findings is that the effectiveness of learning may sometimes be limited. Comment: 15 pages, plus 2 reference pages, accepted by FASE 2017 in ETAPS
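The contrast between learning a model from sample executions and estimating a property directly from the same samples can be sketched as follows. This is plain frequency estimation of a discrete-time Markov chain, with no generalization step; it is not the evolution-based approach the abstract proposes, nor any specific algorithm the paper evaluates. The state names, traces, and function names are hypothetical.

```python
from collections import Counter, defaultdict


def learn_dtmc(traces):
    """Estimate transition probabilities as observed relative frequencies."""
    counts = defaultdict(Counter)
    for trace in traces:
        for src, dst in zip(trace, trace[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(c.values()) for dst, n in c.items()}
        for src, c in counts.items()
    }


def sample_estimate(traces, target):
    """Direct estimate: fraction of sampled executions that visit `target`."""
    return sum(target in trace for trace in traces) / len(traces)


# Hypothetical sample executions of a small system.
traces = [
    ["s0", "s1", "ok"],
    ["s0", "s1", "fail"],
    ["s0", "s1", "ok"],
    ["s0", "ok"],
]

model = learn_dtmc(traces)
# Probability of reaching "fail" according to the learned model (single path s0 -> s1 -> fail).
model_estimate = model["s0"]["s1"] * model["s1"]["fail"]
direct_estimate = sample_estimate(traces, "fail")
print(model_estimate, direct_estimate)
```

Without generalization, the learned model here only restates what direct sampling already gives for this property, which is why the degree of generalization is the interesting design choice.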

arXiv.org e-Print Archive

Crossref

Institutional Knowledge at Singapore Management University

Open Repository and Bibliography - Luxembourg

Identity Disclosure Protection: A Data Reconstruction Approach for Preserving Privacy in Data Mining

Author: Li Xiao-Bai
Wu Shuning
Zhu Dan
Publication venue: AIS Electronic Library (AISeL)
Publication date: 31/12/2007

AIS Electronic Library (AISeL)