2 research outputs found

Approximate Inverse Frequent Itemset Mining: Privacy, Complexity, and Approximation

Authors: Wang Yongge, Wu Xintao
Publication date: 23/07/2012

In order to generate synthetic basket data sets for better benchmark testing, it is important to integrate characteristics from real-life databases into the synthetic basket data sets. The characteristics that could be used for this purpose include the frequent itemsets and association rules. The problem of generating synthetic basket data sets from frequent itemsets is generally referred to as inverse frequent itemset mining. In this paper, we show that the problem of approximate inverse frequent itemset mining is NP-complete. Then we propose and analyze an approximate algorithm for approximate inverse frequent itemset mining, and discuss privacy issues related to the synthetic basket data set. In particular, we propose an approximate algorithm to determine the privacy leakage in a synthetic basket data set.

arXiv.org e-Print Archive

Privacy preserving data generation for database application performance testing

Authors: Xintao Wu, Yongge Wang, Yuliang Zheng
Publication venue: Springer-Verlag
Publication date: 01/01/2004

Synthetic data plays an important role in software testing. In this paper, we initiate the study of synthetic data generation models for the purpose of application software performance testing. In particular, we will discuss models for protecting privacy in synthetic data generation. Within this model, we investigate the feasibility and techniques for privacy preserving synthetic database generation that can be used for database application performance testing. The methodologies that we will present will be useful for general privacy preserving software performance testing.

CiteSeerX