2 research outputs found
Approximate Inverse Frequent Itemset Mining: Privacy, Complexity, and Approximation
In order to generate synthetic basket data sets for better benchmark testing,
it is important to integrate characteristics from real-life databases into the
synthetic basket data sets. The characteristics that could be used for this
purpose include the frequent itemsets and association rules. The problem of
generating synthetic basket data sets from frequent itemsets is generally
referred to as inverse frequent itemset mining. In this paper, we show that the
problem of approximate inverse frequent itemset mining is NP-complete.
Then we propose and analyze an approximate algorithm for approximate inverse
frequent itemset mining, and discuss privacy issues related to the synthetic
basket data set. In particular, we propose an approximate algorithm to
determine the privacy leakage in a synthetic basket data set.
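To make the problem statement above concrete, here is a minimal sketch of the checking direction only: given target itemset supports, it tests whether a candidate synthetic basket data set reproduces them within a tolerance. It is not the algorithm proposed in the paper. The function names (`support`, `matches_supports`), the item names, and the target frequencies are all hypothetical and chosen for illustration.

```python
def support(database, itemset):
    """Fraction of transactions in `database` that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    if not database:
        return 0.0
    hits = sum(1 for transaction in database if itemset <= transaction)
    return hits / len(database)


def matches_supports(database, targets, tolerance):
    """True if every target itemset's support in `database` is within `tolerance`."""
    return all(
        abs(support(database, itemset) - wanted) <= tolerance
        for itemset, wanted in targets.items()
    )


# Hypothetical target supports (assumed values, not taken from the paper).
targets = {
    frozenset({"bread"}): 0.75,
    frozenset({"milk"}): 0.50,
    frozenset({"bread", "milk"}): 0.25,
}

# A candidate synthetic basket data set with four transactions.
synthetic = [
    frozenset({"bread"}),
    frozenset({"bread"}),
    frozenset({"bread", "milk"}),
    frozenset({"milk"}),
]

# A second candidate in which bread and milk always occur together.
skewed = [frozenset({"bread", "milk"})] * 4

print(matches_supports(synthetic, targets, tolerance=0.0))
print(matches_supports(skewed, targets, tolerance=0.1))
```

Checking a given candidate is the easy direction. Inverse frequent itemset mining asks for the reverse: constructing a data set such as `synthetic` from the targets alone, which is the problem the abstract reports to be NP-complete in its approximate form.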
Privacy preserving data generation for database application performance testing
Synthetic data plays an important role in software testing. In this paper, we initiate the study of synthetic data generation models for application software performance testing. In particular, we discuss models for protecting privacy in synthetic data generation. Within this model, we investigate the feasibility of, and techniques for, privacy-preserving synthetic database generation that can be used for database application performance testing. The methodologies that we present will be useful for general privacy-preserving software performance testing.