2 research outputs found

    Approximate Inverse Frequent Itemset Mining: Privacy, Complexity, and Approximation

    To generate synthetic basket data sets for better benchmark testing, it is important to integrate characteristics of real-life databases into them. Characteristics that can be used for this purpose include frequent itemsets and association rules. The problem of generating synthetic basket data sets from frequent itemsets is generally referred to as inverse frequent itemset mining. In this paper, we show that the problem of approximate inverse frequent itemset mining is NP-complete. We then propose and analyze an approximate algorithm for approximate inverse frequent itemset mining, and discuss privacy issues related to the synthetic basket data set. In particular, we propose an approximate algorithm to determine the privacy leakage in a synthetic basket data set.

    Privacy preserving data generation for database application performance testing

    Synthetic data plays an important role in software testing. In this paper, we initiate the study of synthetic data generation models for application software performance testing. In particular, we discuss models for protecting privacy in synthetic data generation. Within these models, we investigate the feasibility of, and techniques for, privacy-preserving synthetic database generation that can be used for database application performance testing. The methodologies we present will be useful for general privacy-preserving software performance testing.