DISTORTION-BASED HEURISTIC METHOD FOR SENSITIVE ASSOCIATION RULE HIDING
In the past few years, privacy issues in data mining have received considerable attention in the literature. However, the problem of data security cannot be solved simply by restricting data collection or preventing unauthorized access; it requires solutions that protect sensitive information without degrading the accuracy of data mining results or exposing sensitive knowledge related to individual privacy or competitive advantage in business. Sensitive association rule hiding is an important issue in privacy-preserving data mining. The aim of association rule hiding is to minimize the side effects on the sanitized database, that is, to reduce the number of missing non-sensitive rules and the number of generated ghost rules. Current methods for hiding sensitive rules cause side effects and data loss. In this paper, we introduce a new distortion-based method to hide sensitive rules. The method determines critical transactions based on the number of non-sensitive maximal frequent itemsets that contain at least one item in the consequent of the sensitive rule, since these itemsets can be directly affected by the modified transactions. Using this set, the number of non-sensitive itemsets that need to be considered is reduced dramatically. We compute in advance the smallest number of transactions to modify in order to minimize the damage to the database. Comparative experimental results on real datasets show that the proposed method achieves better results than other methods, with fewer side effects and less data loss.
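The side effects named in this abstract can be counted by comparing the rule sets mined before and after sanitization. The following is a minimal sketch of that bookkeeping, not the paper's method; the function name `side_effects` and the representation of a rule as an (antecedent, consequent) tuple are assumptions made for illustration.

```python
def side_effects(rules_before, rules_after, sensitive):
    """Compare rule sets mined before and after sanitization.

    All three arguments are sets of hashable rules (hypothetical encoding:
    (antecedent, consequent) tuples).
    """
    # Sensitive rules that can still be mined from the sanitized database.
    hiding_failures = sensitive & rules_after
    # Non-sensitive rules that were lost as a side effect.
    missing = (rules_before - sensitive) - rules_after
    # Rules that did not exist before sanitization ("ghost" rules).
    ghost = rules_after - rules_before
    return hiding_failures, missing, ghost


# Toy example with made-up rules.
before = {("a", "b"), ("b", "c"), ("a", "c")}
sensitive = {("a", "b")}
after = {("b", "c"), ("c", "d")}

hiding_failures, missing, ghost = side_effects(before, after, sensitive)
print(hiding_failures, missing, ghost)
```

In this toy case the sensitive rule is hidden, one non-sensitive rule goes missing, and one ghost rule appears.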
Privacy Preserving Utility Mining: A Survey
In the big data era, collected data usually contain rich information and
hidden knowledge. Utility-oriented pattern mining and analytics have shown a
powerful ability to explore these ubiquitous data, which may be collected from
various fields and applications, such as market basket analysis, retail,
click-stream analysis, medical analysis, and bioinformatics. However, analysis
of these data with sensitive private information raises privacy concerns. To
achieve a better trade-off between utility maximization and privacy preservation,
Privacy-Preserving Utility Mining (PPUM) has become a critical issue in recent
years. In this paper, we provide a comprehensive overview of PPUM. We first
present the background of utility mining, privacy-preserving data mining and
PPUM, then introduce the related preliminaries and problem formulation of PPUM,
as well as some key evaluation criteria for PPUM. In particular, we present and
discuss the current state-of-the-art PPUM algorithms, as well as their
advantages and deficiencies in detail. Finally, we highlight and discuss some
technical challenges and open directions for future research on PPUM.
Comment: 2018 IEEE International Conference on Big Data, 10 pages
An Efficient Rule-Hiding Method for Privacy Preserving in Transactional Databases
One obstacle to using data mining techniques such as association rule mining is the risk of leaking sensitive data once the data is released to the public. A trade-off between data privacy and data mining is therefore of great importance and must be managed carefully. This study introduces an efficient distortion-based algorithm for preserving the privacy of association rules, in which sensitive association rules are hidden through deletion and reinsertion of items in the database. To reduce the side effects on non-sensitive rules, the algorithm calculates the item correlation between sensitive and non-sensitive rules and selects the item with the minimum influence on non-sensitive rules as the victim item. To reduce the degree of distortion and preserve data quality, transactions with the highest number of sensitive items are selected for modification. The results show that the proposed algorithm performs better on the non-dense real database, with fewer side effects and less data loss, than on the dense real database. Furthermore, the results are far better on synthetic databases than on real databases.
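The victim-item and transaction selection described here can be sketched roughly as follows. This is an illustrative simplification, not the paper's algorithm: "influence" is assumed to be the number of non-sensitive rules containing an item, only deletion is shown (no reinsertion), and the names `pick_victim`, `hide_rule`, and `min_support_count` are hypothetical.

```python
def pick_victim(sensitive_rule, non_sensitive_rules):
    """Pick the item of the sensitive rule with the least assumed influence,
    measured as the number of non-sensitive rules that contain it."""
    return min(
        sorted(sensitive_rule),  # sorted for a deterministic tie-break
        key=lambda item: sum(item in rule for rule in non_sensitive_rules),
    )


def hide_rule(db, sensitive_rule, non_sensitive_rules, sensitive_items,
              min_support_count):
    """Remove the victim item from just enough supporting transactions to
    push the rule's support count below min_support_count."""
    victim = pick_victim(sensitive_rule, non_sensitive_rules)
    # Indices of transactions that fully support the sensitive rule.
    supporting = [i for i, t in enumerate(db) if sensitive_rule <= t]
    # Prefer transactions holding the most sensitive items.
    supporting.sort(key=lambda i: len(db[i] & sensitive_items), reverse=True)
    # Smallest number of modifications that hides the rule.
    n_modify = max(0, len(supporting) - min_support_count + 1)
    sanitized = [set(t) for t in db]  # work on a copy
    for i in supporting[:n_modify]:
        sanitized[i].discard(victim)
    return victim, sanitized


# Toy database; the rule is represented by the set of all its items.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"b", "c"}, {"a", "c"}]
rule = {"a", "b"}
other_rules = [{"b", "c"}, {"b", "d"}]

victim, sanitized = hide_rule(db, rule, other_rules, {"a", "b"},
                              min_support_count=2)
remaining_support = sum(rule <= t for t in sanitized)
```

Here item "a" appears in no non-sensitive rule, so it is chosen as the victim, and two of the three supporting transactions are modified.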
Reducing Side Effects of Hiding Sensitive Itemsets in Privacy Preserving Data Mining
Data mining is traditionally used to retrieve and analyze knowledge from large amounts of data. Private or confidential data may need to be sanitized or suppressed before it is shared or published. Privacy-preserving data mining (PPDM) has thus become an important issue in recent years. The most common approach to PPDM is to sanitize the database to hide the sensitive information. In this paper, a novel hiding-missing-artificial utility (HMAU) algorithm is proposed to hide sensitive itemsets through transaction deletion. The transaction with the maximal ratio of sensitive to non-sensitive itemsets is selected and deleted entirely. Three side effects (hiding failures, missing itemsets, and artificial itemsets) are considered when evaluating whether a transaction should be deleted to hide the sensitive itemsets. A weight is assigned to each of the three factors to reflect its importance, and the weights can be set according to user requirements. Experiments are conducted to show the performance of the proposed algorithm in terms of execution time, number of deleted transactions, and number of side effects.
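A rough sketch of ratio-based transaction selection is given below. It is not the HMAU algorithm itself: the three weighted side-effect factors are omitted, the ratio is read as a count of sensitive itemsets over a count of non-sensitive itemsets contained in a transaction, and the `+ 1` in the denominator is an assumption added only to avoid division by zero. The name `deletion_candidate` is hypothetical.

```python
def deletion_candidate(db, sensitive, non_sensitive):
    """Return the index of the transaction with the highest ratio of
    contained sensitive itemsets to contained non-sensitive itemsets,
    or None if no transaction contains a sensitive itemset."""
    def ratio(transaction):
        s = sum(itemset <= transaction for itemset in sensitive)
        n = sum(itemset <= transaction for itemset in non_sensitive)
        return s / (n + 1)  # +1 is an assumed smoothing term

    best = max(range(len(db)), key=lambda i: ratio(db[i]))
    return best if ratio(db[best]) > 0 else None


# Toy data: single characters are items.
db = [frozenset("abc"), frozenset("ab"), frozenset("bcd")]
sensitive = [frozenset("ab")]
non_sensitive = [frozenset("bc"), frozenset("c")]

index = deletion_candidate(db, sensitive, non_sensitive)
```

The second transaction supports the sensitive itemset and no non-sensitive one, so deleting it costs the least under this simplified measure.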
Facing the Future: Financing Productive Schools
Synthesizes the School Finance Redesign Project's findings on policy options for redesigning the system to focus resources on promoting student learning. Calls for student count-based funding, integrated data collection, innovation, and accountability
Association rule hiding using integer linear programming
Privacy-preserving data mining has become a focus of attention for government statistical agencies and the database security research community, which are concerned with preventing privacy disclosure during data mining. Repositories of large datasets include sensitive rules that need to be concealed from unauthorized access. Hence, association rule hiding has emerged as one of the powerful techniques for hiding sensitive knowledge that exists in data before it is published. In this paper, we present a constraint-based optimization approach for hiding a set of sensitive association rules, using a well-structured integer linear program formulation. The proposed approach reduces the database sanitization problem to an instance of the integer linear programming problem. The solution of the integer linear program determines the transactions that need to be sanitized in order to conceal the sensitive rules while minimizing the impact of sanitization on the non-sensitive rules. We also present a heuristic sanitization algorithm that performs hiding by reducing the support or the confidence of the sensitive rules. The results of the experimental evaluation of the proposed approach on real-life datasets indicate the promising performance of the approach in terms of side effects on the original database.
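To illustrate what reducing the support or the confidence of a rule means, the sketch below uses the standard definitions: the support of an itemset is the fraction of transactions containing it, and the confidence of X → Y is support(X ∪ Y) / support(X). It does not reproduce the paper's integer linear program or its heuristic; the toy database and the choice of which transaction to modify are assumptions.

```python
def support(db, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in db) / len(db)


def confidence(db, antecedent, consequent):
    """Confidence of antecedent -> consequent.
    Assumes the antecedent occurs in at least one transaction."""
    return support(db, antecedent | consequent) / support(db, antecedent)


db = [{"a", "b"}, {"a", "b"}, {"a"}, {"b"}]
X, Y = {"a"}, {"b"}

conf_before = confidence(db, X, Y)

# Removing a consequent item from one supporting transaction lowers both
# the support of X ∪ Y and the confidence of X -> Y.
sanitized = [set(t) for t in db]
sanitized[0].discard("b")

supp_after = support(sanitized, X | Y)
conf_after = confidence(sanitized, X, Y)
```

A rule counts as hidden once its support or its confidence drops below the mining thresholds chosen by the data owner.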
Statistical Proof and Theories of Discrimination
We live in a tightly knit world. Our emotions, desires, perceptions and decisions are interlinked in our interactions with others. We are constantly influencing our surroundings and being influenced by others. In this thesis, we unfold some aspects of social and economic interactions by studying empirical datasets. We project these interactions into a network representation to gain insights on how socio-economic systems form and function and how they change over time. Specifically, this thesis is centered on four main questions: How do the means of communication shape our social network structures? How can we uncover the underlying network of interests from massive observational data? How does a crisis spread in a real financial network? How do the dynamics of interaction influence spreading processes in networks? We use a variety of methods from physics, psychology, sociology, and economics as well as computational, mathematical and statistical analysis to address these questions.
Firm Registration and Bribes: Results from a Microenterprise Survey in Africa
If corrupt bureaucrats target registered firms, then corruption may discourage registration. Using data from a survey of 4,801 microenterprises in Zambia, this paper looks at whether corruption is a more or less serious problem for registered firms. The paper finds results consistent with the cross-country evidence: registered firms appear to be more concerned about corruption than unregistered firms. This suggests that remaining informal and out-of-sight might reduce the burden of corruption. The paper also looks at two possible reasons why registered firms might be more concerned about corruption. It finds that there is little evidence that government officials specifically target registered firms. Registered firms were more likely to be involved in transactions with government or parastatal officials that could involve bribes, possibly explaining why they are more concerned about corruption than other firms are, but they were no more likely to pay bribes during these transactions.
Keywords: Zambia; Africa; Corruption; Petty Corruption; Informality; Bribes; Registration