Social security data mining : an Australian case study
University of Technology, Sydney. Faculty of Engineering and Information Technology.
Data mining in business applications has become an increasingly recognized and accepted area of enterprise data mining in recent years. While the general principles and methodologies of data mining and machine learning are applicable to any business application, it is often essential to develop specific theories, tools and systems for mining data in a particular domain such as social security and social welfare business. This necessity has led to the concept of social security and social welfare data mining, the focus of this thesis work.
Social security and social welfare business touches almost every citizen’s life at different life periods. It provides fundamental and crucial government services and support to varied populations with specific needs. A typical scenario in Australia is that it not only connects one third of our population, but also involves many relevant stakeholders, including banking business, taxation and Medicare. Such business engages complicated infrastructure, networks, mechanisms, policies, activities, and transactions. Data mining of such business is a brand new application area in the data mining community.
Mining such social welfare business and data is challenging. The challenges come from the lack of benchmarks and experience in data mining for this particular domain, the complexities of social welfare business and data, the exploration of feasible tasks, and the implementation of data mining techniques in relation to the business objectives.
In this thesis, which adopts a practice-based innovative attitude and focusses on the marriage of social welfare business with data mining, we believe we have realised our objective of providing a systematic and comprehensive overview of social security and social welfare data mining. The main contributions consist of the following aspects:
• As the first work of its kind, to the best of our knowledge, we present an overall picture of social security and social welfare data mining as a new domain-driven data mining application.
• We explore the business nature of social security and social welfare, and the characteristics of social security data.
• We propose a concept map of social security data mining, catering for the main complexities of social welfare business and data, as well as providing opportunities for exploring new research issues in the community.
• Several case studies are discussed, which demonstrate the technical development of social security data mining, and the innovative applications of existing data mining techniques.
Social welfare is spreading widely across the world in both developed and developing countries. This thesis work is therefore timely and could be of important business and government value for better understanding our people, our policies and our objectives, and for better serving those people in genuine need.
Semi-Trusted Mixer Based Privacy Preserving Distributed Data Mining for Resource Constrained Devices
In this paper, a homomorphic privacy-preserving association rule mining
algorithm is proposed which can be deployed on resource constrained devices
(RCD). Privacy-preserving exchange of itemset counts among distributed
mining sites is a vital part of the association rule mining process. Existing
cryptography-based privacy-preserving solutions consume a lot of computation
because of the complex mathematics involved. Privacy solutions that involve
less computation are therefore needed to deploy mining applications on RCD.
In this algorithm, a semi-trusted mixer is used to unify the counts of itemsets
encrypted by all mining sites without revealing individual values. The proposed
algorithm builds on a well-known, communication-efficient association
rule mining algorithm named count distribution (CD). Security proofs, along with
performance analysis and comparison, show the acceptability and
effectiveness of the proposed algorithm. Its efficient and straightforward privacy
model and satisfactory performance make the protocol one of
the initiatives toward deploying data mining applications on RCD.
Comment: IEEE Publication format, International Journal of Computer Science
and Information Security, IJCSIS, Vol. 8 No. 1, April 2010, USA. ISSN 1947
5500, http://sites.google.com/site/ijcsis
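The idea of a mixer that combines itemset counts without seeing any individual site's value can be illustrated with a minimal sketch. It does not reproduce the paper's encryption scheme: as a stand-in, each site hides its local counts behind additive masks that cancel out across sites. The names `make_zero_sum_masks`, `mask_local_counts` and `mixer_aggregate`, the modulus, and the assumption that sites can agree on masks without involving the mixer are all hypothetical.

```python
import secrets

# Assumed public modulus; it only needs to exceed any possible global count.
COUNT_MODULUS = 2**61 - 1


def make_zero_sum_masks(itemsets, n_sites, modulus=COUNT_MODULUS):
    """For every itemset, draw one random mask per site so the masks sum to 0 mod modulus.

    Assumption: the sites agree on these masks among themselves, out of the mixer's sight.
    """
    masks = [dict() for _ in range(n_sites)]
    for itemset in itemsets:
        shares = [secrets.randbelow(modulus) for _ in range(n_sites - 1)]
        shares.append(-sum(shares) % modulus)  # last mask cancels the others
        for site, share in enumerate(shares):
            masks[site][itemset] = share
    return masks


def mask_local_counts(local_counts, site_masks, modulus=COUNT_MODULUS):
    """Hide each local support count behind that site's mask."""
    return {s: (c + site_masks[s]) % modulus for s, c in local_counts.items()}


def mixer_aggregate(masked_reports, modulus=COUNT_MODULUS):
    """The mixer adds up masked counts; the masks cancel, leaving only global counts."""
    totals = {}
    for report in masked_reports:
        for itemset, value in report.items():
            totals[itemset] = (totals.get(itemset, 0) + value) % modulus
    return totals


if __name__ == "__main__":
    bread = frozenset({"bread"})
    bread_milk = frozenset({"bread", "milk"})
    # Local support counts at three hypothetical mining sites.
    sites = [{bread: 3, bread_milk: 1}, {bread: 2, bread_milk: 4}, {bread: 5, bread_milk: 0}]
    masks = make_zero_sum_masks([bread, bread_milk], len(sites))
    reports = [mask_local_counts(c, m) for c, m in zip(sites, masks)]
    print(mixer_aggregate(reports))  # global counts: 10 for {bread}, 5 for {bread, milk}
```

As in count distribution, only counts travel between parties, never transactions; the masking step is what keeps each site's individual count from the mixer in this sketch.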
k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data
Data mining has wide applications in many areas such as banking, medicine,
scientific research and government agencies. Classification is one of the
commonly used tasks in data mining applications. For the past decade, due to
the rise of various privacy issues, many theoretical and practical solutions to
the classification problem have been proposed under different security models.
However, with the recent popularity of cloud computing, users now have the
opportunity to outsource their data, in encrypted form, as well as the data
mining tasks to the cloud. Since the data on the cloud is in encrypted form,
existing privacy preserving classification techniques are not applicable. In
this paper, we focus on solving the classification problem over encrypted data.
In particular, we propose a secure k-NN classifier over encrypted data in the
cloud. The proposed k-NN protocol protects the confidentiality of the data,
user's input query, and data access patterns. To the best of our knowledge, our
work is the first to develop a secure k-NN classifier over encrypted data under
the semi-honest model. Also, we empirically analyze the efficiency of our
solution through various experiments.
Comment: 29 pages, 2 figures, 3 tables. arXiv admin note: substantial text
overlap with arXiv:1307.482
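For orientation, the classification step that the abstract's protocol protects can be sketched in plaintext. This is only the underlying k-NN majority vote on unencrypted records; it contains none of the encryption, query hiding, or access-pattern protection the paper describes. The function names and the toy records are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length numeric records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_classify(records, labels, query, k):
    """Return the majority label among the k records closest to the query."""
    ranked = sorted(range(len(records)), key=lambda i: squared_distance(records[i], query))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    records = [(1, 1), (1, 2), (8, 8), (9, 8)]
    labels = ["low", "low", "high", "high"]
    print(knn_classify(records, labels, query=(2, 1), k=3))  # prints "low"
```

In a secure variant, the distance computation, the ranking, and the vote would each have to be carried out over ciphertexts, which is where the cost of such protocols comes from.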
Data Mining Applications in Banking Sector While Preserving Customer Privacy
In real-life data mining applications, organizations cooperate by using each other’s data on the same data mining task for more accurate results, although they may have different security and privacy concerns. Privacy-preserving data mining (PPDM) practices involve rules and techniques that allow parties to collaborate on data mining applications while keeping their data private. The objective of this paper is to present a number of PPDM protocols and show how PPDM can be used in data mining applications in the banking sector. For this purpose, the paper discusses homomorphic cryptosystems and secure multiparty computing. Supported by experimental analysis, the paper demonstrates that data mining tasks such as clustering and Bayesian networks (association rules) that are commonly used in the banking sector can be efficiently and securely performed. This is the first study that combines PPDM protocols with applications for banking data mining.
DOI: 10.28991/ESJ-2022-06-06-014
FP-tree and COFI Based Approach for Mining of Multiple Level Association Rules in Large Databases
In recent years, discovery of association rules among itemsets in a large
database has been described as an important database-mining problem. The
problem of discovering association rules has received considerable research
attention and several algorithms for mining frequent itemsets have been
developed. Many algorithms have been proposed to discover rules at single
concept level. However, mining association rules at multiple concept levels may
lead to the discovery of more specific and concrete knowledge from data. The
discovery of multiple level association rules is very useful in many
applications. In most studies of multiple level association rule
mining, the database is scanned repeatedly, which reduces the efficiency of
the mining process. In this research paper, a new method for discovering multilevel
association rules is proposed. It is based on the FP-tree structure and uses
a co-occurrence frequent item (COFI) tree to find frequent items in a multilevel concept
hierarchy.
Comment: Pages IEEE format, International Journal of Computer Science and
Information Security, IJCSIS, Vol. 7 No. 2, February 2010, USA. ISSN 1947
5500, http://sites.google.com/site/ijcsis
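The FP-tree structure the abstract builds on can be sketched briefly. This is a minimal single-level FP-tree construction using the standard two database scans; it does not include the concept hierarchy or the COFI-tree step of the proposed method. The class and function names and the sample transactions are hypothetical.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: an item, its count, and its children by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree in two scans and return (root, frequent item supports)."""
    # Scan 1: count in how many transactions each item occurs.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    root = FPNode(None)
    # Scan 2: insert each transaction's frequent items, most frequent first,
    # so that transactions sharing a prefix share a path in the tree.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


if __name__ == "__main__":
    transactions = [["milk", "bread"], ["milk", "bread", "jam"], ["milk"], ["bread", "jam"]]
    root, frequent = build_fp_tree(transactions, min_support=2)
    print(frequent)
    print({item: node.count for item, node in root.children.items()})
```

Because all later mining works on the tree rather than on the database, the repeated scans mentioned in the abstract are avoided after these two passes.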
Efficient Privacy Preserving Distributed Clustering Based on Secret Sharing
In this paper, we propose a privacy preserving distributed
clustering protocol for horizontally partitioned data based on a very efficient
homomorphic additive secret sharing scheme. The model we use
for the protocol is novel in the sense that it utilizes two non-colluding
third parties. We provide a brief security analysis of our protocol from
an information-theoretic point of view, which is a stronger security model.
We show communication and computation complexity analysis of our
protocol along with another protocol previously proposed for the same
problem. We also include experimental results for computation and communication
overhead of these two protocols. Our protocol not only outperforms
the others in execution time and communication overhead on
data holders, but also uses a more efficient model for many data mining
applications.
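How additive secret sharing lets two non-colluding third parties combine values from several data holders can be sketched as follows. This is an illustration of the general technique for a single aggregated quantity (for example, one coordinate of a cluster's local sum), not the paper's full clustering protocol. The names `split_into_two_shares` and `sum_shares`, the modulus, and the sample values are hypothetical.

```python
import secrets

# Assumed public modulus, larger than any aggregate that can occur.
SHARE_MODULUS = 2**61 - 1


def split_into_two_shares(value, modulus=SHARE_MODULUS):
    """Split a value into two random-looking shares that add up to it mod modulus."""
    first = secrets.randbelow(modulus)
    second = (value - first) % modulus
    return first, second


def sum_shares(shares, modulus=SHARE_MODULUS):
    """Each third party adds the shares it received; a share alone reveals nothing."""
    return sum(shares) % modulus


if __name__ == "__main__":
    # Local sums of one cluster coordinate at three hypothetical data holders.
    local_sums = [12, 30, 7]
    to_party_a, to_party_b = zip(*(split_into_two_shares(v) for v in local_sums))
    partial_a = sum_shares(to_party_a)  # computed by third party A
    partial_b = sum_shares(to_party_b)  # computed by third party B
    print((partial_a + partial_b) % SHARE_MODULUS)  # 49, the global sum
```

The sharing is additively homomorphic: adding shares and then recombining gives the same result as recombining and then adding, so neither third party ever needs to see a data holder's value as long as the two do not collude.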