108,414 research outputs found

Incremental algorithm for association rule mining under dynamic threshold

Author: Aqra I, Ghani NA, Machado J, Maple C, Safa NS
Publication venue: MDPI AG
Publication date: 10/12/2019

Data mining is essentially applied to discover new knowledge from a database through an iterative process. The mining process may be time consuming for massive datasets. Association rule mining (ARM) is a widely used method in the knowledge discovery domain, despite its shortcomings in mining large databases. As such, several approaches have been prescribed to unravel knowledge. Most of the proposed algorithms addressed data incremental issues, especially when a hefty amount of data is added to the database after the latest mining process. The three basic manipulation operations performed in a database are add, delete, and update. Any method devised in light of data incremental issues is bound to embed these three operations. The changing threshold is a long-standing problem within the data mining field. Since decision making is an active process, the threshold is indeed changeable. Accordingly, the present study proposes an algorithm that resolves the issue of rescanning a database that had been mined previously and allows retrieval of knowledge that satisfies several thresholds without the need to learn the process from scratch. The proposed approach displayed high accuracy in experimentation, as well as a reduction in processing time by almost two-thirds of the original mining execution time.
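The idea of serving several thresholds without rescanning can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details the abstract does not give. It assumes that support counts for small itemsets are cached once, that new transactions only update the cache, and that any minimum-support threshold is then answered from the cache alone. The function names `count_itemsets`, `add_transactions`, and `frequent_itemsets`, and the `max_size` cap, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Scan the transactions once and cache support counts for itemsets up to max_size."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order so each itemset has one key
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return counts


def add_transactions(counts, new_transactions, max_size=2):
    """Fold newly added transactions into the cache without rescanning the old data."""
    counts.update(count_itemsets(new_transactions, max_size))
    return counts


def frequent_itemsets(counts, n_transactions, min_support):
    """Answer a threshold query from the cached counts only."""
    return {
        itemset: count / n_transactions
        for itemset, count in counts.items()
        if count / n_transactions >= min_support
    }


if __name__ == "__main__":
    base = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
    cache = count_itemsets(base)
    # Two different thresholds, one scan of the data.
    print(frequent_itemsets(cache, len(base), 0.5))
    print(frequent_itemsets(cache, len(base), 0.75))
    # An increment of new transactions updates the cache in place.
    extra = [["b", "c"], ["b", "c"]]
    add_transactions(cache, extra)
    print(frequent_itemsets(cache, len(base) + len(extra), 0.5))
```

Caching every itemset up to a fixed size trades memory for the ability to change the threshold freely. Handling delete and update operations would require decrementing counts, which this sketch leaves out.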

E-space: Manchester Metropolitan University's Research Repository

Incremental algorithm for association rule mining under dynamic threshold

Author: Abdul Ghani Norjihan, Aqra Iyad, Machado José, Maple Carsten, Sohrabi Safa Nader
Publication venue: MDPI AG
Publication date: 01/01/2019

© 2019 The Authors. Published by MDPI AG. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.3390/app9245398 Data mining is essentially applied to discover new knowledge from a database through an iterative process. The mining process may be time consuming for massive datasets. Association rule mining (ARM) is a widely used method in the knowledge discovery domain, despite its shortcomings in mining large databases. As such, several approaches have been prescribed to unravel knowledge. Most of the proposed algorithms addressed data incremental issues, especially when a hefty amount of data is added to the database after the latest mining process. The three basic manipulation operations performed in a database are add, delete, and update. Any method devised in light of data incremental issues is bound to embed these three operations. The changing threshold is a long-standing problem within the data mining field. Since decision making is an active process, the threshold is indeed changeable. Accordingly, the present study proposes an algorithm that resolves the issue of rescanning a database that had been mined previously and allows retrieval of knowledge that satisfies several thresholds without the need to learn the process from scratch. The proposed approach displayed high accuracy in experimentation, as well as a reduction in processing time by almost two-thirds of the original mining execution time. This research was funded by University Malaya through a postgraduate research grant (PPP), grant number PG106-2015B. Published onlin

Universidade do Minho: RepositoriUM

Warwick Research Archives Portal Repository

Coventry University Pure Portal

Wolverhampton Intellectual Repository and E-theses

Randomized Response Technique in Data Mining

Author: Monika Soni
Publication venue: Auricle Global Society of Education and Research
Publication date: 25/02/2016

Data mining is a process in which data is collected from different sources and summarized into useful information. Data mining is also known as knowledge discovery in databases (KDD). Privacy and accuracy are important issues in data mining when data is shared. A fruitful direction for future data mining research will be the development of techniques that incorporate privacy concerns. Most methods use random permutation techniques to mask the data and so preserve the privacy of sensitive data. Randomized response (RR) techniques were developed mainly to protect survey privacy and avoid answer bias. The RR technique adds a certain degree of randomness to each answer to protect the data. The objective of this thesis is to enhance the privacy level of the RR technique using four group schemes. First, following the algorithm, random attributes a, b, c, d were considered. Then randomization was performed on every dataset according to the values of theta. Finally, the ID3 and CART algorithms were applied to the randomized data. The results show that the privacy level increases as the number of groups increases.
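For readers unfamiliar with randomized response, the basic mechanism can be sketched as follows. This is the classic single-attribute scheme often attributed to Warner, not the four-group scheme studied in the thesis. It assumes each yes/no answer is reported truthfully with probability theta and flipped otherwise. The function names are hypothetical.

```python
import random


def randomize(answers, theta, rng):
    """Keep each boolean answer with probability theta, otherwise report its negation."""
    return [a if rng.random() < theta else (not a) for a in answers]


def estimate_true_proportion(observed_yes_rate, theta):
    """Invert P(yes observed) = theta * p + (1 - theta) * (1 - p) to recover p."""
    if theta == 0.5:
        raise ValueError("theta = 0.5 destroys all information; p cannot be estimated")
    return (observed_yes_rate - (1 - theta)) / (2 * theta - 1)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    true_answers = [i < 3000 for i in range(10000)]  # 30% true "yes" answers
    reported = randomize(true_answers, theta=0.8, rng=rng)
    observed = sum(reported) / len(reported)
    print("observed yes rate:", observed)
    print("estimated true rate:", estimate_true_proportion(observed, 0.8))
```

A theta closer to 0.5 hides individual answers better but makes the estimate noisier, which is the privacy and accuracy trade-off the abstract refers to.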

International Journal on Future Revolution in Computer Science & Communication Engineering

Randomized Response Technique in Data Mining

Author: Monika Soni, Vishal Shrivastva
Publication venue: Auricle Technologies, Pvt., Ltd.
Publication date: 30/06/2013

Data mining is a process in which data is collected from different sources and summarized into useful information. Data mining is also known as knowledge discovery in databases (KDD). Privacy and accuracy are important issues in data mining when data is shared. A fruitful direction for future data mining research will be the development of techniques that incorporate privacy concerns. Most methods use random permutation techniques to mask the data and so preserve the privacy of sensitive data. Randomized response (RR) techniques were developed mainly to protect survey privacy and avoid answer bias. The RR technique adds a certain degree of randomness to each answer to protect the data. The objective of this thesis is to enhance the privacy level of the RR technique using four group schemes. First, following the algorithm, random attributes a, b, c, d were considered. Then randomization was performed on every dataset according to the values of theta. Finally, the ID3 and CART algorithms were applied to the randomized data. The results show that the privacy level increases as the number of groups increases.

International Journal on Recent and Innovation Trends in Computing and Communication

Adopting Data Mining as a Knowledge Discovery Tool: The Influential Factors from the Perspectives of Information Systems Managers

Author: H. Al-Faouri A.
Publication venue: Arab Journals Platform
Publication date: 29/04/2023

Data mining is the process of discovering patterns from large sets of data, based on methods at the intersection of machine learning, statistics, and database systems. As a form of knowledge discovery, the process uncovers concealed patterns to forecast possible results. To meet this objective, this study has applied a cross-sectional quantitative research approach. The data was gathered from managers in the fields of information technology (IT) and information systems (IS) at large companies operating in e-commerce, digital business, and marketing in Jordan, and was then analyzed. With a total of 309 responses collected, the results were reached using structural equation modeling via Analysis of Moment Structures (AMOS V.21). The proposed conceptual model confirmed that all the identified variables had positive coefficients for data mining adoption: data warehouse, data accuracy, perceived usefulness, perceived ease of use, and information system performance. Moreover, the study concluded with research insights related to this topic and suggested further research to expand the grasp of this field and provide a deeper understanding of data mining related issues.

Arab Journals Platform

A Genetic Programming Framework for Two Data Mining Tasks: Classification and Generalized Rule Induction

Author: Freitas Alex A.
Publication venue: Morgan Kaufmann
Publication date: 01/01/1997

This paper proposes a genetic programming (GP) framework for two major data mining tasks, namely classification and generalized rule induction. The framework emphasizes the integration between a GP algorithm and relational database systems. In particular, the fitness of individuals is computed by submitting SQL queries to a (parallel) database server. Some advantages of this integration from a data mining viewpoint are scalability, data-privacy control, and automatic parallelization.
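Computing fitness through SQL queries can be sketched as below. This is an illustration under stated assumptions, not the framework from the paper. It assumes that a GP individual encodes a rule antecedent as a SQL condition, that fitness is the fraction of covered rows carrying the predicted class, and that an in-memory SQLite database stands in for the database server. The table `examples`, its columns, and the function `rule_fitness` are hypothetical.

```python
import sqlite3


def rule_fitness(conn, antecedent, target_class):
    """Score a candidate rule 'IF antecedent THEN label = target_class' with two COUNT queries.

    The antecedent string is assumed to be generated by the GP itself, never taken
    from user input, because it is interpolated directly into the SQL text.
    """
    covered = conn.execute(
        f"SELECT COUNT(*) FROM examples WHERE {antecedent}"
    ).fetchone()[0]
    if covered == 0:
        return 0.0  # a rule that covers nothing gets the lowest fitness
    correct = conn.execute(
        f"SELECT COUNT(*) FROM examples WHERE ({antecedent}) AND label = ?",
        (target_class,),
    ).fetchone()[0]
    return correct / covered


def make_demo_db():
    """Build a tiny in-memory table standing in for the training data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE examples (age INTEGER, income INTEGER, label TEXT)")
    conn.executemany(
        "INSERT INTO examples VALUES (?, ?, ?)",
        [(25, 30, "no"), (40, 80, "yes"), (35, 60, "yes"), (50, 20, "no")],
    )
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    for condition in ("income > 50", "age > 30", "age > 100"):
        print(condition, rule_fitness(db, condition, "yes"))
```

Because the counting happens inside the database, the GP process never needs to load the raw rows, which is one way to read the scalability and data-privacy advantages the abstract mentions.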

CiteSeerX

Kent Academic Repository