An enhanced intelligent database engine by neural network and data mining

Author: Chua Boon Lay
Khalid Marzuki
Yusof Rubiyah
Publication date: 24/09/2000

An Intelligent Database Engine (IDE) is developed to solve any classification problem by providing two integrated features: decision-making by a backpropagation (BP) neural network (NN) and decision support by Apriori, a data mining (DM) algorithm. Previous experimental results show a wide gap between the accuracy of NN (90%) and DM (60%). Thus, efforts to improve DM accuracy are crucial to ensure a well-balanced hybrid architecture. The poor DM performance is caused either by too few rules or by too many poor rules being generated in the classifier. The first problem is curbed by generating multiple-level rules, incorporating multiple attribute support and level confidence into the initial Apriori. The second problem is tackled by implementing two strengthening procedures, confidence and Bayes verification, to filter out the unpredictive rules. Experiments with more datasets are carried out to compare the performance of the initial and improved Apriori. Great improvement is obtained for the latte
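The confidence-based filtering of class rules mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the toy records, the function names and the threshold are all hypothetical, and only the standard definitions of support and confidence are assumed.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical toy records: (set of attribute=value items, class label).
Record = Tuple[FrozenSet[str], str]
Rule = Tuple[FrozenSet[str], str]

RECORDS: List[Record] = [
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
    (frozenset({"outlook=sunny", "windy=yes"}), "stay"),
    (frozenset({"outlook=rain", "windy=no"}), "play"),
    (frozenset({"outlook=rain", "windy=yes"}), "stay"),
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
]


def support(antecedent: FrozenSet[str], label: str, records: List[Record]) -> float:
    """Fraction of all records that contain the antecedent and carry the label."""
    hits = sum(1 for items, cls in records if antecedent <= items and cls == label)
    return hits / len(records)


def confidence(antecedent: FrozenSet[str], label: str, records: List[Record]) -> float:
    """Among records containing the antecedent, the fraction carrying the label."""
    covered = [cls for items, cls in records if antecedent <= items]
    if not covered:
        return 0.0
    return sum(1 for cls in covered if cls == label) / len(covered)


def filter_rules(rules: List[Rule], records: List[Record], min_conf: float) -> List[Rule]:
    """Keep only rules whose confidence reaches the (assumed) threshold."""
    return [(a, c) for a, c in rules if confidence(a, c, records) >= min_conf]
```

With these toy records, a rule such as "windy=no implies play" is kept at a threshold of 0.8, while "outlook=sunny implies play" (confidence 2/3) is filtered out.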

Universiti Teknologi Malaysia Institutional Repository

The Bases of Association Rules of High Confidence

Author: Adaricheva Kira
Cabot-Miller Justin
Nation J. B.
Segal Oren
Sharafudinov Anuar
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 05/08/2018

We develop a new approach for distributed computing of the association rules of high confidence in a binary table. It is derived from the D-basis algorithm in K. Adaricheva and J.B. Nation (TCS 2017), which is performed on multiple sub-tables of a table obtained by removing several rows at a time. The sets of rules are then aggregated using the same approach by which the D-basis is retrieved from a larger set of implications. This makes it possible to obtain a basis of association rules of high confidence, which can be used for ranking all attributes of the table with respect to a given fixed attribute using the relevance parameter introduced in K. Adaricheva et al. (Proceedings of ICFCA-2015). This paper focuses on the technical implementation of the new algorithm. Some tests are performed on transaction data and medical data. Comment: Presented at DTMN, Sydney, Australia, July 28, 201
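The link between row removal and high-confidence rules can be sketched as follows. This is not the D-basis algorithm and does not reproduce the paper's aggregation step; it only shows, on a hypothetical binary table, that a rule which fails on the full table may hold exactly on a sub-table with a few rows removed. All names and data are assumptions.

```python
from itertools import combinations
from typing import List, Set

# Hypothetical binary table: each row is the set of attributes equal to 1.
TABLE: List[Set[str]] = [
    {"a", "b"},
    {"a", "b"},
    {"a", "b", "c"},
    {"a"},  # the only row violating a -> b
    {"c"},
]


def holds(premise: Set[str], conclusion: str, rows: List[Set[str]]) -> bool:
    """True if every row containing the premise also contains the conclusion."""
    return all(conclusion in row for row in rows if premise <= row)


def confidence(premise: Set[str], conclusion: str, rows: List[Set[str]]) -> float:
    """Share of premise-containing rows that also contain the conclusion."""
    covered = [row for row in rows if premise <= row]
    if not covered:
        return 1.0  # vacuously true when no row contains the premise
    return sum(1 for row in covered if conclusion in row) / len(covered)


def holds_after_removal(premise: Set[str], conclusion: str,
                        rows: List[Set[str]], k: int) -> bool:
    """True if the rule holds on some sub-table obtained by removing k rows."""
    for removed in combinations(range(len(rows)), k):
        sub = [row for i, row in enumerate(rows) if i not in removed]
        if holds(premise, conclusion, sub):
            return True
    return False
```

Here the rule a -> b has confidence 0.75 on the full table and becomes an exact implication once the single violating row is removed.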

arXiv.org e-Print Archive

Crossref

Mining Frequent Itemsets Using Genetic Algorithm

Author: Biswas Sushanta
Ghosh Soumadip
Sarkar Debasree
Sarkar Partha Pratim
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 01/11/2010

In general, frequent itemsets are generated from large data sets by applying association rule mining algorithms such as Apriori, Partition, Pincer-Search, Incremental, and Border, which take too much computer time to compute all the frequent itemsets. A Genetic Algorithm (GA) can improve on this. The major advantage of using a GA to discover frequent itemsets is that it performs a global search, and its time complexity is lower than that of the other algorithms because the genetic algorithm is based on the greedy approach. The main aim of this paper is to find all the frequent itemsets in given data sets using a genetic algorithm
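A genetic search for frequent itemsets might look roughly like the sketch below. It is not the paper's method: the bit-string encoding, tournament selection, one-point crossover, mutation rate and toy transactions are all assumptions chosen for illustration. Being a heuristic, this sketch may miss some frequent itemsets; it only guarantees that every itemset it returns meets the support threshold.

```python
import random
from typing import FrozenSet, List, Set

# Hypothetical toy transactions.
TRANSACTIONS: List[Set[str]] = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
]
ITEMS: List[str] = sorted(set().union(*TRANSACTIONS))


def decode(chrom: List[int]) -> FrozenSet[str]:
    """Map a bit string to the itemset whose bits are set."""
    return frozenset(item for item, bit in zip(ITEMS, chrom) if bit)


def support(itemset: FrozenSet[str], transactions: List[Set[str]]) -> float:
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def fitness(chrom: List[int], transactions: List[Set[str]]) -> float:
    """Support of the encoded itemset; the empty itemset scores zero."""
    itemset = decode(chrom)
    return support(itemset, transactions) if itemset else 0.0


def ga_frequent_itemsets(transactions: List[Set[str]], min_support: float,
                         generations: int = 30, pop_size: int = 12,
                         seed: int = 0) -> Set[FrozenSet[str]]:
    rng = random.Random(seed)
    n = len(ITEMS)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    found: Set[FrozenSet[str]] = set()
    for _ in range(generations):
        # Record every frequent itemset present in the current population.
        for chrom in population:
            itemset = decode(chrom)
            if itemset and support(itemset, transactions) >= min_support:
                found.add(itemset)
        next_pop = []
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 2), key=lambda c: fitness(c, transactions))
            p2 = max(rng.sample(population, 2), key=lambda c: fitness(c, transactions))
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.2:     # occasional single-bit mutation
                i = rng.randrange(n)
                child[i] = 1 - child[i]
            next_pop.append(child)
        population = next_pop
    return found
```

Fixing the seed makes a run reproducible, which is convenient when comparing the result against an exhaustive method such as Apriori on small data.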

arXiv.org e-Print Archive

CiteSeerX

Crossref

HybridMiner: Mining Maximal Frequent Itemsets Using Hybrid Database Representation Approach

Author: Baig Abdul Rauf
Bashir Shariq
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/04/2009

In this paper we present a novel hybrid (array-based layout and vertical bitmap layout) database representation approach for mining the complete set of Maximal Frequent Itemsets (MFI) on sparse and large datasets. Our work is novel in terms of scalability, item search order, and two projection techniques (horizontal and vertical). We also present a maximal algorithm using this hybrid database representation approach. Experimental results on real and sparse benchmark datasets show that our approach is better than previous state-of-the-art maximal algorithms. Comment: 8 pages. In the proceedings of 9th IEEE-INMIC 2005, Karachi, Pakistan, 200
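The vertical bitmap layout named in the abstract can be sketched in a few lines. This is not HybridMiner: the maximal itemsets below are found by brute-force enumeration, which is only practical for tiny inputs, and the data and function names are hypothetical. The sketch shows only how per-item bitmaps turn support counting into a bitwise AND followed by a bit count.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical toy transactions; the list index serves as the transaction id.
TRANSACTIONS: List[Set[str]] = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]


def build_bitmaps(transactions: List[Set[str]]) -> Dict[str, int]:
    """Vertical layout: one integer per item, bit t set if transaction t has it."""
    bitmaps: Dict[str, int] = {}
    for tid, t in enumerate(transactions):
        for item in t:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support_count(itemset: Iterable[str], bitmaps: Dict[str, int], n: int) -> int:
    """AND the item bitmaps together and count the remaining set bits."""
    mask = (1 << n) - 1
    for item in itemset:
        mask &= bitmaps.get(item, 0)
    return bin(mask).count("1")


def maximal_frequent(transactions: List[Set[str]], min_count: int) -> List[FrozenSet[str]]:
    """Brute force: list frequent itemsets, keep those with no frequent superset."""
    bitmaps = build_bitmaps(transactions)
    items = sorted(bitmaps)
    n = len(transactions)
    frequent = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if support_count(c, bitmaps, n) >= min_count
    ]
    return [s for s in frequent if not any(s < other for other in frequent)]
```

On the toy data, a minimum count of 3 leaves the three pairs as maximal itemsets, while lowering it to 2 makes the full triple the single maximal itemset.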

arXiv.org e-Print Archive

Crossref