9 research outputs found

Review Paper - High Utility Item sets Mining on Incremental Transactions using UP-Growth and UP-Growth+ Algorithm

Author: Miss. A. A. Bhosale, S. V. Patil, Miss. P. M. Tare, Miss. P. S. Kadam
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/11/2014

High utility pattern mining is an important research area in data mining. High utility itemset mining is the discovery of itemsets with high utility, such as profit, from a database. A number of existing algorithms address this problem, but some of them generate a large number of candidate itemsets, which degrades mining performance in both execution time and space. In this paper we focus on the UP-Growth and UP-Growth+ algorithms, which overcome this limitation. Both use a tree-based data structure, the UP-Tree, to generate candidate itemsets with two scans of the database. In this paper we extend the functionality of these algorithms to incremental databases.
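As a rough illustration of what "high utility" means in the abstract above, the following sketch computes itemset utilities by brute force. It is not UP-Growth or UP-Growth+; the toy data, the `profit` table, and the function names are assumptions made up for this example.

```python
from itertools import combinations

# Hypothetical toy data: unit profit per item, and purchased
# quantities per item in each transaction.
profit = {"a": 5, "b": 2, "c": 1}
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1},
    {"a": 1, "b": 1, "c": 1},
]

def utility(itemset, tx):
    """Utility of an itemset in one transaction (0 if any item is missing)."""
    if not all(item in tx for item in itemset):
        return 0
    return sum(profit[item] * tx[item] for item in itemset)

def total_utility(itemset):
    """Utility of an itemset summed over the whole database."""
    return sum(utility(itemset, tx) for tx in transactions)

def high_utility_itemsets(min_util):
    """Enumerate every itemset and keep those with utility >= min_util."""
    items = sorted(profit)
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            u = total_utility(combo)
            if u >= min_util:
                result[combo] = u
    return result

print(high_utility_itemsets(15))
```

Enumerating every itemset is exponential in the number of items, which is the candidate-explosion problem that tree-based methods aim to reduce.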

International Journal on Recent and Innovation Trends in Computing and Communication

MBiS: an efficient method for mining frequent weighted utility itemsets from quantitative databases

Author: Bảy Võ Đình
Ham Nguyen Duy
Hong Tzung-Pei
Minh Nguyen Thi Hong
Publication venue: 'Publishing House for Science and Technology, Vietnam Academy of Science and Technology'
Publication date: 16/03/2015

In recent years, methods for mining quantitative databases have been proposed. However, their processing time is fairly long, which affects the productivity of intelligent systems that use quantitative databases. This study proposes the multi-bit segment (MBiS) structure to store and process tidsets in order to increase the efficiency of mining frequent weighted utility itemsets (FWUIs) from quantitative databases. With this structure, the calculation of the intersection of tidsets between two itemsets becomes more convenient. Based on this structure, the authors define the MBiS-Tree structure and propose an algorithm for mining FWUIs from quantitative databases. Experimental results for a number of databases show that the proposed method outperforms existing methods.
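The general idea of intersecting tidsets stored as bits can be sketched as follows. This is a plain bit-vector illustration using Python integers, not the MBiS structure itself, whose segment layout the abstract does not describe; all names and transaction IDs here are hypothetical.

```python
def tidset_to_bits(tids):
    """Encode a set of transaction IDs as an integer bit vector."""
    bits = 0
    for tid in tids:
        bits |= 1 << tid
    return bits

def bits_to_tidset(bits):
    """Decode an integer bit vector back into a sorted list of IDs."""
    tids = []
    tid = 0
    while bits:
        if bits & 1:
            tids.append(tid)
        bits >>= 1
        tid += 1
    return tids

# Hypothetical tidsets for two itemsets A and B.
tids_a = tidset_to_bits([0, 2, 3, 5])
tids_b = tidset_to_bits([2, 3, 4])

# The tidset of A ∪ B is the bitwise AND of the two tidsets.
tids_ab = tids_a & tids_b
support_count_ab = bin(tids_ab).count("1")

print(bits_to_tidset(tids_ab), support_count_ab)
```

A single AND over machine words replaces an element-by-element set intersection, which is why bit-based tidset representations tend to be fast.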

Vietnam Academy of Science and Technology: Journals Online

AN EFFICIENT ALGORITHM FOR MINING HIGH UTILITY ASSOCIATION RULES FROM LATTICE

Author: Nguyen Loan T.T.
Nguyen Trinh D.D.
Tran Quyen
Vo Bay
Publication venue: 'Publishing House for Science and Technology, Vietnam Academy of Science and Technology'
Publication date: 11/05/2020

In business, most companies focus on growing their profits. Besides considering the profit from each product, they also consider the relationships among products in order to support effective decision making, gain more profit, and attract customers, e.g. through shelf arrangement, product displays, or product marketing. Some approaches to mining high utility association rules have been proposed; however, they consume much memory and require long processing times. This paper proposes the LHAR (Lattice-based for mining High utility Association Rules) algorithm to mine high utility association rules based on a lattice of high utility itemsets. The LHAR algorithm aims to generate high utility association rules during the process of building the lattice of high utility itemsets, and thus it needs less memory and runtime

Vietnam Academy of Science and Technology: Journals Online

A Novel Approach to Extract High Utility Itemsets from Distributed Databases

Author: Kandhasamy Premalatha
Subramanian Kannimuthu
Subramanian Shankar
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 30/01/2013

Traditional approaches in data mining focus on support and confidence measures, which are purely statistical. Because these measures are based on the frequency count of the items, they enable us to derive the frequent itemsets. However, the frequency of the items as a single factor does not represent the interestingness of the items. Several studies have been conducted to enhance data mining tasks by taking the value of the product into account. This resulted in utility mining, an emerging field of research in data mining. In recent years, various data mining approaches have been implemented to find high utility itemsets. The main objective of utility mining is to identify the itemsets with the highest utilities by considering subjectively defined utility values set by the user. Existing utility mining methods focus on centralized systems, where the data and the associated processing are confined to a particular location. As a further step, we try to implement the utility mining concept in a distributed environment. In this approach we mine high utility itemsets using a Fast Utility Mining (FUM) algorithm.
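The support and confidence measures mentioned in the abstract above can be sketched as follows. The transactions and function names are hypothetical and chosen only for illustration.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def count(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for tx in transactions if set(itemset) <= tx)

def support(itemset):
    """Fraction of transactions that contain the itemset."""
    return count(itemset) / len(transactions)

def confidence(antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction
    that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return count(both) / count(antecedent)

print(support({"bread", "milk"}))
print(confidence({"bread"}, {"milk"}))
```

Both measures depend only on how often items occur, not on quantities or profits, which is the limitation that utility mining is meant to address.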

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

MINING HIGH UTILITY ITEMSETS WITH NEGATIVE PROFITS IN VERTICALLY DISTRIBUTED DATABASES

Author: Anh Cao Tùng
Huy Ngô Quốc
Khang Võ Hoàng
Publication venue: Dalat University (Trường Đại học Đà Lạt)
Publication date: 30/09/2020

High Utility Itemset (HUI) mining is an important problem in the data mining literature that considers the utilities of items to a business (such as profits and margins) discovered from transactional databases. HUI mining has been widely studied and published in recent years. Many algorithms mine high utility itemsets (HUIs) by pruning candidates based on estimated utility values and transaction-weighted utilization values, with the aim of reducing the search space. In this paper, we propose a method for mining HUIs with negative unit profits from vertically distributed databases. This method does not integrate the local databases of the participating parties into a centralized database, and it scans each party's database only once. Experiments show that this method runs faster than mining on the centralized database.
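Pruning by transaction-weighted utilization (TWU) can be sketched as follows. This shows only the classic bound, assuming non-negative unit profits and a single centralized database; it is not the distributed method proposed in the paper, and handling negative unit profits generally requires adjusting the bound, which is not shown here. The data and names are hypothetical.

```python
# Hypothetical toy data with non-negative unit profits.
profit = {"a": 5, "b": 2, "c": 1}
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1},
    {"a": 1, "b": 1, "c": 1},
]

def transaction_utility(tx):
    """Total utility of all items in one transaction."""
    return sum(profit[item] * qty for item, qty in tx.items())

def twu_per_item():
    """TWU of an item: sum of the utilities of the transactions containing it."""
    twu = {}
    for tx in transactions:
        tu = transaction_utility(tx)
        for item in tx:
            twu[item] = twu.get(item, 0) + tu
    return twu

def promising_items(min_util):
    """Items whose TWU reaches min_util. With non-negative profits, no
    itemset containing a discarded item can reach min_util."""
    return sorted(item for item, v in twu_per_item().items() if v >= min_util)

print(twu_per_item())
print(promising_items(22))
```

Because TWU overestimates the utility of every itemset that contains the item, discarding items below the threshold is safe and shrinks the search space before any itemset is enumerated.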

Dalat University Journal of Science / Tạp chí khoa học Đại học Đà Lạt

Use of the WIT-Tree and HITS Algorithms for Classifying the Success Level of Poor Family Empowerment

Author: Khomsah Siti
Winarko Edi
Publication venue: 'Universitas Gadjah Mada'
Publication date: 01/01/2017

The success rate of poor-family empowerment can be classified using characteristic patterns extracted from a database of poor-family empowerment data. The purpose of this research is to build a classification model that predicts the level of success of poor families who will receive poverty empowerment assistance. The classification model is built with WARM, which combines two methods, HITS and WIT-tree. HITS is used to obtain the weights of the attributes from the database. These weights are used as the attribute weights in the WIT-tree method. WIT-tree is used to generate association rules that satisfy a minimum weighted support and a minimum weighted confidence. The data consisted of 831 samples of poor families divided into two classes: poor families at the "developing" level and poor families at the "underdeveloped" level. The classification model reaches an accuracy of 86.45% when attributes are weighted using HITS and 66.13% when the attribute weights are defined by the user. This study shows that attribute weights obtained from HITS perform better than attribute weights specified by the user.
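Weighted support and weighted confidence can be sketched as follows. The sketch assumes one common definition from the frequent weighted itemset literature, in which the weight of a transaction is the average weight of its items; the abstract does not state which definition the study uses. The weights are fixed by hand here rather than derived with HITS, and all names and values are hypothetical.

```python
# Hypothetical attribute weights (assumed given, not computed by HITS).
weights = {"a": 0.6, "b": 0.1, "c": 0.3}
transactions = [
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

def transaction_weight(tx):
    """Assumed definition: average weight of the items in the transaction."""
    return sum(weights[item] for item in tx) / len(tx)

def weighted_support(itemset):
    """Weight of the transactions containing the itemset,
    divided by the weight of all transactions."""
    itemset = set(itemset)
    total = sum(transaction_weight(tx) for tx in transactions)
    covered = sum(transaction_weight(tx) for tx in transactions if itemset <= tx)
    return covered / total

def weighted_confidence(antecedent, consequent):
    """Weighted support of the whole rule divided by that of its antecedent."""
    both = set(antecedent) | set(consequent)
    return weighted_support(both) / weighted_support(antecedent)

print(weighted_support({"a"}))
print(weighted_confidence({"a"}, {"b"}))
```

A rule would then be kept only if both values reach the chosen minimum thresholds.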

Directory of Open Access Journals

IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

A new method for mining Frequent Weighted Itemsets based on WIT-trees

Author: Bac Le
Bay Vo
Frans Coenen
Hong
Le
Lin
Vo
Zaki
Publication venue: 'Elsevier BV'

Crossref

A Novel Algorithm for Mining High Utility Itemsets

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

A NOVEL ALGORITHM FOR MINING HIGH UTILITY ITEMSETS

Publication venue: 'International Journal of Advance Engineering and Research Development (IJAERD)'

Crossref