Reduced Data Sets and Entropy-Based Discretization

Author: Grzymala-Busse Jerzy W.
Hippe Zdzislaw S.
Mroczek Teresa
Publication venue: 'MDPI AG'
Publication date: 28/10/2019

This work is licensed under a Creative Commons Attribution 4.0 International License.

Results of experiments on numerical data sets discretized using two methods, global versions of Equal Frequency per Interval and Equal Interval Width, are presented. Globalization of both methods is based on entropy. For the discretized data sets, left and right reducts were computed. For each discretized data set and for the two data sets based, respectively, on left and right reducts, we applied ten-fold cross validation using the C4.5 decision tree generation system. Our main objective was to compare the quality of all three types of data sets in terms of error rate. Additionally, we compared the complexity of the generated decision trees. We show that reduction of data sets may only increase the error rate and that the decision trees generated from reduced decision sets are not simpler than the decision trees generated from non-reduced data sets.
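The two base discretization schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not describe how the entropy-based globalization works, so the sketch only computes per-attribute cut points and a conditional entropy of the decision given the resulting intervals. All function names (`equal_width_cuts`, `equal_frequency_cuts`, `discretize`, `conditional_entropy`) are hypothetical.

```python
from collections import Counter
from math import log2


def equal_width_cuts(values, k):
    """Cut points that split [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + i * step for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Cut points that place roughly the same number of cases in each interval."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, k):
        idx = i * n // k
        # Midpoint between neighbouring sorted values (an assumption of this sketch).
        cuts.append((ordered[idx - 1] + ordered[idx]) / 2)
    return cuts


def discretize(value, cuts):
    """Index of the interval containing value (0 .. len(cuts))."""
    return sum(value >= c for c in cuts)


def conditional_entropy(intervals, labels):
    """Entropy of the decision labels given the interval each case falls into."""
    n = len(labels)
    groups = {}
    for interval, label in zip(intervals, labels):
        groups.setdefault(interval, []).append(label)
    total = 0.0
    for members in groups.values():
        counts = Counter(members)
        h = -sum((c / len(members)) * log2(c / len(members)) for c in counts.values())
        total += (len(members) / n) * h
    return total


values = [1, 2, 3, 4, 10, 20]
labels = ["a", "a", "a", "b", "b", "b"]
width_cuts = equal_width_cuts(values, 2)
freq_cuts = equal_frequency_cuts(values, 2)
width_entropy = conditional_entropy([discretize(v, width_cuts) for v in values], labels)
freq_entropy = conditional_entropy([discretize(v, freq_cuts) for v in values], labels)
```

On this toy attribute the equal-frequency cut separates the two decision classes, while the equal-width cut leaves one interval mixed, so the conditional entropy is lower for the equal-frequency discretization.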

Multidisciplinary Digital Publishing Institute

KU ScholarWorks

A Comparison of Three Voting Methods for Bagging with the MLEM2 Algorithm

Author: Cohagan Clinton
Grzymala-Busse Jerzy W.
Hippe Zdzislaw S.
Publication venue: Kansas City Plant (U.S.)
Publication date: 17/03/2010

This paper presents results of experiments on some data sets using bagging with the MLEM2 rule induction algorithm. Three different methods of ensemble voting were used: voting based on support (a non-democratic voting in which ensembles vote with their strengths), strength only (the ensemble with the largest strength decides to which concept a case belongs), and democratic voting (each ensemble has at most one vote). Our conclusions are that, although democratic voting was the best in most cases, it is not significantly better than voting based on support. Strength voting was the worst voting method.
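The three voting schemes described in the abstract can be sketched in Python as follows. This is an illustration only, built from the parenthetical descriptions above and not from the paper's implementation. It assumes each ensemble that classifies a case returns a hypothetical `(concept, strength)` pair, that ensembles which abstain are simply left out of the list, and that ties do not need special handling.

```python
from collections import Counter, defaultdict


def support_vote(predictions):
    """Non-democratic voting: each ensemble votes with its strength."""
    totals = defaultdict(float)
    for concept, strength in predictions:
        totals[concept] += strength
    return max(totals, key=totals.get)


def strength_vote(predictions):
    """Strength only: the single ensemble with the largest strength decides."""
    return max(predictions, key=lambda p: p[1])[0]


def democratic_vote(predictions):
    """Democratic voting: each ensemble contributes at most one vote."""
    counts = Counter(concept for concept, _ in predictions)
    return counts.most_common(1)[0][0]


# Hypothetical outputs of five ensembles for one case.
predictions = [
    ("flu", 2.0),
    ("flu", 1.0),
    ("healthy", 5.0),
    ("flu", 1.5),
    ("healthy", 0.5),
]
by_support = support_vote(predictions)
by_strength = strength_vote(predictions)
by_democracy = democratic_vote(predictions)
```

In this made-up case the schemes disagree: "healthy" wins under support (5.5 against 4.5) and under strength only, while "flu" wins the democratic vote (three votes against two).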

UNT Digital Library

Reduced Data Sets and Entropy-Based Discretization

Author: Bertolazzi
Fuernkranz
Garey
Grzymala-Busse
Grzymala-Busse
Grzymala-Busse
Jerzy W. Grzymala-Busse
Liu
Nguyen
Pawlak
Quinlan
Stefanowski
Swiniarski
Teresa Mroczek
Zdzislaw S. Hippe
Publication venue: 'MDPI AG'

Crossref