Search CORE

5,327 research outputs found

Recommended from our members

Random Prism: An Alternative to Random Forests.

Author: Bramer Max
Stahl Frederic
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011
Field of study

Ensemble learning techniques generate multiple classifiers, so called base classifiers, whose combined classification results are used in order to increase the overall classification accuracy. In most ensemble classifiers the base classifiers are based on the Top Down Induction of Decision Trees (TDIDT) approach. However, an alternative approach for the induction of rule based classifiers is the Prism family of algorithms. Prism algorithms produce modular classification rules that do not necessarily fit into a decision tree structure. Prism classification rulesets achieve a comparable and sometimes higher classification accuracy compared with decision tree classifiers, if the data is noisy and large. Yet Prism still suffers from overfitting on noisy and large datasets. In practice ensemble techniques tend to reduce the overfitting, however there exists no ensemble learner for modular classification rule inducers such as the Prism family of algorithms. This article describes the first development of an ensemble learner based on the Prism family of algorithms in order to enhance Prism’s classification accuracy by reducing overfitting

Central Archive at the University of Reading

Crossref

Portsmouth University Research Portal (Pure)

Bournemouth University Research Online

Ensembles of probability estimation trees for customer churn prediction

Author: A. Lemmens
A. Prinzie
B. Larivière
D. Cieslak
E. Bauer
F. Provost
F. Provost
J. Burez
J.J. Rodríguez
L. Breiman
L.I. Kuncheva
M.J. Shaw
N. Glady
P. Panov
R. Bryll
R. Quinlan
S. Clemençon
T.K. Ho
W. Reinartz
Y. Freund
Y.S. Kim
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

Customer churn prediction is one of the most, important elements tents of a company's Customer Relationship Management, (CRM) strategy In tins study, two strategies are investigated to increase the lift. performance of ensemble classification models, i.e (1) using probability estimation trees (PETs) instead of standard decision trees as base classifiers; and (n) implementing alternative fusion rules based on lift weights lot the combination of ensemble member's outputs Experiments ale conducted lot font popular ensemble strategics on five real-life chin n data sets In general, the results demonstrate how lift performance can be substantially improved by using alternative base classifiers and fusion tides However: the effect vanes lot the (Idol cut ensemble strategies lit particular, the results indicate an increase of lift performance of (1) Bagging by implementing C4 4 base classifiets. (n) the Random Subspace Method (RSM) by using lift-weighted fusion rules, and (in) AdaBoost, by implementing both

Crossref

Ghent University Academic Bibliography

Two-Stage Bagging Pruning for Reducing the Ensemble Size and Improving the Classification Performance

Author: Chen Bi
Jiang Bo
Shan Guogen
Song Yujie
Zhang Hua
Publication venue: Digital Scholarship@UNLV
Publication date: 01/01/2019
Field of study

Ensemble methods, such as the traditional bagging algorithm, can usually improve the performance of a single classifier. However, they usually require large storage space as well as relatively time-consuming predictions. Many approaches were developed to reduce the ensemble size and improve the classification performance by pruning the traditional bagging algorithms. In this article, we proposed a two-stage strategy to prune the traditional bagging algorithm by combining two simple approaches: accuracy-based pruning (AP) and distance-based pruning (DP). These two methods, as well as their two combinations, “AP+DP” and “DP+AP” as the two-stage pruning strategy, were all examined. Comparing with the single pruning methods, we found that the two-stage pruning methods can furthermore reduce the ensemble size and improve the classification. “AP+DP” method generally performs better than the “DP+AP” method when using four base classifiers: decision tree, Gaussian naive Bayes, K-nearest neighbor, and logistic regression. Moreover, as compared to the traditional bagging, the two-stage method “AP+DP” improved the classification accuracy by 0.88%, 4.06%, 1.26%, and 0.96%, respectively, averaged over 28 datasets under the four base classifiers. It was also observed that “AP+DP” outperformed other three existing algorithms Brag, Nice, and TB assessed on 8 common datasets. In summary, the proposed two-stage pruning methods are simple and promising approaches, which can both reduce the ensemble size and improve the classification accuracy

Directory of Open Access Journals

University of Nevada, Las Vegas Repository

TSE-IDS: A Two-Stage Classifier Ensemble for Intelligent Anomaly-based Intrusion Detection System

Author: Comuzzi Marco
Rhee Kyung-Hyune
Tama Bayu Adhi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

Intrusion detection systems (IDS) play a pivotal role in computer security by discovering and repealing malicious activities in computer networks. Anomaly-based IDS, in particular, rely on classification models trained using historical data to discover such malicious activities. In this paper, an improved IDS based on hybrid feature selection and two-level classifier ensembles is proposed. An hybrid feature selection technique comprising three methods, i.e. particle swarm optimization, ant colony algorithm, and genetic algorithm, is utilized to reduce the feature size of the training datasets (NSL-KDD and UNSW-NB15 are considered in this paper). Features are selected based on the classification performance of a reduced error pruning tree (REPT) classifier. Then, a two-level classifier ensembles based on two meta learners, i.e., rotation forest and bagging, is proposed. On the NSL-KDD dataset, the proposed classifier shows 85.8% accuracy, 86.8% sensitivity, and 88.0% detection rate, which remarkably outperform other classification techniques recently proposed in the literature. Results regarding the UNSW-NB15 dataset also improve the ones achieved by several state of the art techniques. Finally, to verify the results, a two-step statistical significance test is conducted. This is not usually considered by IDS research thus far and, therefore, adds value to the experimental results achieved by the proposed classifier

Crossref

ScholarWorks@UNIST