Parametric Feature Selection for an Enhanced Random Linear Oracle Ensemble Method
The Random Linear Oracle (RLO) uses a classifier fusion-selection approach, replacing each classifier with two mini-ensembles separated by an oracle. This research investigates the effect of t-test feature selection on the classification performance of the RLO ensemble method. The Naïve Bayes (NB) classifier was chosen as the base classifier because of its elegant simplicity and low computational cost. Experiments were carried out using 30 data sets from the UCI Machine Learning Repository. The results showed that the RLO ensemble can greatly improve the ability of the NB classifier to deal with data of different properties. Moreover, the RLO ensemble benefits from the feature selection algorithm: with a properly selected number of features from the t-test, the performance of the ensemble can be improved.
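A rough sketch of this kind of pipeline is given below, assuming scikit-learn's SelectKBest (with an ANOVA F-score, closely related to the two-sample t-test) and GaussianNB as stand-ins; the random hyperplane construction, dataset, and all parameters are illustrative rather than the paper's exact setup.

# Hypothetical sketch of a Random Linear Oracle (RLO) ensemble with a Naive Bayes
# base classifier and univariate feature selection; names and details are
# illustrative, not the authors' exact implementation.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB


class RandomLinearOracleNB:
    def __init__(self, n_members=10, random_state=0):
        self.n_members = n_members
        self.rng = np.random.RandomState(random_state)

    def fit(self, X, y):
        self.members_ = []
        for _ in range(self.n_members):
            # Random hyperplane: perpendicular bisector of two random training points.
            i, j = self.rng.choice(len(X), size=2, replace=False)
            w = X[i] - X[j]
            b = -w.dot((X[i] + X[j]) / 2.0)
            side = X.dot(w) + b >= 0
            # One mini Naive Bayes classifier per side of the oracle hyperplane.
            pair = []
            for mask in (side, ~side):
                clf = GaussianNB()
                # Fall back to the full data if a side is degenerate.
                if mask.sum() < 2 or len(np.unique(y[mask])) < 2:
                    clf.fit(X, y)
                else:
                    clf.fit(X[mask], y[mask])
                pair.append(clf)
            self.members_.append((w, b, pair))
        return self

    def predict(self, X):
        votes = []
        for w, b, (clf_pos, clf_neg) in self.members_:
            side = X.dot(w) + b >= 0
            votes.append(np.where(side, clf_pos.predict(X), clf_neg.predict(X)))
        votes = np.array(votes)
        # Majority vote across ensemble members.
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)


X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Univariate feature selection (ANOVA F-test, closely related to a two-sample t-test).
selector = SelectKBest(f_classif, k=10).fit(X_tr, y_tr)
X_tr_sel, X_te_sel = selector.transform(X_tr), selector.transform(X_te)

model = RandomLinearOracleNB(n_members=15).fit(X_tr_sel, y_tr)
print("accuracy:", (model.predict(X_te_sel) == y_te).mean())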
META-DES.Oracle: Meta-learning and feature selection for ensemble selection
The key issue in Dynamic Ensemble Selection (DES) is defining a suitable
criterion for calculating the classifiers' competence. There are several
criteria available to measure the level of competence of base classifiers, such
as local accuracy estimates and ranking. However, using only one criterion may
lead to a poor estimation of the classifier's competence. In order to deal with
this issue, we have proposed a novel dynamic ensemble selection framework using
meta-learning, called META-DES. An important aspect of the META-DES framework
is that multiple criteria can be embedded in the system encoded as different
sets of meta-features. However, some DES criteria are not suitable for every
classification problem. For instance, local accuracy estimates may produce poor
results when there is a high degree of overlap between the classes. Moreover, a
higher classification accuracy can be obtained if the performance of the
meta-classifier is optimized for the corresponding data. In this paper, we
propose a novel version of the META-DES framework based on the formal
definition of the Oracle, called META-DES.Oracle. The Oracle is an abstract
method that represents an ideal classifier selection scheme. A meta-feature
selection scheme using an overfitting cautious Binary Particle Swarm
Optimization (BPSO) is proposed for improving the performance of the
meta-classifier. The difference between the outputs obtained by the
meta-classifier and those presented by the Oracle is minimized. Thus, the
meta-classifier is expected to obtain results that are similar to the Oracle.
Experiments carried out using 30 classification problems demonstrate that the
optimization procedure based on the Oracle definition leads to a significant
improvement in classification accuracy when compared to previous versions of
the META-DES framework and other state-of-the-art DES techniques.
Comment: Paper published in Information Fusion
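A much-simplified sketch of dynamic ensemble selection with a meta-classifier, in the spirit of META-DES, is given below; only two meta-features (local accuracy in a k-NN region of competence and the base classifier's margin) are used, and the BPSO meta-feature selection is omitted, so the components and parameters are illustrative assumptions rather than the framework itself.

# Simplified sketch of dynamic ensemble selection with a meta-classifier: the
# meta-classifier learns whether a base classifier will label a query correctly,
# from a small, illustrative set of meta-features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression, Perceptron
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors

X, y = make_classification(n_samples=1500, n_features=20, random_state=0)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_dsel, X_te, y_dsel, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Pool of weak linear base classifiers.
pool = BaggingClassifier(Perceptron(), n_estimators=20, random_state=0).fit(X_tr, y_tr)
knn = NearestNeighbors(n_neighbors=7).fit(X_dsel)

def meta_features(clf, X_query):
    """Meta-features of one base classifier for each query sample."""
    _, idx = knn.kneighbors(X_query)
    local_acc = (clf.predict(X_dsel[idx.ravel()]).reshape(idx.shape) ==
                 y_dsel[idx]).mean(axis=1)          # accuracy in region of competence
    margin = np.abs(clf.decision_function(X_query))  # distance to decision boundary
    return np.column_stack([local_acc, margin])

# Train the meta-classifier: does base classifier c correctly label sample x?
Xm, ym = [], []
for clf in pool.estimators_:
    Xm.append(meta_features(clf, X_dsel))
    ym.append(clf.predict(X_dsel) == y_dsel)
meta_clf = LogisticRegression().fit(np.vstack(Xm), np.concatenate(ym))

# Dynamic selection at test time: keep classifiers predicted to be competent.
votes = np.zeros((len(X_te), 2))
for clf in pool.estimators_:
    competent = meta_clf.predict(meta_features(clf, X_te)).astype(bool)
    pred = clf.predict(X_te)
    votes[np.arange(len(X_te)), pred] += competent
print("dynamic selection accuracy:", (votes.argmax(axis=1) == y_te).mean())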
Native Language Identification using Stacked Generalization
Ensemble methods using multiple classifiers have proven to be the most
successful approach for the task of Native Language Identification (NLI),
achieving the current state of the art. However, a systematic examination of
ensemble methods for NLI has yet to be conducted. Additionally, deeper ensemble
architectures such as classifier stacking have not been closely evaluated. We
present a set of experiments using three ensemble-based models, testing each
with multiple configurations and algorithms. This includes a rigorous
application of meta-classification models for NLI, achieving state-of-the-art
results on three datasets from different languages. We also present the first
use of statistical significance testing for comparing NLI systems, showing that
our results are significantly better than the previous state of the art. We
make available a collection of test set predictions to facilitate future
statistical tests.
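The stacking architecture can be sketched as follows, assuming scikit-learn's StackingClassifier and a generic vectorized text corpus as stand-ins for the NLI data and feature set used in the paper.

# Illustrative sketch of classifier stacking (stacked generalization); the
# corpus and base learners here are stand-ins, not the paper's NLI setup.
from sklearn.datasets import fetch_20newsgroups_vectorized
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# Stand-in corpus: any labelled, vectorized text classification data works here.
train = fetch_20newsgroups_vectorized(subset="train")
test = fetch_20newsgroups_vectorized(subset="test")

base_learners = [
    ("svm", LinearSVC()),                        # margin scores feed the meta-level
    ("nb", MultinomialNB()),                     # class probabilities feed the meta-level
    ("lr", LogisticRegression(max_iter=1000)),
]

# The meta-classifier is trained on cross-validated predictions of the base
# learners, which is what distinguishes stacking from simple voting.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=3)
stack.fit(train.data, train.target)
print("stacked accuracy:", stack.score(test.data, test.target))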
Learning to Diversify via Weighted Kernels for Classifier Ensemble
A classifier ensemble should generally combine diverse component classifiers.
However, it is difficult to give a definitive connection between diversity
measures and ensemble accuracy. Given a list of available component classifiers,
how to combine them adaptively and diversely remains a major challenge in the
literature. In this paper, we argue that diversity, not direct diversity on
samples but adaptive diversity with data, is highly correlated with ensemble
accuracy, and we propose a novel technique for classifier ensembles, learning
to diversify, which learns to adaptively combine classifiers by considering
both accuracy and diversity. Specifically, our approach, Learning TO Diversify
via Weighted Kernels (L2DWK), performs classifier combination by optimizing a
direct but simple criterion: maximizing ensemble accuracy and adaptive
diversity simultaneously by minimizing a convex loss function. Given a measure
formulation, the diversity is calculated with weighted kernels (i.e., the
diversity is measured on the component classifiers' outputs which are kernelled
and weighted), and the kernel weights are automatically learned. We minimize
this loss function by estimating the kernel weights in conjunction with the
classifier weights, and propose a self-training algorithm for conducting this
convex optimization procedure iteratively. Extensive experiments on a variety
of 32 UCI classification benchmark datasets show that the proposed approach
consistently outperforms state-of-the-art ensembles such as Bagging, AdaBoost,
Random Forests, Gasen, Regularized Selective Ensemble, and Ensemble Pruning via
Semi-Definite Programming.
Comment: Submitted to IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI)
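A deliberately crude stand-in for the underlying idea is sketched below: non-negative combination weights are learned from a convex objective that trades accuracy against a diversity (here, decorrelation) term. The weighted-kernel diversity of L2DWK and its self-training procedure are not reproduced, and all models and parameters are assumptions for illustration.

# Simplified accuracy-vs-diversity trade-off for learning combination weights.
import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1500, n_features=20, random_state=1)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=1)
X_val, X_te, y_val, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=1)

# Pool of shallow trees trained on bootstrap samples of the training split.
rng = np.random.RandomState(1)
pool = []
for _ in range(15):
    idx = rng.choice(len(X_tr), size=len(X_tr), replace=True)
    pool.append(DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr[idx], y_tr[idx]))

def signed_outputs(X_part):
    # Column j holds the {-1, +1} predictions of classifier j.
    return np.column_stack([2.0 * clf.predict(X_part) - 1.0 for clf in pool])

F_val, F_te = signed_outputs(X_val), signed_outputs(X_te)
t_val, t_te = 2.0 * y_val - 1.0, 2.0 * y_te - 1.0
C = np.corrcoef(F_val, rowvar=False)       # correlation between classifier outputs

def objective(w, lam=0.5):
    # Convex in w: squared error of the weighted vote plus a correlation penalty
    # (penalising correlated members is a crude proxy for encouraging diversity).
    return np.mean((F_val @ w - t_val) ** 2) + lam * w @ C @ w

res = minimize(objective, x0=np.full(len(pool), 1.0 / len(pool)),
               bounds=[(0.0, None)] * len(pool))
w = res.x / res.x.sum()

test_acc = ((F_te @ w >= 0) == (t_te >= 0)).mean()
print("diversity-regularised weighted vote, test accuracy:", test_acc)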
A DEEP analysis of the META-DES framework for dynamic selection of ensemble of classifiers
Dynamic ensemble selection (DES) techniques work by estimating the level of
competence of each classifier from a pool of classifiers. Only the most
competent ones are selected to classify a given test sample. Hence, the key
issue in DES is the criterion used to estimate the level of competence of the
classifiers in predicting the label of a given test sample. In order to perform
a more robust ensemble selection, we proposed the META-DES framework using
meta-learning, where multiple criteria are encoded as meta-features and are
passed down to a meta-classifier that is trained to estimate the competence
level of a given classifier. In this technical report, we present a
step-by-step analysis of each phase of the framework during training and test.
We show how each set of meta-features is extracted as well as their impact on
the estimation of the competence level of the base classifier. Moreover, we
analyze the impact of several factors on system performance, such as the number
of classifiers in the pool, the use of different linear base classifiers, and
the size of the validation data. We show that, using the dynamic selection of
linear classifiers through the META-DES framework, we can solve complex
non-linear classification problems where other combination techniques such as
AdaBoost cannot.
Comment: 47 pages
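The claim that dynamically selected linear classifiers can handle non-linear problems can be illustrated with the sketch below, which uses plain local accuracy (OLA) as the competence criterion instead of META-DES meta-features; the dataset, pool, and parameters are illustrative assumptions.

# Dynamic selection of linear (Perceptron) classifiers on a non-linear problem,
# with competence estimated as local accuracy in a k-NN region of competence.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors

X, y = make_moons(n_samples=1000, noise=0.25, random_state=0)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_dsel, X_te, y_dsel, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

pool = BaggingClassifier(Perceptron(), n_estimators=30, max_samples=0.3,
                         random_state=0).fit(X_tr, y_tr)
knn = NearestNeighbors(n_neighbors=7).fit(X_dsel)

# Pre-compute each base classifier's correctness on the dynamic-selection set.
correct = np.array([clf.predict(X_dsel) == y_dsel for clf in pool.estimators_])

def predict_dynamic(X_query):
    _, nbrs = knn.kneighbors(X_query)
    preds = np.array([clf.predict(X_query) for clf in pool.estimators_])
    out = np.empty(len(X_query), dtype=int)
    for i in range(len(X_query)):
        local_acc = correct[:, nbrs[i]].mean(axis=1)   # competence per classifier
        out[i] = preds[np.argmax(local_acc), i]        # select the most competent one
    return out

static_acc = (pool.predict(X_te) == y_te).mean()       # static majority vote
dynamic_acc = (predict_dynamic(X_te) == y_te).mean()   # dynamic selection (OLA)
print(f"static vote: {static_acc:.3f}  dynamic selection: {dynamic_acc:.3f}")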
Diversity of Ensembles for Data Stream Classification
When constructing a classifier ensemble, diversity among the base classifiers
is one of the important characteristics. Several studies have been made in the
context of standard static data, in particular, when analyzing the relationship
between a high ensemble predictive performance and the diversity of its
components. In addition, ensembles of learning machines have been used to
learn in the presence of concept drift and to adapt to it. However, diversity
measures have not received much research interest in evolving data streams.
Only a few researchers directly consider promoting diversity while constructing
an ensemble or rebuilding it when drift is detected. In this
paper, we present a theoretical analysis of different diversity measures and
relate them to the success of ensemble learning algorithms for streaming data.
The analysis provides a deeper understanding of the concept of diversity and
its impact on online ensemble learning in the presence of concept drift. More
precisely, we are interested in answering the following research question:
which diversity measures are commonly used in the context of static-data
ensembles, and how far are they applicable in the context of streaming-data
ensembles?
Comment: 9 pages, 3 tables, 3 figures
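Two of the pairwise diversity measures commonly discussed for static ensembles, the disagreement measure and Yule's Q statistic, can be computed as in the sketch below; a streaming adaptation would recompute them over sliding windows, which is not shown, and the ensemble and data are illustrative stand-ins.

# Pairwise diversity measures computed from base-classifier correctness.
import numpy as np
from itertools import combinations
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
pool = BaggingClassifier(DecisionTreeClassifier(max_depth=3),
                         n_estimators=10, random_state=0).fit(X_tr, y_tr)

# correctness[i, j] = True if classifier i labels test sample j correctly.
correctness = np.array([clf.predict(X_te) == y_te for clf in pool.estimators_])

def pairwise_diversity(ci, cj):
    n11 = np.sum(ci & cj)        # both correct
    n00 = np.sum(~ci & ~cj)      # both wrong
    n10 = np.sum(ci & ~cj)       # only i correct
    n01 = np.sum(~ci & cj)       # only j correct
    disagreement = (n10 + n01) / len(ci)
    q_stat = (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10 + 1e-12)
    return disagreement, q_stat

pairs = [pairwise_diversity(correctness[i], correctness[j])
         for i, j in combinations(range(len(correctness)), 2)]
dis, q = np.mean(pairs, axis=0)
print(f"mean disagreement: {dis:.3f}  mean Q statistic: {q:.3f}")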
A Classifier-free Ensemble Selection Method based on Data Diversity in Random Subspaces
The Ensemble of Classifiers (EoC) has been shown to be effective in improving
the performance of single classifiers by combining their outputs, and one of
the most important properties involved in the selection of the best EoC from a
pool of classifiers is considered to be classifier diversity. In general,
classifier diversity does not occur randomly, but is generated systematically
by various ensemble creation methods. By using diverse data subsets to train
classifiers, these methods can create diverse classifiers for the EoC. In this
work, we propose a scheme to measure data diversity directly from random
subspaces, and explore the possibility of using it to select the best data
subsets for the construction of the EoC. Our scheme is the first ensemble
selection method to be presented in the literature based on the concept of data
diversity. Its main advantage over the traditional framework (ensemble creation
then selection) is that it obviates the need for classifier training prior to
ensemble selection. A single Genetic Algorithm (GA) and a Multi-Objective
Genetic Algorithm (MOGA) were evaluated to search for the best solutions for
the classifier-free ensemble selection. In both cases, objective functions
based on different clustering diversity measures were implemented and tested.
All the results obtained with the proposed classifier-free ensemble selection
method were compared with the traditional classifier-based ensemble selection
using Mean Classifier Error (ME) and Majority Voting Error (MVE). The
applicability of the method is tested on UCI machine learning problems and NIST
SD19 handwritten numerals.
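A rough sketch of the classifier-free idea follows: candidate random subspaces are scored by the diversity of the data partitions they induce, before any classifier is trained. Clustering diversity is approximated here with one minus the adjusted Rand index between partitions, and the GA/MOGA search is replaced by a simple greedy pass, so this is an illustration of the concept rather than the proposed method.

# Score random subspaces by the diversity of the partitions they induce,
# and greedily pick a mutually diverse subset (no classifier training needed).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.metrics import adjusted_rand_score

X, y = load_digits(return_X_y=True)
rng = np.random.RandomState(0)

# Candidate random subspaces and the partition each induces on the data.
n_candidates, subspace_size, n_select = 20, 16, 5
subspaces = [rng.choice(X.shape[1], size=subspace_size, replace=False)
             for _ in range(n_candidates)]
partitions = [KMeans(n_clusters=10, n_init=5, random_state=0).fit_predict(X[:, s])
              for s in subspaces]

def diversity(i, j):
    # Lower agreement between partitions == more data diversity.
    return 1.0 - adjusted_rand_score(partitions[i], partitions[j])

selected = [0]
while len(selected) < n_select:
    best = max((c for c in range(n_candidates) if c not in selected),
               key=lambda c: min(diversity(c, s) for s in selected))
    selected.append(best)

print("selected subspaces (feature indices):")
for c in selected:
    print(sorted(subspaces[c].tolist()))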
Learning Deep ResNet Blocks Sequentially using Boosting Theory
Deep neural networks are known to be difficult to train due to the
instability of back-propagation. A deep \emph{residual network} (ResNet) with
identity loops remedies this by stabilizing gradient computations. We prove a
boosting theory for the ResNet architecture. We construct weak module
classifiers, each of which contains two of the layers, such that the combined
strong learner is a ResNet. We therefore introduce an alternative deep ResNet
training algorithm, \emph{BoostResNet}, which is particularly suitable for
non-differentiable architectures. Our proposed algorithm merely requires a
sequential training of "shallow ResNets" which are inexpensive. We prove
that the training error decays exponentially with the depth if the
\emph{weak module classifiers} that we train perform slightly better than some
weak baseline. In other words, we propose a weak learning condition and prove a
boosting theory for ResNet under the weak learning condition. Our results apply
to general multi-class ResNets. A generalization error bound based on margin
theory is proved and suggests ResNet's resistance to overfitting in networks
with norm-bounded weights.
Comment: Accepted to ICML 2018
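A heavily simplified illustration of block-wise sequential training in the spirit of BoostResNet is sketched below in PyTorch; the boosting weights, telescoping objective, and weak-learning condition of the actual algorithm are omitted, and the data and dimensions are synthetic assumptions.

# Residual blocks are added and trained one at a time with a temporary linear
# head, earlier blocks staying frozen; only the "shallow ResNets trained
# sequentially" structure of the approach is captured here.
import torch
import torch.nn as nn

torch.manual_seed(0)
d, n_classes = 32, 4
X = torch.randn(2000, d)
y = X[:, :n_classes].argmax(dim=1)               # synthetic labels

def make_block(width):
    return nn.Sequential(nn.Linear(width, width), nn.ReLU(), nn.Linear(width, width))

blocks = nn.ModuleList()
loss_fn = nn.CrossEntropyLoss()

def forward(x, upto):
    for blk in blocks[:upto]:
        x = x + blk(x)                            # identity skip connection
    return x

for t in range(3):                                # add three residual blocks
    new_block, head = make_block(d), nn.Linear(d, n_classes)
    opt = torch.optim.Adam(list(new_block.parameters()) + list(head.parameters()), lr=1e-2)
    with torch.no_grad():
        h = forward(X, upto=len(blocks))          # frozen representation so far
    for _ in range(200):                          # train only the new block + head
        opt.zero_grad()
        loss = loss_fn(head(h + new_block(h)), y)
        loss.backward()
        opt.step()
    blocks.append(new_block)
    with torch.no_grad():
        acc = (head(forward(X, upto=len(blocks))).argmax(dim=1) == y).float().mean().item()
    print(f"block {t + 1}: train accuracy {acc:.3f}")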
Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples
Many machine learning models are vulnerable to adversarial examples: inputs
that are specially crafted to cause a machine learning model to produce an
incorrect output. Adversarial examples that affect one model often affect
another model, even if the two models have different architectures or were
trained on different training sets, so long as both models were trained to
perform the same task. An attacker may therefore train their own substitute
model, craft adversarial examples against the substitute, and transfer them to
a victim model, with very little information about the victim. Recent work has
further developed a technique that uses the victim model as an oracle to label
a synthetic training set for the substitute, so the attacker need not even
collect a training set to mount the attack. We extend these recent techniques
using reservoir sampling to greatly enhance the efficiency of the training
procedure for the substitute model. We introduce new transferability attacks
between previously unexplored (substitute, victim) pairs of machine learning
model classes, most notably SVMs and decision trees. We demonstrate our attacks
on two commercial machine learning classification systems from Amazon (96.19%
misclassification rate) and Google (88.94%) using only 800 queries of the
victim model, thereby showing that existing machine learning approaches are in
general vulnerable to systematic black-box attacks regardless of their
structure.
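A hedged sketch of the transfer step is given below: fast-gradient-sign adversarial examples are crafted against a differentiable substitute (logistic regression) and replayed on a non-differentiable victim (a decision tree). The black-box labelling of the substitute's training set and the reservoir-sampling refinement described above are omitted, and the models, dataset, and epsilon are illustrative.

# Transfer attack between model classes: craft against the substitute,
# evaluate against the victim.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X = MinMaxScaler().fit_transform(X)               # features in [0, 1]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

victim = DecisionTreeClassifier(max_depth=10, random_state=0).fit(X_tr, y_tr)
substitute = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)   # attacker's model

def fgsm(model, X, y, eps):
    """One fast gradient-sign step for multinomial logistic regression."""
    probs = model.predict_proba(X)                        # (n, k)
    onehot = np.eye(len(model.classes_))[y]               # (n, k)
    grad_x = (probs - onehot) @ model.coef_               # d(cross-entropy)/dx
    return np.clip(X + eps * np.sign(grad_x), 0.0, 1.0)

X_adv = fgsm(substitute, X_te, y_te, eps=0.2)
print("victim accuracy, clean examples:         ", victim.score(X_te, y_te))
print("victim accuracy, transferred adversarial:", victim.score(X_adv, y_te))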
Quantum ensembles of quantum classifiers
Quantum machine learning witnesses an increasing amount of quantum algorithms
for data-driven decision making, a problem with potential applications ranging
from automated image recognition to medical diagnosis. Many of those algorithms
are implementations of quantum classifiers, or models for the classification of
data inputs with a quantum computer. Following the success of collective
decision making with ensembles in classical machine learning, this paper
introduces the concept of quantum ensembles of quantum classifiers. Creating
the ensemble corresponds to a state preparation routine, after which the
quantum classifiers are evaluated in parallel and their combined decision is
accessed by a single-qubit measurement. This framework naturally allows for
exponentially large ensembles in which -- similar to Bayesian learning -- the
individual classifiers do not have to be trained. As an example, we analyse an
exponentially large quantum ensemble in which each classifier is weighted
according to its performance in classifying the training data, leading to new
results for quantum as well as classical machine learning.
Comment: 19 pages, 9 figures
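A classical toy analogue of the accuracy-weighted example is sketched below, assuming many random, untrained linear classifiers whose votes are weighted by their (shifted) training accuracy; the quantum state preparation and parallel evaluation are not reproduced, and the weighting rule is one simple choice rather than the paper's exact scheme.

# Accuracy-weighted vote over many random, untrained linear classifiers.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=10, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=3)
t_tr, t_te = 2 * y_tr - 1, 2 * y_te - 1           # labels in {-1, +1}

rng = np.random.RandomState(3)
n_members = 5000
W = rng.randn(n_members, X.shape[1])              # random linear classifiers h_w(x) = sign(w.x)

train_out = np.sign(X_tr @ W.T)                   # (n_train, n_members)
accuracy = (train_out == t_tr[:, None]).mean(axis=0)
weights = accuracy - 0.5                          # reward better-than-chance members,
                                                  # penalise worse-than-chance ones

test_out = np.sign(X_te @ W.T)                    # (n_test, n_members)
ensemble_pred = np.sign(test_out @ weights)       # weighted collective decision
print("weighted random-classifier ensemble accuracy:",
      (ensemble_pred == t_te).mean())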