Robustness Verification for Classifier Ensembles
We give a formal verification procedure that decides whether a classifier
ensemble is robust against arbitrary randomized attacks. Such attacks consist
of a set of deterministic attacks and a distribution over this set. The
robustness-checking problem consists of assessing, given a set of classifiers
and a labelled data set, whether there exists a randomized attack that induces
a certain expected loss against all classifiers. We show the NP-hardness of the
problem and provide an upper bound on the number of attacks that is sufficient
to form an optimal randomized attack. These results provide an effective way to
reason about the robustness of a classifier ensemble. We provide SMT and MILP
encodings to compute optimal randomized attacks or prove that there is no
attack inducing a certain expected loss. In the latter case, the classifier
ensemble is provably robust. Our prototype implementation verifies multiple
neural-network ensembles trained for image-classification tasks. The
experimental results using the MILP encoding are promising both in terms of
scalability and the general applicability of our verification procedure.
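The robustness-checking question in the abstract above can be illustrated with a minimal sketch. It assumes a hypothetical loss table `loss[a][c]` (the loss that deterministic attack `a` inflicts on classifier `c`) and reads "induces a certain expected loss against all classifiers" as "reaches a threshold against every classifier". The function names and numbers are illustrative and are not taken from the paper, and the sketch only evaluates a given randomized attack; it does not reproduce the SMT or MILP encodings.

```python
def expected_losses(loss, dist):
    """Expected loss of a randomized attack against each classifier.

    loss[a][c] is the loss deterministic attack a inflicts on classifier c;
    dist[a] is the probability of choosing attack a.
    """
    n_classifiers = len(loss[0])
    return [
        sum(p * row[c] for p, row in zip(dist, loss))
        for c in range(n_classifiers)
    ]


def induces_loss(loss, dist, threshold):
    """True if the randomized attack reaches `threshold` against every classifier."""
    return min(expected_losses(loss, dist)) >= threshold


# Hypothetical loss table: attack 0 only fools classifier 0,
# attack 1 only fools classifier 1.
LOSS = [[1.0, 0.0],
        [0.0, 1.0]]

deterministic = induces_loss(LOSS, [1.0, 0.0], 0.5)  # a single attack fails
randomized = induces_loss(LOSS, [0.5, 0.5], 0.5)     # an even mix succeeds
```

In this toy table no single deterministic attack reaches an expected loss of 0.5 against both classifiers, while the even mixture does, which is why randomized attacks need to be considered when checking an ensemble.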
Robust Bayesian Linear Classifier Ensembles
The original publication is available at http://www.springerlink.com
Ensemble classifiers combine the classification results of several classifiers.
Simple ensemble methods such as uniform averaging over a set of models
usually provide an improvement over selecting the single best model. Usually probabilistic
classifiers restrict the set of possible models that can be learnt in order to
lower computational complexity costs. In these restricted spaces, where incorrect
modelling assumptions are possibly made, uniform averaging sometimes performs
even better than Bayesian model averaging (BMA). Linear mixtures over sets of models provide
a space that includes uniform averaging as a particular case. We develop two
algorithms for learning maximum a posteriori weights for linear mixtures, based on
expectation maximization and on constrained optimization. We provide a nontrivial
example of the utility of these two algorithms by applying them to one-dependence
estimators. We develop the conjugate distribution for one-dependence estimators and
empirically show that uniform averaging is clearly superior to BMA for this family
of models. After that we empirically show that the maximum a posteriori linear mixture
weights improve accuracy significantly over uniform aggregation.
Peer reviewed
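The statement that linear mixtures include uniform averaging as a particular case can be sketched as follows. This is a minimal illustration, assuming each model outputs a list of class probabilities; the names `mixture_predict` and `uniform_average` are hypothetical, and the sketch does not cover how the maximum a posteriori weights are learnt.

```python
def mixture_predict(model_probs, weights):
    """Combine per-model class probabilities with a linear mixture.

    model_probs[m][k] is model m's probability for class k;
    weights[m] is the non-negative mixture weight of model m (weights sum to 1).
    """
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n_classes = len(model_probs[0])
    return [
        sum(w * probs[k] for w, probs in zip(weights, model_probs))
        for k in range(n_classes)
    ]


def uniform_average(model_probs):
    """Uniform averaging is the linear mixture with equal weights 1/M."""
    m = len(model_probs)
    return mixture_predict(model_probs, [1.0 / m] * m)


# Hypothetical outputs of two models for one instance with two classes.
PROBS = [[0.5, 0.5],
         [1.0, 0.0]]

averaged = uniform_average(PROBS)               # equal weights
weighted = mixture_predict(PROBS, [0.25, 0.75])  # weights favouring model 2
```

Any learnt weight vector plugs into `mixture_predict` in the same way; uniform averaging is simply the choice of equal weights.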
TSE-IDS: A Two-Stage Classifier Ensemble for Intelligent Anomaly-based Intrusion Detection System
Intrusion detection systems (IDS) play a pivotal role in computer security by discovering and repelling malicious activities in computer networks. Anomaly-based IDS, in particular, rely on classification models trained on historical data to discover such activities. In this paper, an improved IDS based on hybrid feature selection and two-level classifier ensembles is proposed. A hybrid feature selection technique comprising three methods, i.e., particle swarm optimization, ant colony algorithm, and genetic algorithm, is used to reduce the feature size of the training datasets (NSL-KDD and UNSW-NB15 are considered in this paper). Features are selected based on the classification performance of a reduced error pruning tree (REPT) classifier. Then, a two-level classifier ensemble based on two meta learners, i.e., rotation forest and bagging, is proposed. On the NSL-KDD dataset, the proposed classifier achieves 85.8% accuracy, 86.8% sensitivity, and 88.0% detection rate, which remarkably outperforms other classification techniques recently proposed in the literature. Results on the UNSW-NB15 dataset also improve on those achieved by several state-of-the-art techniques. Finally, to verify the results, a two-step statistical significance test is conducted. Such a test has not usually been considered in IDS research thus far and therefore adds value to the experimental results achieved by the proposed classifier.
Decentralized learning with budgeted network load using Gaussian copulas and classifier ensembles
We examine a network of learners which address the same classification task
but must learn from different data sets. The learners cannot share data but
instead share their models. Models are shared only once so as to limit
the network load. We introduce DELCO (standing for Decentralized Ensemble
Learning with COpulas), a new approach for aggregating the predictions of
the classifiers trained by each learner. The proposed method aggregates the
base classifiers using a probabilistic model relying on Gaussian copulas.
Experiments on logistic regressor ensembles demonstrate competitive accuracy and
increased robustness when the classifiers are dependent. A companion Python
implementation can be downloaded at https://github.com/john-klein/DELC
Diversity controlled rotating machinery fault detection
Classifier ensembles are increasingly applied to technical diagnostic problems. When dealing with vibration signals, a large number of point features can be extracted. This raises the problem of how to choose the best classifiers for the ensemble. One solution is to use measures that quantify the diversity among the classifier outputs. Because there is no general definition of diversity or method of calculating it, selecting the correct measure is a vital task. This paper presents research on the application of classifier ensembles built with Bagging to the detection of rotating machinery faults. A relationship was found between classification accuracy and the diversity measures.
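As an illustration of what a measure of diversity among classifier outputs can look like, the sketch below computes a simple pairwise disagreement rate. The abstract does not say which measures were studied, so this particular measure, the function names, and the sample predictions are all assumptions made for illustration.

```python
from itertools import combinations


def disagreement(preds_a, preds_b):
    """Fraction of samples on which two classifiers predict different labels."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def mean_pairwise_disagreement(all_preds):
    """Average the disagreement rate over every pair of ensemble members."""
    pairs = list(combinations(all_preds, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical fault (1) / no-fault (0) predictions of three ensemble members.
PREDS = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]

diversity = mean_pairwise_disagreement(PREDS)
```

A value of 0 means all members always agree; larger values indicate a more diverse ensemble, which can then be compared against its classification accuracy.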
On Fairness, Diversity and Randomness in Algorithmic Decision Making
Consider a binary decision-making process where a single machine learning classifier replaces a multitude of humans. We raise questions about the resulting loss of diversity in the decision-making process. We study the potential benefits of using random classifier ensembles instead of a single classifier in the context of fairness-aware learning and demonstrate various attractive properties: (i) an ensemble of fair classifiers is guaranteed to be fair, for several different measures of fairness, (ii) an ensemble of unfair classifiers can still achieve fair outcomes, and (iii) an ensemble of classifiers can achieve better accuracy-fairness trade-offs than a single classifier. Finally, we introduce notions of distributional fairness to characterize further potential benefits of random classifier ensembles.
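Property (ii) can be made concrete with a small sketch. It assumes that a random classifier ensemble draws one member at random for each decision and that fairness is measured as the gap between group acceptance rates; both the fairness measure and the numbers are illustrative assumptions, not taken from the paper.

```python
def ensemble_acceptance_rates(member_rates, weights):
    """Per-group acceptance rate of a random classifier ensemble.

    member_rates[m][g] is the rate at which classifier m accepts group g;
    weights[m] is the probability that classifier m is drawn for a decision.
    """
    n_groups = len(member_rates[0])
    return [
        sum(w * rates[g] for w, rates in zip(weights, member_rates))
        for g in range(n_groups)
    ]


def parity_gap(rates):
    """Difference between the highest and lowest group acceptance rates."""
    return max(rates) - min(rates)


# Hypothetical members: each favours a different group by the same margin.
RATES = [[0.75, 0.25],
         [0.25, 0.75]]

member_gaps = [parity_gap(r) for r in RATES]  # each member alone is unfair
ensemble_gap = parity_gap(ensemble_acceptance_rates(RATES, [0.5, 0.5]))
```

Each member has a gap of 0.5 on its own, but drawing either member with equal probability gives both groups the same acceptance rate, so the gap of the ensemble is zero.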