An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons
In a recently published paper in JMLR, Demšar (2006) recommends a set of non-parametric statistical tests and procedures which can be safely used for comparing the performance of classifiers over multiple data sets. After studying the paper, we realize that it correctly introduces the basic procedures and some of the more advanced ones for comparisons against a control method. However, it does not deal with some advanced topics in depth. Regarding these topics, we focus on more powerful statistical procedures for conducting all pairwise (n × n) comparisons among classifiers. Moreover, we illustrate an easy way of obtaining adjusted, comparable p-values in multiple comparison procedures. This research has been supported by the project TIN2005-08386-C05-01. S. García holds an FPU scholarship from the Spanish Ministry of Education and Science.
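As a concrete illustration of the kind of all-pairwise (n × n) comparison discussed above, here is a minimal Python sketch, not the paper's code: a Friedman omnibus test over per-data-set accuracies, followed by pairwise Wilcoxon signed-rank tests whose p-values are adjusted with Holm's step-down method so they are comparable across hypotheses (the paper itself studies more powerful adjustments for the all-pairwise case). The accuracy matrix is made up.

```python
from itertools import combinations

import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

# Hypothetical scores: rows = data sets, columns = classifiers A, B, C.
scores = np.array([
    [0.81, 0.84, 0.79],
    [0.92, 0.90, 0.88],
    [0.70, 0.74, 0.69],
    [0.65, 0.68, 0.66],
    [0.88, 0.91, 0.85],
])
names = ["A", "B", "C"]

# Omnibus test: do the classifiers differ at all over the data sets?
stat, p = friedmanchisquare(*scores.T)
print(f"Friedman chi2 = {stat:.3f}, p = {p:.4f}")

# All pairwise Wilcoxon signed-rank tests with Holm's step-down
# adjustment, which yields adjusted p-values comparable across pairs.
pairs = list(combinations(range(len(names)), 2))
raw = [wilcoxon(scores[:, i], scores[:, j]).pvalue for i, j in pairs]
m = len(raw)
adjusted = [0.0] * m
running_max = 0.0
for rank, idx in enumerate(np.argsort(raw)):
    running_max = max(running_max, (m - rank) * raw[idx])
    adjusted[idx] = min(1.0, running_max)
for (i, j), p_adj in zip(pairs, adjusted):
    print(f"{names[i]} vs {names[j]}: adjusted p = {p_adj:.4f}")
```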
An empirical evaluation of imbalanced data strategies from a practitioner's point of view
This research tested the following well-known strategies for dealing with binary imbalanced data on 82 different real-life data sets (sampled to imbalance rates of 5%, 3%, 1%, and 0.1%): class weight, SMOTE, underbagging, and a baseline (just the base classifier). As base classifiers we used SVM with an RBF kernel, random forests, and gradient boosting machines, and we measured the quality of the resulting classifier using six different metrics (area under the curve, accuracy, F-measure, G-mean, Matthews correlation coefficient, and balanced accuracy). The best strategy strongly depends on the metric used to measure the quality of the classifier. For AUC and accuracy, class weight and the baseline perform better; for F-measure and MCC, SMOTE performs better; and for G-mean and balanced accuracy, underbagging performs better.
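A minimal sketch of how such a comparison can be set up, using synthetic data rather than the study's 82 data sets: a random forest trained as a baseline, with balanced class weights, and on SMOTE-resampled data (SMOTE comes from the third-party imbalanced-learn package), scored with balanced accuracy and G-mean.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic binary problem with roughly 1% minority class.
X, y = make_classification(n_samples=5000, weights=[0.99, 0.01], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

def g_mean(y_true, y_pred):
    # Geometric mean of sensitivity and specificity.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return np.sqrt(tp / (tp + fn) * tn / (tn + fp))

strategies = {
    "baseline": (RandomForestClassifier(random_state=0), X_tr, y_tr),
    "class weight": (RandomForestClassifier(class_weight="balanced",
                                            random_state=0), X_tr, y_tr),
    "SMOTE": (RandomForestClassifier(random_state=0),
              *SMOTE(random_state=0).fit_resample(X_tr, y_tr)),
}
for name, (clf, X_fit, y_fit) in strategies.items():
    y_pred = clf.fit(X_fit, y_fit).predict(X_te)
    print(f"{name:>12}: bal.acc={balanced_accuracy_score(y_te, y_pred):.3f}, "
          f"G-mean={g_mean(y_te, y_pred):.3f}")
```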
Rank discriminants for predicting phenotypes from RNA expression
Statistical methods for analyzing large-scale biomolecular data are
commonplace in computational biology. A notable example is phenotype prediction
from gene expression data, for instance, detecting human cancers,
differentiating subtypes and predicting clinical outcomes. Still, clinical
applications remain scarce. One reason is that the complexity of the decision
rules that emerge from standard statistical learning impedes biological
understanding, in particular, any mechanistic interpretation. Here we explore
decision rules for binary classification utilizing only the ordering of
expression among several genes; the basic building blocks are then two-gene
expression comparisons. The simplest example, just one comparison, is the TSP
classifier, which has appeared in a variety of cancer-related discovery
studies. Decision rules based on multiple comparisons can better accommodate
class heterogeneity, and thereby increase accuracy, and might provide a link
with biological mechanism. We consider a general framework ("rank-in-context")
for designing discriminant functions, including a data-driven selection of the
number and identity of the genes in the support ("context"). We then specialize
to two examples: voting among several pairs and comparing the median expression
in two groups of genes. Comprehensive experiments assess accuracy relative to
other, more complex, methods, and reinforce earlier observations that simple
classifiers are competitive.

Comment: Published at http://dx.doi.org/10.1214/14-AOAS738 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
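For readers unfamiliar with the TSP rule mentioned above, here is a toy Python sketch of the single-comparison case: pick the gene pair (i, j) whose ordering X_i < X_j best separates the two classes on training data, then classify a new expression profile by that one comparison alone. Data and names are illustrative, not the paper's.

```python
from itertools import combinations

import numpy as np

def fit_tsp(X, y):
    """X: samples-by-genes expression matrix; y: binary labels (0/1)."""
    best, best_pair, best_flip = -1.0, None, False
    for i, j in combinations(range(X.shape[1]), 2):
        less = X[:, i] < X[:, j]
        # Score: difference between classes in how often X_i < X_j holds.
        delta = less[y == 1].mean() - less[y == 0].mean()
        if abs(delta) > best:
            best, best_pair, best_flip = abs(delta), (i, j), delta < 0
    return best_pair, best_flip

def predict_tsp(x, pair, flip):
    i, j = pair
    vote = bool(x[i] < x[j])
    return int(vote != flip)   # flip when the ordering signals class 0

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
y = np.array([0] * 20 + [1] * 20)
X[y == 1, 0] += 2.0            # in class 1, gene 0 tends to exceed gene 1
pair, flip = fit_tsp(X, y)
acc = np.mean([predict_tsp(x, pair, flip) == t for x, t in zip(X, y)])
print(f"chosen pair: {pair}, training accuracy: {acc:.2f}")
```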
Combination of linear classifiers using score function -- analysis of possible combination strategies
In this work, we addressed the issue of combining linear classifiers using their score functions, where the value of the score function depends on the distance from the decision boundary. Two score functions were tested and four different combination strategies were investigated. During the experimental study, the proposed approach was applied to a heterogeneous ensemble and compared to two reference methods: majority voting and model averaging. The comparison was made in terms of seven different quality criteria. The results show that strategies based on the simple average and the trimmed average perform best among the geometrical combination strategies investigated.
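A rough Python sketch of score-based combination under our own simplifying assumptions (not necessarily the paper's exact score functions): each linear classifier's score is its signed distance to the decision boundary, obtained from scikit-learn's decision_function, and the ensemble combines the scores by a simple or trimmed average, with majority voting as a reference.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=400, random_state=0)

# A heterogeneous ensemble of linear classifiers; decision_function
# returns each sample's signed distance to the decision boundary.
members = [LogisticRegression(max_iter=1000), LinearSVC(), RidgeClassifier()]
scores = np.column_stack([m.fit(X, y).decision_function(X) for m in members])

simple_avg = scores.mean(axis=1)
# Trimmed average: drop the lowest and highest score per sample
# (with three members this leaves the median score).
trimmed = np.sort(scores, axis=1)[:, 1:-1].mean(axis=1)
# Reference method: majority vote over each member's predicted sign.
majority = (np.sign(scores).sum(axis=1) > 0).astype(int)

# In-sample accuracy, for brevity; a real study would cross-validate.
print("simple average :", np.mean((simple_avg > 0).astype(int) == y))
print("trimmed average:", np.mean((trimmed > 0).astype(int) == y))
print("majority vote  :", np.mean(majority == y))
```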
Randomized Reference Classifier with Gaussian Distribution and Soft Confusion Matrix Applied to the Improving Weak Classifiers
In this paper, the issue of building the RRC model using probability distributions other than the beta distribution is addressed. More precisely, we propose to build the RRC model using the truncated normal distribution. Heuristic procedures for the expected value and variance of the truncated normal distribution are also proposed. The proposed approach is tested within an SCM-based (soft confusion matrix) model to assess the consequences of applying the truncated normal distribution in the RRC model. The experimental evaluation is performed using four different base classifiers and seven quality measures. The results show that the proposed approach is comparable to the RRC model built using the beta distribution. What is more, for some base classifiers, the truncated-normal-based SCM algorithm turned out to be better at discovering objects coming from minority classes.

Comment: arXiv admin note: text overlap with arXiv:1901.0882
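As a small illustration of the distribution swap described above, the sketch below uses SciPy's truncnorm for a normal distribution truncated to [0, 1] and, for comparison, a beta distribution matched to the same mean and variance by the method of moments; the parameters are hypothetical and the moment matching is our own illustration, not the paper's heuristic procedure.

```python
from scipy.stats import beta, truncnorm

# Hypothetical location/scale of one class support on [0, 1].
mu, sigma = 0.7, 0.15
# truncnorm takes truncation bounds in standard-normal units.
a, b = (0.0 - mu) / sigma, (1.0 - mu) / sigma
tn = truncnorm(a, b, loc=mu, scale=sigma)
print(f"truncated normal on [0, 1]: mean={tn.mean():.4f}, var={tn.var():.6f}")

# A beta distribution with the same mean and variance, i.e. what the
# standard RRC would use, via method-of-moments parameters.
m, v = tn.mean(), tn.var()
common = m * (1 - m) / v - 1
alpha, beta_param = m * common, (1 - m) * common
matched = beta(alpha, beta_param)
print(f"matched Beta({alpha:.2f}, {beta_param:.2f}): "
      f"mean={matched.mean():.4f}, var={matched.var():.6f}")
```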