Classifier selection with permutation tests
This work presents a content-based recommender system for machine learning classifier algorithms. Given a new data set, it recommends the classifier likely to perform best, based on classifier performance over similar known data sets. Similarity is measured with a data set characterization that combines several state-of-the-art metrics covering physical structure, statistics, and information theory. A novelty with respect to prior work is a robust approach based on permutation tests that directly assesses whether a given learning algorithm can exploit the attributes of a data set to predict class labels; we compare it to the more commonly used F-score metric for evaluating classifier performance. To evaluate our approach, we conducted extensive experiments spanning 8 of the main machine learning classification methods with varying configurations and 65 binary data sets, leading to over 2,331 experiments. Our results show that using the information from the permutation test clearly improves the quality of the recommendations.
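To make the permutation-test idea concrete, here is a minimal sketch of a generic label-permutation test for whether a classifier can exploit the attribute-label relationship. This is not the paper's exact characterization procedure; the function name, the chosen classifier, and the synthetic data are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def label_permutation_test(clf, X, y, n_permutations=200, cv=5, seed=0):
    """p-value for H0: the classifier cannot link attributes to labels.

    The observed cross-validated score is compared against scores
    obtained after randomly permuting the class labels, which destroys
    any attribute-label dependence while preserving class priors.
    """
    rng = np.random.default_rng(seed)
    observed = cross_val_score(clf, X, y, cv=cv).mean()
    perm_scores = np.empty(n_permutations)
    for i in range(n_permutations):
        y_perm = rng.permutation(y)
        perm_scores[i] = cross_val_score(clf, X, y_perm, cv=cv).mean()
    # Smoothed permutation p-value: add-one correction avoids p = 0.
    pvalue = (np.sum(perm_scores >= observed) + 1) / (n_permutations + 1)
    return observed, pvalue

# Illustrative run on synthetic binary data.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
score, p = label_permutation_test(LogisticRegression(max_iter=1000), X, y)
print(f"accuracy={score:.3f}, permutation p-value={p:.4f}")
```

A small p-value indicates the classifier's score on the real labels is unlikely under the label-permutation null, i.e., the algorithm is genuinely exploiting attribute-label structure rather than overfitting noise.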
Improved Error Bounds Based on Worst Likely Assignments
Error bounds based on worst likely assignments use permutation tests to
validate classifiers. Worst likely assignments can produce effective bounds
even for data sets with 100 or fewer training examples. This paper introduces a
statistic for use in the permutation tests of worst likely assignments that
improves error bounds, especially for accurate classifiers, which are typically
the classifiers of interest.
Direction-Projection-Permutation for High Dimensional Hypothesis Tests
Motivated by the prevalence of high dimensional low sample size datasets in
modern statistical applications, we propose a general nonparametric framework,
Direction-Projection-Permutation (DiProPerm), for testing high dimensional
hypotheses. The method is aimed at rigorous testing of whether lower
dimensional visual differences are statistically significant. Theoretical
analysis under the non-classical asymptotic regime of dimension going to
infinity for fixed sample size reveals that certain natural variations of
DiProPerm can have very different behaviors. An empirical power study both
confirms the theoretical results and suggests DiProPerm is a powerful test in
many settings. Finally, DiProPerm is applied to a high dimensional gene
expression dataset.
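As a concrete illustration, here is a minimal sketch of one natural DiProPerm variant: the mean-difference direction with a difference-of-projected-means statistic. The paper also studies other direction and statistic choices; the function name and toy data below are assumptions for illustration.

```python
import numpy as np

def diproperm(X1, X2, n_permutations=1000, seed=0):
    """DiProPerm sketch: mean-difference Direction, difference of
    projected means as the univariate statistic, Permutation null.

    Tests H0: the two samples come from the same distribution.
    """
    rng = np.random.default_rng(seed)
    X = np.vstack([X1, X2])
    n1 = len(X1)

    def stat(A, B):
        # Direction: normalized difference of the sample mean vectors.
        d = A.mean(axis=0) - B.mean(axis=0)
        d = d / np.linalg.norm(d)
        # Projection: compare the groups' mean scores along d.
        return (A @ d).mean() - (B @ d).mean()

    observed = stat(X1, X2)
    null = np.empty(n_permutations)
    for i in range(n_permutations):
        # Relabel the pooled sample, then recompute the direction
        # and statistic from scratch for each permutation.
        idx = rng.permutation(len(X))
        null[i] = stat(X[idx[:n1]], X[idx[n1:]])
    pvalue = (np.sum(null >= observed) + 1) / (n_permutations + 1)
    return observed, pvalue

# Toy high dimension, low sample size setting: 500 features, 20 + 20 samples.
rng = np.random.default_rng(1)
X1 = rng.normal(size=(20, 500))
X2 = rng.normal(loc=0.3, size=(20, 500))
print(diproperm(X1, X2))
```

Note that the direction is refit inside every permutation; reusing the direction fit on the original labels would bias the null distribution and invalidate the test.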
Exact and asymptotically robust permutation tests
Given independent samples from P and Q, two-sample permutation tests allow one to construct exact level α tests when the null hypothesis is P=Q. On the other hand, when comparing or testing particular parameters of P and Q, such as their means or medians, permutation tests need not be level α, or even approximately level α in large samples. Under very weak assumptions for comparing estimators, we provide a general test procedure whereby the asymptotic validity of the permutation test holds while retaining the exact rejection probability α in finite samples when the underlying distributions are identical. The ideas are broadly applicable and special attention is given to the k-sample problem of comparing general parameters, whereby a permutation test is constructed which is exact level α under the hypothesis of identical distributions, but has asymptotic rejection probability α under the more general null hypothesis of equality of parameters. A Monte Carlo simulation study is performed as well. A quite general theory is possible based on a coupling construction, as well as a key contiguity argument for the multinomial and multivariate hypergeometric distributions.
Published in the Annals of Statistics (http://www.imstat.org/aos/), http://dx.doi.org/10.1214/13-AOS1090, by the Institute of Mathematical Statistics (http://www.imstat.org).
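The role of studentization can be sketched as follows: a two-sample permutation test for equality of means built on a Welch-type studentized statistic, the kind of statistic for which the permutation test retains asymptotic validity even when the two distributions differ in other respects. This is a simplified illustration, not the paper's full k-sample construction; the function name and toy data are assumptions.

```python
import numpy as np

def studentized_perm_test(x, y, n_permutations=10_000, seed=0):
    """Two-sample permutation test for equality of means using a
    studentized difference-of-means statistic.

    With the plain (unstudentized) difference of means, the permutation
    test can fail to be even approximately level α when P != Q but the
    means agree; studentizing restores asymptotic validity while keeping
    exactness when the distributions are identical.
    """
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    m = len(x)

    def t_stat(a, b):
        # Welch-type studentization by the estimated standard error.
        se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
        return (a.mean() - b.mean()) / se

    observed = t_stat(x, y)
    null = np.empty(n_permutations)
    for i in range(n_permutations):
        perm = rng.permutation(pooled)
        null[i] = t_stat(perm[:m], perm[m:])
    # Two-sided permutation p-value with add-one smoothing.
    return (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_permutations + 1)

# Example: equal means but very different shapes, variances, sample sizes.
rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=40)        # mean 1
y = rng.normal(loc=1.0, scale=3.0, size=200)   # mean 1
print(studentized_perm_test(x, y))
```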
