    Exploratory Analysis of Benchmark Experiments -- An Interactive Approach

    The analysis of benchmark experiments consists in large part of exploratory methods, especially visualizations. In Eugster et al. [2008] we presented a comprehensive toolbox including the bench plot. This plot visualizes the behavior of the algorithms on the individual drawn learning and test samples according to specific performance measures. In this paper we show that an interactive version of the bench plot can easily uncover details and relations unseen with the static version.

    Bench Plot and Mixed Effects Models: First steps toward a comprehensive benchmark analysis toolbox

    Benchmark experiments produce data in a very specific format. The observations are drawn from the performance distributions of the candidate algorithms on resampled data sets. In this paper we introduce new visualisation techniques and show how formal test procedures can be used to evaluate the results. This is the first step towards a comprehensive toolbox of exploratory and inferential analysis methods for benchmark experiments.

    Spider-Man, the Child and the Trickster -- Archetypal Analysis in R

    Archetypal analysis aims to represent observations in a multivariate data set as convex combinations of extremal points. This approach was introduced by Cutler and Breiman (1994); they defined the concrete problem, laid out the theoretical foundations and presented an algorithm written in Fortran, which is available on request. In this paper we present the R package archetypes, which is available on the Comprehensive R Archive Network. The package provides an implementation of the archetypal analysis algorithm within R and different exploratory tools to analyze the algorithm during its execution and its final result. The application of the package is demonstrated on two examples.
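
    The "convex combinations of extremal points" in this abstract can be written out as an optimization problem. The formulation below is a sketch in one common notation, not taken from the listing itself: the symbols $x_i$ (observations), $z_j$ (archetypes), $\alpha_{ij}$, $\beta_{lj}$, $n$ and $k$ are assumed names.

```latex
\min_{\alpha,\,\beta}\; \sum_{i=1}^{n} \Bigl\| x_i - \sum_{j=1}^{k} \alpha_{ij}\, z_j \Bigr\|_2^2
\qquad \text{with} \qquad
z_j = \sum_{l=1}^{n} \beta_{lj}\, x_l ,
\\[4pt]
\text{subject to}\quad
\alpha_{ij} \ge 0,\;\; \sum_{j=1}^{k} \alpha_{ij} = 1,
\qquad
\beta_{lj} \ge 0,\;\; \sum_{l=1}^{n} \beta_{lj} = 1 .
```

    Under these constraints every observation is approximated by a convex combination of the $k$ archetypes, and every archetype is itself a convex combination of observations.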

    Weighted and Robust Archetypal Analysis

    Archetypal analysis represents observations in a multivariate data set as convex combinations of a few extremal points lying on the boundary of the convex hull. Data points which vary from the majority have great influence on the solution; in fact one outlier can break down the archetype solution. This paper adapts the original algorithm to be a robust M-estimator and presents an iteratively reweighted least squares fitting algorithm. As a required first step, the weighted archetypal problem is formulated and solved. The algorithm is demonstrated using both an artificial and a real-world example.

    From Spider-Man to Hero - Archetypal Analysis in R

    Archetypal analysis aims to represent observations in a multivariate data set as convex combinations of extremal points. This approach was introduced by Cutler and Breiman (1994); they defined the concrete problem, laid out the theoretical foundations and presented an algorithm written in Fortran. In this paper we present the R package archetypes, which is available on the Comprehensive R Archive Network. The package provides an implementation of the archetypal analysis algorithm within R and different exploratory tools to analyze the algorithm during its execution and its final result. The application of the package is demonstrated on two examples.

    Exploratory and Inferential Analysis of Benchmark Experiments

    Benchmark experiments produce data in a very specific format. The observations are drawn from the performance distributions of the candidate algorithms on resampled data sets. In this paper we introduce a comprehensive toolbox of exploratory and inferential analysis methods for benchmark experiments based on one or more data sets. We present new visualization techniques, show how formal non-parametric and parametric test procedures can be used to evaluate the results, and, finally, show how the results can be summed up in a statistically correct overall order of the candidate algorithms.

    (Psycho-)Analysis of Benchmark Experiments

    It is common knowledge that certain characteristics of data sets -- such as linear separability or sample size -- determine the performance of learning algorithms. In this paper we propose a formal framework for investigating this relationship. The framework combines three methods, each well established in its respective scientific discipline. Benchmark experiments are the method of choice in machine and statistical learning to compare algorithms with respect to a certain performance measure on particular data sets. To realize the interaction between data sets and algorithms, the data sets are characterized using statistical and information-theoretic measures, a common approach in the field of meta learning to decide which algorithms are suited to particular data sets. Finally, the performance ranking of algorithms on groups of data sets with similar characteristics is determined by means of recursively partitioning Bradley-Terry models, which are commonly used in psychology to study the preferences of human subjects. The result is a tree with splits in data set characteristics which significantly change the performances of the algorithms. The main advantage is the automatic detection of these important characteristics. The framework is introduced using a simple artificial example. Its real-world usage is demonstrated by means of an application example consisting of thirteen well-known data sets and six common learning algorithms. All resources to replicate the examples are available online.
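
    The Bradley-Terry ranking step mentioned in this abstract can be illustrated with a small sketch. It fits a plain Bradley-Terry model with the classic minorization-maximization update; it does not implement the recursive partitioning described in the paper. The function name `fit_bradley_terry`, the algorithm labels and the win counts are hypothetical and chosen only for illustration.

```python
def fit_bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry worth parameters with the classic MM update.

    wins[i][j] is the number of comparisons in which item i beat item j.
    Returns a list of worth parameters that sums to 1.
    """
    n = len(wins)
    worth = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each pairing contributes (number of comparisons) / (sum of worths).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (worth[i] + worth[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom if denom > 0 else worth[i])
        norm = sum(updated)
        worth = [w / norm for w in updated]
    return worth


# Hypothetical data: how often each algorithm outperformed each other one
# across ten resampled data sets per pairing (not taken from the paper).
algorithms = ["algo_a", "algo_b", "algo_c"]
wins = [
    [0, 6, 8],
    [4, 0, 7],
    [2, 3, 0],
]

worth = fit_bradley_terry(wins)
ranking = sorted(zip(algorithms, worth), key=lambda pair: pair[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

    A larger worth parameter means the algorithm is more likely to win a pairwise comparison. In the framework described above, one such model would be fitted per group of data sets with similar characteristics.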

    Morphology of obligate ectosymbionts reveals Paralaxus gen. nov.: A new circumtropical genus of marine stilbonematine nematodes

    Stilbonematinae are a subfamily of conspicuous marine nematodes, distinguished by a coat of sulphur‐oxidizing bacterial ectosymbionts on their cuticle. Like most nematodes, the worm hosts have a relatively simple anatomy and few taxonomically informative characters, and this has resulted in numerous taxonomic reassignments and synonymizations. Recent studies using a combination of morphological and molecular traits have helped to improve the taxonomy of Stilbonematinae but also raised questions on the validity of several genera. Here, we describe a new circumtropically distributed genus Paralaxus (Stilbonematinae) with three species: Paralaxus cocos sp. nov., P. bermudensis sp. nov. and P. columbae sp. nov. We used single worm metagenomes to generate host 18S rRNA and cytochrome c oxidase I (COI) as well as symbiont 16S rRNA gene sequences. Intriguingly, COI alignments and primer matching analyses suggest that the COI is not suitable for PCR‐based barcoding approaches in Stilbonematinae as the genera have a highly diverse base composition and no conserved primer sites. The phylogenetic analyses of all three gene sets, however, confirm the morphological assignments and support the erection of the new genus Paralaxus as well as corroborate the status of the other stilbonematine genera. Paralaxus most closely resembles the stilbonematine genus Laxus in overlapping sets of diagnostic features but can be distinguished from Laxus by the morphology of the genus‐specific symbiont coat. Our re‐analyses of key parameters of the symbiont coat morphology as a character for all Stilbonematinae genera show that with amended descriptions, including the coat, highly reliable genus assignments can be obtained.