Comparative analysis of methods for microbiome study
Microbiome analysis is garnering much interest with benefits including improved treatment options, enhanced capabilities for personalized medicine, greater understanding of the human body, and contributions to ecological study. Data from these communities of bacteria, viruses, and fungi are feature rich, sparse, and have sample sizes not appreciably larger than the feature space, making analysis challenging and necessitating a coordinated approach utilizing multiple techniques alongside domain expertise. This thesis provides an overview and comparative analysis of these methods, with a case study on cirrhosis and hepatic encephalopathy demonstrating a selection of methods. Approaches are considered in a medically motivated context where relationships between microbes in the human body and diseases or conditions are of primary interest, with additional objectives being the identification of how microbes influence each other and how these influences relate to the diseases and conditions being studied. These analysis methods are partitioned into three categories: univariate statistical methods, classifier-based methods, and joint analysis methods. Univariate statistical methods provide results corresponding to how much a single variable or feature differs between groups in the data. Classifier-based approaches can be generalized as those where a classification model with microbe abundance as inputs and disease states as outputs is used, resulting in a predictive model which is then analyzed to learn about the data. The joint analysis category corresponds to techniques which specifically target relationships between microbes and compare those relationships among subpopulations within the data. Despite significant differences between these categories and the individual methods, each has strengths and weaknesses and plays an important role in microbiome analysis.
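The classifier-based category can be illustrated with a minimal stdlib-only sketch: fit a model on abundance features, then estimate each microbe's relevance by permutation importance (the accuracy lost when that feature is shuffled across samples). The nearest-centroid classifier and the synthetic abundance values below are illustrative stand-ins, not methods or data from the thesis.

```python
import random

def fit_centroids(X, y):
    """Per-class mean abundance vectors: a deliberately simple stand-in classifier."""
    centroids = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda c: dist(centroids[c], x))

def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == yi for x, yi in zip(X, y)) / len(y)

def permutation_importance(centroids, X, y, feature, rng):
    """Accuracy drop when one microbe's abundances are shuffled across samples."""
    base = accuracy(centroids, X, y)
    col = [x[feature] for x in X]
    rng.shuffle(col)
    X_perm = [x[:feature] + [v] + x[feature + 1:] for x, v in zip(X, col)]
    return base - accuracy(centroids, X_perm, y)
```

In practice the fitted model would be a stronger learner and the importances would be averaged over many shuffles, but the analysis pattern (predictive model first, interrogation of the model second) is the same.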
Benchmarking in cluster analysis: A white paper
To achieve scientific progress in terms of building a cumulative body of
knowledge, careful attention to benchmarking is of the utmost importance. This
means that proposals of new methods of data pre-processing, new data-analytic
techniques, and new methods of output post-processing, should be extensively
and carefully compared with existing alternatives, and that existing methods
should be subjected to neutral comparison studies. To date, benchmarking and
recommendations for benchmarking have been frequently seen in the context of
supervised learning. Unfortunately, there has been a dearth of guidelines for
benchmarking in an unsupervised setting, with the area of clustering as an
important subdomain. To address this problem, discussion is given to the
theoretical conceptual underpinnings of benchmarking in the field of cluster
analysis by means of simulated as well as empirical data. Subsequently, the
practicalities of how to address benchmarking questions in clustering are dealt
with, and foundational recommendations are made.
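The simulated-data side of such benchmarking reduces to a simple pattern: plant known cluster labels, run each candidate method, and score agreement with the ground truth. The Rand index below is the simplest pairwise-agreement measure (benchmarking studies usually prefer the adjusted Rand index, which corrects for chance agreement); the candidate labelings are invented for illustration.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two labelings agree
    (the pair is either together in both or apart in both)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)

# Planted ground truth for four simulated samples, and two candidate clusterings.
truth    = [0, 0, 1, 1]
method_a = [1, 1, 0, 0]  # same partition under relabeling: perfect recovery
method_b = [0, 1, 0, 1]  # cuts across the planted clusters
```

Because the index compares partitions rather than label names, `method_a` scores a perfect 1.0 despite using swapped labels.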
Combined optimization of feature selection and algorithm parameters in machine learning of language
Comparative machine learning experiments have become an important methodology in empirical approaches to natural language processing (i) to investigate which machine learning algorithms have the 'right bias' to solve specific natural language processing tasks, and (ii) to investigate which sources of information add to accuracy in a learning approach. Using automatic word sense disambiguation as an example task, we show that with the methodology currently used in comparative machine learning experiments, the results may often not be reliable because of the role of and interaction between feature selection and algorithm parameter optimization. We propose genetic algorithms as a practical approach to achieve both higher accuracy within a single approach, and more reliable comparisons.
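The paper's actual setup (word sense disambiguation features and specific learners) is not reproduced here; the stdlib-only sketch below shows only the core mechanism: a genetic algorithm whose chromosome couples a binary feature mask with an algorithm parameter, so both are optimized jointly. The fitness function in the test is a toy stand-in for cross-validated accuracy, and all names and parameter ranges are assumptions for the example.

```python
import random

def evolve(fitness, n_bits, param_range, rng,
           pop_size=20, generations=40, mut_rate=0.1):
    """Minimal genetic algorithm over (feature mask, parameter) chromosomes."""
    def random_ind():
        mask = [rng.randint(0, 1) for _ in range(n_bits)]
        return mask, rng.randint(*param_range)

    pop = [random_ind() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]           # truncation selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            (m1, p1), (m2, p2) = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)          # one-point crossover on the mask
            mask = m1[:cut] + m2[cut:]
            param = rng.choice((p1, p2))
            if rng.random() < mut_rate:             # mutation: flip one mask bit
                i = rng.randrange(n_bits)
                mask[i] = 1 - mask[i]
            if rng.random() < mut_rate:             # mutation: nudge the parameter
                param = min(max(param + rng.choice((-1, 1)),
                                param_range[0]), param_range[1])
            children.append((mask, param))
        pop = parents + children
    return max(pop, key=fitness)
```

Because the top half of each generation is carried over unchanged, the best individual found never gets worse, which is the property the test below checks.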
Implementation of the Random Forest Method for the Imaging Atmospheric Cherenkov Telescope MAGIC
The paper describes an application of the tree classification method Random
Forest (RF), as used in the analysis of data from the ground-based gamma
telescope MAGIC. In such telescopes, cosmic gamma-rays are observed and have to
be discriminated against a dominating background of hadronic cosmic-ray
particles. We describe the application of RF for this gamma/hadron separation.
The RF method often shows superior performance in comparison with traditional
semi-empirical techniques. Critical issues of the method and its implementation
are discussed. An application of the RF method for estimation of a continuous
parameter from related variables, rather than discrete classes, is also
discussed. Comment: 16 pages, 8 figures.
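As a simplified illustration of the bagged-tree idea (not the MAGIC analysis itself, which applies full Random Forests to the telescope's image parameters and also randomizes feature choice at each split), the stdlib-only sketch below bags one-level decision trees on bootstrap resamples and classifies by majority vote. The two toy features and the gamma/hadron labels in the test are invented for the example.

```python
import random
from collections import Counter

def train_stump(X, y):
    """Exhaustively fit a one-level tree: the best (feature, threshold) split."""
    best = (0, 0.0, 0, 0, -1.0)  # feature, threshold, left label, right label, accuracy
    n = len(y)
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            left = [yi for xi, yi in zip(X, y) if xi[f] <= thr]
            right = [yi for xi, yi in zip(X, y) if xi[f] > thr]
            ll = Counter(left).most_common(1)[0][0] if left else 0
            rl = Counter(right).most_common(1)[0][0] if right else 0
            acc = (sum(v == ll for v in left) + sum(v == rl for v in right)) / n
            if acc > best[4]:
                best = (f, thr, ll, rl, acc)
    return best[:4]

def stump_predict(stump, x):
    f, thr, ll, rl = stump
    return ll if x[f] <= thr else rl

def train_forest(X, y, n_trees, rng):
    """Bag stumps on bootstrap resamples of the training set."""
    n = len(y)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def forest_predict(forest, x):
    """Majority vote over the bagged stumps."""
    votes = Counter(stump_predict(s, x) for s in forest)
    return votes.most_common(1)[0][0]
```

Averaging many shallow, individually weak trees trained on perturbed samples is what gives the method its robustness relative to a single hand-tuned cut on one image parameter.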