
    Detecting treatment-subgroup interactions in clustered data with generalized linear mixed-effects model trees

    Identification of subgroups of patients for whom treatment A is more effective than treatment B, and vice versa, is of key importance to the development of personalized medicine. Tree-based algorithms are helpful tools for the detection of such interactions, but none of the available algorithms allow for taking into account clustered or nested dataset structures, which are particularly common in psychological research. Therefore, we propose the generalized linear mixed-effects model tree (GLMM tree) algorithm, which allows for the detection of treatment-subgroup interactions while accounting for the clustered structure of a dataset. The algorithm uses model-based recursive partitioning to detect treatment-subgroup interactions, and a GLMM to estimate the random-effects parameters. In a simulation study, GLMM trees show higher accuracy in recovering treatment-subgroup interactions, higher predictive accuracy, and lower type II error rates than linear-model-based recursive partitioning and mixed-effects regression trees. GLMM trees also show somewhat higher predictive accuracy, on average, than linear mixed-effects models with pre-specified interaction effects. We illustrate the application of GLMM trees on an individual patient-level data meta-analysis of treatments for depression. We conclude that GLMM trees are a promising exploratory tool for the detection of treatment-subgroup interactions in clustered datasets.
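    The GLMM tree algorithm described above is distributed as R software; the sketch below is a minimal, hypothetical usage example assuming the glmertree add-on package and an illustrative data frame depression_df with an outcome, a treatment indicator, a cluster identifier and candidate partitioning covariates (all column names are assumptions, not taken from the paper).

        ## Minimal sketch of a GLMM tree fit, assuming the 'glmertree' R package.
        ## The data frame 'depression_df' and its columns are hypothetical.
        library(glmertree)

        ## Formula layout: response ~ node-specific (treatment) terms |
        ##                  random-effects (cluster) term | partitioning variables
        gt <- lmertree(outcome ~ treatment | center | age + anxiety + duration,
                       data = depression_df)

        plot(gt)        # tree with node-specific treatment effects
        coef(gt)        # fixed-effect estimates per terminal node
        ranef(gt$lmer)  # estimated random effects for the clusters
        predict(gt, newdata = depression_df)  # combines tree and random effects

    For a non-Gaussian outcome, glmertree() accepts the same three-part formula together with a family argument (e.g., family = binomial), mirroring the GLMM case discussed in the abstract.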

    Bias in random forest variable importance measures: Illustrations, sources and a solution

    BACKGROUND: Variable importance measures for random forests have been receiving increased attention as a means of variable selection in many classification tasks in bioinformatics and related scientific fields, for instance to select a subset of genetic markers relevant for the prediction of a certain disease. We show that random forest variable importance measures are a sensible means for variable selection in many applications, but are not reliable in situations where potential predictor variables vary in their scale of measurement or their number of categories. This is particularly important in genomics and computational biology, where predictors often include variables of different types, for example when predictors include both sequence data and continuous variables such as folding energy, or when amino acid sequence data show different numbers of categories. RESULTS: Simulation studies are presented illustrating that, when random forest variable importance measures are used with data of varying types, the results are misleading because suboptimal predictor variables may be artificially preferred in variable selection. The two mechanisms underlying this deficiency are biased variable selection in the individual classification trees used to build the random forest on the one hand, and effects induced by bootstrap sampling with replacement on the other. CONCLUSION: We propose to employ an alternative implementation of random forests that provides unbiased variable selection in the individual classification trees. When this method is applied using subsampling without replacement, the resulting variable importance measures can be used reliably for variable selection even in situations where the potential predictor variables vary in their scale of measurement or their number of categories. The usage of both random forest algorithms and their variable importance measures in the R system for statistical computing is illustrated and documented thoroughly in an application re-analyzing data from a study on RNA editing. Therefore the suggested method can be applied straightforwardly by scientists in bioinformatics research.
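    The alternative implementation proposed in the conclusion can be illustrated in R; the sketch below is a minimal, hypothetical comparison assuming the randomForest and party add-on packages, with a made-up data frame d containing a response y and predictors of mixed types.

        ## Minimal sketch contrasting a standard random forest importance with an
        ## unbiased alternative, assuming the 'randomForest' and 'party' R packages.
        ## The data frame 'd' with response 'y' is hypothetical.
        library(randomForest)
        library(party)

        ## Standard implementation (bootstrap sampling with replacement); its
        ## importance measures can favour continuous predictors or predictors
        ## with many categories when predictor types are mixed.
        rf <- randomForest(y ~ ., data = d, importance = TRUE)
        importance(rf)

        ## Forest of conditional inference trees with unbiased variable selection,
        ## combined with subsampling without replacement via cforest_unbiased().
        cf <- cforest(y ~ ., data = d,
                      controls = cforest_unbiased(ntree = 500, mtry = 3))
        varimp(cf)  # permutation importance based on the unbiased forest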

    Conditional variable importance for random forests

    Random forests are becoming increasingly popular in many scientific fields because they can cope with "small n large p" problems, complex interactions and even highly correlated predictor variables. Their variable importance measures have recently been suggested as screening tools for, e.g., gene expression studies. However, these variable importance measures show a bias towards correlated predictor variables. We identify two mechanisms responsible for this finding: (i) a preference for the selection of correlated predictors in the tree building process and (ii) an additional advantage for correlated predictor variables induced by the unconditional permutation scheme that is employed in the computation of the variable importance measure. Based on these considerations we develop a new, conditional permutation scheme for the computation of the variable importance measure. The resulting conditional variable importance is shown to reflect the true impact of each predictor variable more reliably than the original marginal approach.
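    The conditional permutation scheme developed here can be illustrated with the same forest implementation; the sketch below is a minimal, hypothetical example assuming the party R package and a conditional inference forest cf fitted as in the previous sketch.

        ## Minimal sketch of marginal vs. conditional permutation importance,
        ## assuming the 'party' R package; 'cf' is a cforest fit on hypothetical data.
        library(party)

        ## Marginal (unconditional) permutation importance
        varimp(cf)

        ## Conditional permutation importance: each predictor is permuted within
        ## a grid defined by the predictors it is correlated with, so the measure
        ## reflects its impact conditional on the remaining predictors.
        varimp(cf, conditional = TRUE)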

    A rare exception to Haldane's rule: are X chromosomes key to hybrid incompatibilities?

    This work was funded by NERC (NE/G014906/1, NE/L011255/1, NE/I027800/1). Additional funding from the Orthopterists’ Society to PM is also gratefully acknowledged. The prevalence of Haldane’s rule suggests that sex chromosomes commonly have a key role in reproductive barriers and speciation. However, the majority of research on Haldane’s rule has been conducted in species with conventional sex determination systems (XY and ZW) and exceptions to the rule have been understudied. Here we test the role of X-linked incompatibilities in a rare exception to Haldane’s rule for female sterility in field cricket sister species (Teleogryllus oceanicus and T. commodus). Both have an XO sex determination system. Using three generations of crosses, we introgressed X chromosomes from each species onto different, mixed genomic backgrounds to test predictions about the fertility and viability of each cross type. We predicted that females with two different species’ X chromosomes would suffer reduced fertility and viability compared with females with two parental X chromosomes. However, we found no strong support for such X-linked incompatibilities. Our results preclude X–X incompatibilities and instead support an interchromosomal epistatic basis to hybrid female sterility. We discuss the broader implications of these findings, principally whether deviations from Haldane’s rule might be more prevalent in species without dimorphic sex chromosomes.

    Alumni giving at a small liberal arts college: evidence from consistent and occasional donors

    Conditioning on the observed data is an important and flexible design principle for statistical test procedures. Although generally applicable, permutation tests currently in use are limited to the treatment of special cases, such as contingency tables or K-sample problems. A new theoretical framework for permutation tests opens up the way to a unified and generalized view. We argue that the transfer of such a theory to practical data analysis has important implications in many applications and requires tools that enable the data analyst to compute on the theoretical concepts as closely as possible. We re-analyze four data sets by adapting the general conceptual framework to these non-standard inference procedures and utilizing the coin add-on package in the R system for statistical computing to show what one can gain from going beyond the 'classical' test procedures.
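    The abstract names the coin add-on package as the computational tool; the sketch below is a minimal, hypothetical example of a conditional (permutation) test of independence using that package, with made-up variable names.

        ## Minimal sketch of a permutation test of independence, assuming the
        ## 'coin' R package named in the abstract. The data frame 'd', with a
        ## numeric response and a grouping factor, is hypothetical.
        library(coin)

        ## Conditional inference: the null distribution is generated by permuting
        ## the observed data; "approximate" requests a Monte Carlo approximation
        ## of the permutation distribution rather than the asymptotic one.
        independence_test(response ~ group, data = d, distribution = "approximate")

    The same independence_test() interface covers contingency tables and K-sample problems as special cases, reflecting the unified view described in the abstract.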
