Search CORE

482 research outputs found

Mitsprache oder Effizienz: Wie begegnen Nationalstaaten demokratischen Herausforderungen? OECD-Staaten im Vergleich

Author: Bühlmann M
Müller L
Publication date: 23/11/2008

Potentiometric Selectivities of Ionophore-Doped Ion-Selective Membranes: Concurrent Presence of Primary Ion or Interfering Ion Complexes of Multiple Stoichiometries

Author: Anderson Evan L.
Bühlmann Philippe
Chen Li D.
Chen Xin V.
Da Costa Rosenildo
Gladysz John A.
Yilmaz Ibrahim
Publication venue: 'American Chemical Society (ACS)'
Publication date: 23/01/2019

The selectivities of ionophore-doped ion-selective electrode (ISE) membranes are controlled by the stability and stoichiometry of the complexes between the ionophore, L, and the target and interfering ions ($I^{z_i}$ and $J^{z_j}$, respectively). Well-accepted models predict how these selectivities can be optimized by selection of ideal ionophore-to-ionic site ratios, considering complex stoichiometries and ion charges. These models were developed for systems in which the target and interfering ions each form complexes of only one stoichiometry. However, for a few ISEs, the concurrent presence of two primary ion complexes of different stoichiometries, such as $IL^{z_i}$ and $IL_2^{z_i}$, was reported. Indeed, similar systems were probably often overlooked and are, in fact, more common than the exclusive formation of complexes of higher stoichiometry unless the ionophore is used in excess. Importantly, misinterpreted stoichiometries misguide the design of new ionophores and are likely to result in the formulation of ISE membranes with inferior selectivities. We show here that the presence of two or more complexes of different stoichiometries for a given ion may be inferred experimentally from careful interpretation of the potentiometric selectivities as a function of the ionophore-to-ionic site ratio or from calculations of complex concentrations using experimentally determined complex stabilities. Concurrent formation of $JL^{z_j}$ and $JL_2^{z_j}$ complexes of an interfering ion is shown here to shift the ionophore-to-ionic site ratio that provides the highest selectivities. Formation of $IL_{n-1}^{z_i}$ and $IL_n^{z_i}$ complexes of a primary ion is less of a concern because an optimized membrane typically contains an excess of ionophore, but lower than expected selectivities may be observed if the stepwise complex formation constant, $K_{IL_n}$, is not sufficiently large and the ionophore-to-ionic site ratio does not markedly exceed n.

University of South Wales Research Explorer

FigShare

Noisy Monte Carlo: Convergence of Markov chains with approximate transition kernels

Author: A Caimo
A Dalalyan
A. Boland
AY Mitrophanov
C Andrieu
G Golub
G Robins
GO Roberts
H Robbins
J Møller
J Propp
J-M Marin
JE Besag
L Bottou
L Valiant
M Girolami
MA Beaumont
N Friel
N. Friel
NV Kartashov
P Bühlmann
P. Alquier
R Reeves
R Tibshirani
R. Everitt
S Geman
S Meyn
W Gilks
Publication date: 15/04/2014

Monte Carlo algorithms often aim to draw from a distribution $\pi$ by simulating a Markov chain with transition kernel $P$ such that $\pi$ is invariant under $P$. However, there are many situations for which it is impractical or impossible to draw from the transition kernel $P$. For instance, this is the case with massive datasets, where it is prohibitively expensive to calculate the likelihood, and it is also the case for intractable likelihood models arising from, for example, Gibbs random fields, such as those found in spatial statistics and network analysis. A natural approach in these cases is to replace $P$ by an approximation $\hat{P}$. Using theory from the stability of Markov chains we explore a variety of situations where it is possible to quantify how 'close' the chain given by the transition kernel $\hat{P}$ is to the chain given by $P$. We apply these results to several examples from spatial statistics and network analysis. Comment: This version: results extended to non-uniformly ergodic Markov chains.
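As a toy illustration of the question this abstract poses (how far a chain driven by an approximate kernel drifts from the exact one), the sketch below compares the stationary distributions of a small finite-state chain under an exact and a perturbed transition matrix. The 3-state matrices, the perturbation size `eps`, and all function names are assumptions made for illustration only; they do not come from the paper, which treats general state spaces.

```python
def step(dist, kernel):
    """Advance a distribution one step: row vector times transition matrix."""
    n = len(kernel)
    return [sum(dist[i] * kernel[i][j] for i in range(n)) for j in range(n)]


def stationary(kernel, iters=2000):
    """Approximate the invariant distribution by repeated application of the kernel."""
    n = len(kernel)
    dist = [1.0 / n] * n
    for _ in range(iters):
        dist = step(dist, kernel)
    return dist


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical exact kernel P (each row sums to 1).
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.3, 0.6]]

# Hypothetical approximation P_hat: every row is perturbed by eps.
eps = 0.02
P_hat = [[0.5 - eps, 0.3 + eps, 0.2],
         [0.2, 0.6 - eps, 0.2 + eps],
         [0.1 + eps, 0.3, 0.6 - eps]]

pi = stationary(P)
pi_hat = stationary(P_hat)
tv = total_variation(pi, pi_hat)
print("TV distance between invariant distributions:", tv)
```

In this finite, strictly positive example the distance between the two invariant distributions stays of the same order as the per-row perturbation; the paper's results concern when and how such closeness can be quantified in general.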

arXiv.org e-Print Archive

Central Archive at the University of Reading

Crossref

Research Repository UCD

Irish Universities

Warwick Research Archives Portal Repository

On the combination of omics data for prediction of binary outcomes

Author: A. E. Hoerl
A. J. Vickers
A. Kakourou
B. J. A. Mertens
D. J. Hand
D. R. Cox
E. W. Steyerberg
G. Tutz
H. Liu
H. Zou
L. Breiman
L. Meier
M. E. de Noo
M. J. Pencina
M. Leblanc
M. S. Pepe
M. Stone
P. Bühlmann
P. Jonathan
R. Tibshirani
S. le Cessie
T. Hastie
T. Kneib
Publication date: 14/10/2016

Enrichment of predictive models with new biomolecular markers is an important task in high-dimensional omic applications. Increasingly, clinical studies include several sets of such omics markers available for each patient, measuring different levels of biological variation. As a result, one of the main challenges in predictive research is the integration of different sources of omic biomarkers for the prediction of health traits. We review several approaches for the combination of omic markers in the context of binary outcome prediction, all based on double cross-validation and regularized regression models. We evaluate their performance in terms of calibration and discrimination and we compare their performance with respect to single-omic source predictions. We illustrate the methods through the analysis of two real datasets. On the one hand, we consider the combination of two fractions of proteomic mass spectrometry for the calibration of a diagnostic rule for the detection of early-stage breast cancer. On the other hand, we consider transcriptomics and metabolomics as predictors of obesity using data from the Dietary, Lifestyle, and Genetic determinants of Obesity and Metabolic syndrome (DILGOM) study, a population-based cohort from Finland.

arXiv.org e-Print Archive

Crossref

Selection of tuning parameters in bridge regression models via Bayesian information criterion

Author: A Antoniadis
AE Hoerl
C Park
C-H Zhang
CM Hurvich
G McDonald
G Schwarz
H Zou
J Fan
J Huang
J Lv
JE Frank
K Knight
L Tierney
M Ishiguro
M Yuan
N Sugiura
P Bühlmann
P Craven
PHC Eilers
R Tibshirani
S Konishi
Shuichi Kawano
T Shimamura
WJ Fu
Y Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/04/2012

We consider bridge linear regression modeling, which can produce a sparse or non-sparse model. A crucial point in the model building process is the selection of adjusted parameters, including a regularization parameter and a tuning parameter, in bridge regression models. The choice of the adjusted parameters can be viewed as a model selection and evaluation problem. We propose a model selection criterion for evaluating bridge regression models from a Bayesian viewpoint. This selection criterion enables us to select the adjusted parameters objectively. We investigate the effectiveness of our proposed modeling strategy through some numerical examples. Comment: 20 pages, 5 figures.
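For orientation, the bridge estimator is commonly written as a penalized least-squares problem. The display below uses the textbook form; the symbols $\lambda$ (regularization parameter) and $q$ (exponent, the "tuning parameter" in the abstract's wording) are notation assumed here, and the paper's own scaling of the penalty may differ.

```latex
\hat{\beta}
  = \arg\min_{\beta \in \mathbb{R}^{p}}
    \left\{ \sum_{i=1}^{n} \bigl( y_i - x_i^{\top}\beta \bigr)^{2}
            + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert^{q} \right\},
  \qquad \lambda \ge 0, \quad q > 0 .
```

With $q = 1$ this reduces to the lasso and with $q = 2$ to ridge regression; choices with $q \le 1$ can set coefficients exactly to zero, which is why the model can come out sparse or non-sparse depending on the selected pair $(\lambda, q)$.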

arXiv.org e-Print Archive

Crossref

GLMMLasso: An Algorithm for High-Dimensional Generalized Linear Mixed Models Using L1-Penalization

Author: Bertsekas D.P.
Friedman J.
Groll A.
Ibrahim J.G.
Jiang J.
Jürg Schelldorfer
Lai R.
Lukas Meier
McCulloch C.E.
Molenberghs G.
Ni X.
Pan W.
Peter Bühlmann
Schelldorfer J.
Tibshirani R.
van de Geer S.
Xue L.
Zhou S.
Publication venue: 'Informa UK Limited'
Publication date: 20/11/2012

We propose an L1-penalized algorithm for fitting high-dimensional generalized linear mixed models. Generalized linear mixed models (GLMMs) can be viewed as an extension of generalized linear models for clustered observations. This Lasso-type approach for GLMMs should be used mainly as a variable screening method to reduce the number of variables below the sample size. We then suggest a refitting by maximum likelihood based on the selected variables only. This is an effective correction to overcome problems stemming from the variable screening procedure, which are more severe with GLMMs. We illustrate the performance of our algorithm on simulated as well as on real data examples. Supplemental materials are available online and the algorithm is implemented in the R package glmmixedlasso.

arXiv.org e-Print Archive

Crossref

Conditional variable importance for random forests

Author: A Bureau
Achim Zeileis
Anne-Laure Boulesteix
BJ van Os
C Strobl
Carolin Strobl
E Bauer
JH Silber
K Nicodemus
KJ Archer
KL Lunetta
L Breiman
M Nason
MR Segal
M van der Laan
P Bühlmann
P Good
R Development Core Team
R Diaz-Uriarte
R Feraud
SM Stigler
T Hastie
T Hothorn
TG Dietterich
Thomas Augustin
Thomas Kneib
V Svetnik
W Rodenburg
X Huang
X Xia
Y Lin
Y Qi
Publication venue: BioMed Central
Publication date: 01/01/2008

Random forests are becoming increasingly popular in many scientific fields because they can cope with "small n large p" problems, complex interactions and even highly correlated predictor variables. Their variable importance measures have recently been suggested as screening tools for, e.g., gene expression studies. However, these variable importance measures show a bias towards correlated predictor variables. We identify two mechanisms responsible for this finding: (i) A preference for the selection of correlated predictors in the tree building process and (ii) an additional advantage for correlated predictor variables induced by the unconditional permutation scheme that is employed in the computation of the variable importance measure. Based on these considerations we develop a new, conditional permutation scheme for the computation of the variable importance measure. The resulting conditional variable importance is shown to reflect the true impact of each predictor variable more reliably than the original marginal approach.
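The difference between the unconditional and a conditional permutation step can be sketched in a few lines. This is only a simplified illustration of the permutation mechanics, not the paper's procedure: here the conditioning grid is a hypothetical, pre-binned covariate `z_bin`, whereas the paper derives its grid from the fitted trees. All names and data are assumptions.

```python
import random
from collections import defaultdict


def permute_unconditional(x, rng):
    """Shuffle x over all observations (the marginal scheme)."""
    shuffled = list(x)
    rng.shuffle(shuffled)
    return shuffled


def permute_conditional(x, strata, rng):
    """Shuffle x only among observations that share the same stratum label."""
    groups = defaultdict(list)
    for index, label in enumerate(strata):
        groups[label].append(index)
    result = list(x)
    for indices in groups.values():
        values = [x[i] for i in indices]
        rng.shuffle(values)
        for i, value in zip(indices, values):
            result[i] = value
    return result


rng = random.Random(0)
# Hypothetical data: x is strongly related to the binned covariate z_bin.
z_bin = [0, 0, 0, 1, 1, 1, 2, 2, 2]
x = [1.0, 1.2, 0.9, 5.1, 4.8, 5.3, 9.7, 10.2, 9.9]

x_marginal = permute_unconditional(x, rng)
x_conditional = permute_conditional(x, z_bin, rng)
print(x_marginal)
print(x_conditional)
```

The conditional shuffle keeps each value inside its original stratum, so the association between `x` and `z_bin` survives while the link between `x` and the response is broken within strata. The marginal shuffle destroys both at once.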

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Open Access LMU

Elektronische Publikationen der Wirtschaftsuniversität Wien

Prediction intervals for future BMI values of individual children - a non-parametric approach by quantile boosting

Author: A Beyerlein
A Mayr
Andreas Mayr
B Efron
F Sassi
I Jansen
JB Copas
JH Friedman
JJ Reilly
L Breiman
M Dehghan
N Fenske
N Meinshausen
Nora Fenske
P Bühlmann
R Development Core Team
R Koenker
R Tibshirani
R Whitaker
RA Rigby
T Hastie
T Hothorn
T Kneib
Torsten Hothorn
Y Wei
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: The construction of prediction intervals (PIs) for future body mass index (BMI) values of individual children based on a recent German birth cohort study with n = 2007 children is problematic for standard parametric approaches, as the BMI distribution in childhood is typically skewed depending on age. Methods: We avoid distributional assumptions by directly modelling the borders of PIs by additive quantile regression, estimated by boosting. We point out the concept of conditional coverage to prove the accuracy of PIs. As conditional coverage can hardly be evaluated in practical applications, we conduct a simulation study before fitting child- and covariate-specific PIs for future BMI values and BMI patterns for the present data. Results: The results of our simulation study suggest that PIs fitted by quantile boosting cover future observations with the predefined coverage probability and outperform the benchmark approach. For the prediction of future BMI values, quantile boosting automatically selects informative covariates and adapts to the age-specific skewness of the BMI distribution. The lengths of the estimated PIs are child-specific and increase, as expected, with the age of the child. Conclusions: Quantile boosting is a promising approach to construct PIs with correct conditional coverage in a non-parametric way. It is in particular suitable for the prediction of BMI patterns depending on covariates, since it provides an interpretable predictor structure, inherent variable selection properties and can even account for longitudinal data structures.
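Quantile regression rests on the check (pinball) loss, and the borders of a prediction interval are simply two quantiles. The sketch below shows both ideas in their simplest form, with a constant predictor instead of the additive, boosted predictor used in the paper. The data, function names, and the 25%/75% borders are assumptions for illustration, and the coverage computed here is plain in-sample coverage, not the conditional coverage the abstract discusses.

```python
def pinball_loss(y, prediction, tau):
    """Check loss for quantile level tau: asymmetric absolute error."""
    residual = y - prediction
    return residual * tau if residual >= 0 else residual * (tau - 1.0)


def best_constant(ys, tau):
    """Observed value that minimises the total pinball loss (a sample quantile)."""
    return min(ys, key=lambda c: sum(pinball_loss(y, c, tau) for y in ys))


def coverage(ys, lower, upper):
    """Share of observations falling inside their interval [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(ys, lower, upper) if lo <= y <= hi)
    return inside / len(ys)


observations = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical outcome values

median = best_constant(observations, 0.5)
lower_border = best_constant(observations, 0.25)
upper_border = best_constant(observations, 0.75)

n = len(observations)
interval_coverage = coverage(observations, [lower_border] * n, [upper_border] * n)
print(median, lower_border, upper_border, interval_coverage)
```

Replacing the constant with a covariate-dependent predictor fitted for each border is what turns this into quantile regression; boosting is one way to fit that predictor.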

Crossref

Springer - Publisher Connector

PubMed Central

Open Access LMU

Bias in random forest variable importance measures: Illustrations, sources and a solution

Author: A Bureau
A Dobra
A Liaw
Achim Zeileis
AG Heidema
AL Boulesteix
Anne-Laure Boulesteix
C Furlanello
C Strobl
Carolin Strobl
DN Politis
EC Gunther
H Kim
I Kononenko
J Friedman
K Arun
KL Lunetta
L Breiman
M van der Laan
MM Ward
MP Cummings
MR Segal
P Bühlmann
PJ Bickel
R Development Core Team
R Díaz-Uriarte
R Guha
T Hothorn
TM Therneau
Torsten Hothorn
V Svetnik
X Huang
Y Qi
Y Shih
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Variable importance measures for random forests have been receiving increased attention as a means of variable selection in many classification tasks in bioinformatics and related scientific fields, for instance to select a subset of genetic markers relevant for the prediction of a certain disease. We show that random forest variable importance measures are a sensible means for variable selection in many applications, but are not reliable in situations where potential predictor variables vary in their scale of measurement or their number of categories. This is particularly important in genomics and computational biology, where predictors often include variables of different types, for example when predictors include both sequence data and continuous variables such as folding energy, or when amino acid sequence data show different numbers of categories. RESULTS: Simulation studies are presented illustrating that, when random forest variable importance measures are used with data of varying types, the results are misleading because suboptimal predictor variables may be artificially preferred in variable selection. The two mechanisms underlying this deficiency are biased variable selection in the individual classification trees used to build the random forest on one hand, and effects induced by bootstrap sampling with replacement on the other hand. CONCLUSION: We propose to employ an alternative implementation of random forests, that provides unbiased variable selection in the individual classification trees. When this method is applied using subsampling without replacement, the resulting variable importance measures can be used reliably for variable selection even in situations where the potential predictor variables vary in their scale of measurement or their number of categories. 
The usage of both random forest algorithms and their variable importance measures in the R system for statistical computing is illustrated and documented thoroughly in an application re-analyzing data from a study on RNA editing. Therefore the suggested method can be applied straightforwardly by scientists in bioinformatics research.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Open Access LMU

Elektronische Publikationen der Wirtschaftsuniversität Wien

A boosting method for maximizing the partial area under the ROC curve

Author: A Ben-Dor
BG Lugosi
D Bamber
G Tutz
J Friedman
J Neyman
L Hadjiiski
LE Dodd
LJ van't Veer
M Dettling
MJ Pencina
MS Pepe
MW McIntosh
N Murata
NR Cook
O Komori
Osamu Komori
P Bühlmann
P Zhao
S Eguchi
S Ma
SG Baker
Shinto Eguchi
T Cai
TT Golub
Y Freund
Y Qi
Z Wang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The receiver operating characteristic (ROC) curve is a fundamental tool to assess the discriminant performance of not only a single marker but also a score function combining multiple markers. The area under the ROC curve (AUC) for a score function measures the intrinsic ability of the score function to discriminate between the controls and cases. Recently, the partial AUC (pAUC) has received more attention than the AUC, because a suitable range of the false positive rate can be focused on according to various clinical situations. However, existing pAUC-based methods only handle a few markers and do not take nonlinear combinations of markers into consideration. Results: We have developed a new statistical method that focuses on the pAUC based on a boosting technique. The markers are combined componentwise to maximize the pAUC in the boosting algorithm using natural cubic splines or decision stumps (single-level decision trees), according to the values of markers (continuous or discrete). We show that the resulting score plots are useful for understanding how each marker is associated with the outcome variable. We compare the performance of the proposed boosting method with those of other existing methods, and demonstrate the utility using real data sets. As a result, we obtain much better discrimination performance in the sense of the pAUC in both simulation studies and real data analysis. Conclusions: The proposed method addresses how to combine the markers after a pAUC-based filtering procedure in a high-dimensional setting. Hence, it provides a consistent way of analyzing data based on the pAUC from marker selection to marker combination for discrimination problems. The method can capture not only linear but also nonlinear association between the outcome variable and the markers; such nonlinearity is known to be necessary in general for the maximization of the pAUC.
The method also puts importance on the accuracy of classification performance as well as interpretability of the association, by offering simple and smooth resultant score plots for each marker.
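The quantity being maximized, the partial AUC over a restricted false positive range, can be computed empirically from the ROC curve. The sketch below integrates the empirical ROC curve from a false positive rate of 0 up to a cutoff `max_fpr` with the trapezoid rule. It shows only the evaluation metric, not the boosting method of the paper; the function names, the unnormalized definition of the pAUC, and the example scores are assumptions.

```python
def roc_points(scores, labels):
    """Empirical ROC points (fpr, tpr), sweeping the threshold from high to low."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Observations with tied scores enter the curve together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def partial_auc(scores, labels, max_fpr):
    """Area under the empirical ROC curve for fpr in [0, max_fpr]."""
    points = roc_points(scores, labels)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the final segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical marker scores; label 1 = case, 0 = control.
scores = [0.9, 0.7, 0.6, 0.4]
labels = [1, 0, 1, 0]

full_auc = partial_auc(scores, labels, max_fpr=1.0)
pauc_half = partial_auc(scores, labels, max_fpr=0.5)
print(full_auc, pauc_half)
```

Setting `max_fpr=1.0` recovers the ordinary AUC. Some software rescales the partial area so that it is comparable across cutoffs; this sketch returns the raw area.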

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central