223 research outputs found

How acceptable are antiretrovirals for the prevention of sexually transmitted HIV? A review of research on the acceptability of oral pre-exposure prophylaxis and treatment as prevention

Author: A Aghaizu
A Bourne
A Persson
AB Eisingerich
AM Minnis
BG Williams
BHIVA
BS Mensch
D Havlir
D Rey
D Rojas Castro
DK Smith
DS Krakower
E Elm von
EA Barash
EM Elst Van der
ET Roberts
F Griensven van
F Zhou
G Guest
G Guest
G Mansergh
G Mutua
GR Galindo
IM Poynten
Ingrid Young
International Collaboration on HIV Optimism
JBF Wit de
JC Dombrowski
JT Galea
K Amico
KR Amico
KR Amico
Lisa McDaid
LJ Severy
M Holt
M Holt
M Leonardi
M Mascolini
MJ Laar van de
MJ Mimiaga
MS Cohen
N Crepaz
N Lorente
N Nodin
NC Ware
NS Padian
P Saberi
PL Anderson
PL Vernazza
RA Brooks
RA Brooks
S Kippax
S Kippax
S McCormack
SA Golub
SA Golub
SC Kalichman
T Jackson
V Puro
VK Nguyen
Y Chen
YO Whiteside
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Recent research has demonstrated how antiretrovirals (ARVs) could be effective in the prevention of sexually transmitted HIV. We review research on the acceptability of oral pre-exposure prophylaxis (PrEP) and treatment as prevention (TasP) for HIV prevention amongst potential users. We consider with whom, where and in what context this research has been conducted, how acceptability has been approached, and what research gaps remain. Findings from 33 studies show a lack of TasP research, PrEP studies which have focused largely on men who have sex with men (MSM) in a US context, and varied measures of acceptability. In order to identify when, where and for whom PrEP and TasP would be most appropriate and effective, research is needed in five areas: acceptability of TasP to people living with HIV; motivation for PrEP use and adherence; current perceptions and management of risk; the impact of broader social and structural factors; and consistent definition and operationalisation of acceptability which moves beyond adherence.

Crossref

Springer - Publisher Connector

PubMed Central

Enlighten

Expression profiling to predict outcome in breast cancer: the influence of sample selection

Author: AA Alizadeh
Carsten Peterson
CM Perou
J Khan
J Khan
LJ van't Veer
M Bittner
M West
Markus Ringnér
Mårten Fernö
Patrik Edén
Paul S Meltzer
S Gruvberger
Sofia K Gruvberger
SV Allander
T Sorlie
TR Golub
Åke Borg
Publication venue: BioMed Central
Publication date: 11/10/2002

Gene expression profiling of tumors using DNA microarrays is a promising method for predicting prognosis and treatment response in cancer patients. It was recently reported that expression profiles of sporadic breast cancers could be used to predict disease recurrence better than currently available clinical and histopathological prognostic factors. Having observed an overlap in those data between the genes that predict outcome and those that predict estrogen receptor-α status, we examined their predictive power in an independent data set. We conclude that it may be important to define prognostic expression profiles separately for estrogen receptor-α-positive and estrogen receptor-α-negative tumors.

Lund University Publications

Crossref

PubMed Central

Nearest Template Prediction: A Single-Sample-Based Flexible Class Prediction with Confidence Assessment

Author: A Dupuy
A Subramanian
AI Su
B Weigelt
C Desmedt
E Bair
E Wurmbach
EE Ntzani
J Fan
J Lamb
J Pittman
K Stegmaier
L Xu
LJ van 't Veer
LJ van't Veer
M Reich
M West
M Zervakis
MJ van de Vijver
Patrick Tan
S Michiels
TR Golub
Y Benjamini
Y Hoshida
Y Hoshida
Y Hoshida
Y Wang
Yujin Hoshida
Publication venue: Public Library of Science

Gene-expression signature-based disease classification and clinical outcome prediction have not been introduced into clinical medicine as widely as initially expected, mainly because of the lack of the extensive validation needed for clinical deployment. Obstacles include variable measurements in microarray assays, inconsistent assay platforms, and the analytical requirement for comparable pairs of training and test datasets. Furthermore, as a medical device supporting clinical decision making, the prediction needs to be made for each single patient with a measure of its reliability. To address these issues, there is a need for a flexible prediction method that is less sensitive to differences in experimental and analytical conditions, applicable to each single patient, and able to provide a measure of prediction confidence. The nearest template prediction (NTP) method provides a convenient way to make class predictions, with an assessment of prediction confidence computed in each single patient's gene-expression data, using only a list of signature genes and a test dataset. We demonstrate that the method can be flexibly applied to cross-platform, cross-species, and multiclass predictions without any optimization of analysis parameters.

Crossref

Directory of Open Access Journals

PubMed Central

Optimally splitting cases for training and testing high dimensional classifiers

Author: A Dupuy
A Rosenwald
AM Molinaro
B Efron
C Ambroise
J Schafer
JM Boer
K Fukunaga
K Shedden
Kevin K Dobbin
KI Kim
KK Dobbin
KK Dobbin
L Devroye
L Sun
LJ van't Veer
MD Radmacher
O Ledoit
R Simon
Richard M Simon
RO Duda
S Mukherjee
TR Golub
WJ Fu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: We consider the problem of designing a study to develop a predictive classifier from high dimensional data. A common study design is to split the sample into a training set and an independent test set, where the former is used to develop the classifier and the latter to evaluate its performance. In this paper we address the question of what proportion of the samples should be devoted to the training set. How does this proportion impact the mean squared error (MSE) of the prediction accuracy estimate? Results: We develop a non-parametric algorithm for determining an optimal splitting proportion that can be applied with a specific dataset and classifier algorithm. We also perform a broad simulation study for the purpose of better understanding the factors that determine the best split proportions and to evaluate commonly used splitting strategies (1/2 training or 2/3 training) under a wide variety of conditions. These methods are based on a decomposition of the MSE into three intuitive component parts. Conclusions: By applying these approaches to a number of synthetic and real microarray datasets we show that for linear classifiers the optimal proportion depends on the overall number of samples available and the degree of differential expression between the classes. The optimal proportion was found to depend on the full dataset size (n) and classification accuracy, with higher accuracy and smaller n resulting in more cases assigned to the training set. The commonly used strategy of allocating 2/3 of cases for training was close to optimal for reasonably sized datasets (n ≥ 100) with strong signals (i.e. 85% or greater full dataset accuracy). In general, we recommend use of our nonparametric resampling approach for determining the optimal split. This approach can be applied to any dataset, using any predictor development method, to determine the best split.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Indirect two-sided relative ranking: a robust similarity measure for gene expression data

Author: CM Perou
DE Arking
DE Martin
E Chávez
E Hubbell
ER DeLong
G Natsoulis
G Wei
GJ Kaspers
GJ Kaspers
IM Chakravarti
J Lamb
J Lamb
J Lu
JL DeRisi
KP Seiler
Lise Getoor
LJ van't Veer
Louis Licamele
OG Troyanskaya
R Pieters
SL Pomeroy
T Hongo
TR Golub
W Liu
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: There is a large amount of gene expression data that exists in the public domain. This data has been generated under a variety of experimental conditions. Unfortunately, these experimental variations have generally prevented researchers from accurately comparing and combining this wealth of data, which still hides many novel insights. Results: In this paper we present a new method, which we refer to as indirect two-sided relative ranking, for comparing gene expression profiles that is robust to variations in experimental conditions. This method extends the current best approach, which is based on comparing the correlations of the up and down regulated genes, by introducing a comparison based on the correlations in rankings across the entire database. Because our method is robust to experimental variations, it allows a greater variety of gene expression data to be combined, which, as we show, leads to richer scientific discoveries. Conclusions: We demonstrate the benefit of our proposed indirect method on several datasets. We first evaluate the ability of the indirect method to retrieve compounds with similar therapeutic effects across known experimental barriers, namely vehicle and batch effects, on two independent datasets (one private and one public). We show that our indirect method is able to significantly improve upon the previous state-of-the-art method with a substantial improvement in recall at rank 10 of 97.03% and 49.44%, on each dataset, respectively. Next, we demonstrate that our indirect method results in improved accuracy for classification in several additional datasets. These datasets demonstrate the use of our indirect method for classifying cancer subtypes, predicting drug sensitivity/resistance, and classifying (related) cell types. Even in the absence of a known (i.e., labeled) experimental barrier, the improvement of the indirect method in each of these datasets is statistically significant.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Repository at the University of Maryland

Phenotype Prediction Using Regularized Regression on Genetic Data in the DREAM5 Systems Genetics B Challenge

Author: A de la Fuente
A Roses
AA Alizadeh
AD Weston
BJ Chen
Bonnie Berger
EE Schadt
EE Schadt
George Tucker
GM Furnival
H Zou
I Ruczinski
J Friedman
L Zhou
LJ van 't Veer
M West
Mark Isalan
MV Rockman
Po-Ru Loh
R Tibshirani
RB Brem
RJ Prill
TR Golub
V Emilsson
Y Benjamini
Y Chen
Z Kutalik
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2011

A major goal of large-scale genomics projects is to enable the use of data from high-throughput experimental methods to predict complex phenotypes such as disease susceptibility. The DREAM5 Systems Genetics B Challenge solicited algorithms to predict soybean plant resistance to the pathogen Phytophthora sojae from training sets including phenotype, genotype, and gene expression data. The challenge test set was divided into three subcategories, one requiring prediction based on only genotype data, another on only gene expression data, and the third on both genotype and gene expression data. Here we present our approach, primarily using regularized regression, which received the best-performer award for subchallenge B2 (gene expression only). We found that despite the availability of 941 genotype markers and 28,395 gene expression features, optimal models determined by cross-validation experiments typically used fewer than ten predictors, underscoring the importance of strong regularization in noisy datasets with far more features than samples. We also present substantial analysis of the training and test setup of the challenge, identifying high variance in performance on the gold standard test sets. Funding: National Science Foundation (U.S.) Graduate Research Fellowship Program; National Defense Science and Engineering Graduate Fellowship.

Public Library of Science (PLOS)

CiteSeerX

DSpace@MIT

Crossref

Directory of Open Access Journals

PubMed Central

Testing the additional predictive value of high-dimensional molecular data

Author: AL Boulesteix
AL Boulesteix
Anne-Laure Boulesteix
C Truntzer
G Tutz
H Binder
H Höing
J Fridlyand
J Friedman
J Goeman
JJ Goeman
JJ Goeman
LJ van't Veer
M Schmidberger
O Gevaert
P Bühlmann
P Eden
R Tibshirani
R Tibshirani
S Chiaretti
T Golub
T Hothorn
T Hothorn
Torsten Hothorn
X Li
Y Freund
Y Sun
Publication venue: BioMed Central
Publication date: 01/09/2009

While high-dimensional molecular data such as microarray gene expression data have been used for disease outcome prediction or diagnosis purposes for about ten years in biomedical research, the question of the additional predictive value of such data given that classical predictors are already available has long been under-considered in the bioinformatics literature. We suggest an intuitive permutation-based testing procedure for assessing the additional predictive value of high-dimensional molecular data. Our method combines two well-known statistical tools: logistic regression and boosting regression. We give clear advice for the choice of the only method parameter (the number of boosting iterations). In simulations, our novel approach is found to have very good power in different settings, e.g. few strong predictors or many weak predictors. For illustrative purposes, it is applied to two publicly available cancer data sets. Our simple and computationally efficient approach can be used to globally assess the additional predictive power of a large number of candidate predictors given that a few clinical covariates or a known prognostic index are already available.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Open Access LMU

Top scoring pairs for feature selection in machine learning and applications to cancer outcome prediction

Author: A Statnikov
AC Tan
C Bishop
C Lai
D Geman
DG Beer
I Guyon
I Inza
J Jin
J Weston
LJ van 't Veer
Mark A Kon
MH Asyali
P Baldi
Ping Shi
Qifu Zhu
R Blanco
R Kohavi
S Hanshall
S Ma
S Yoon
SL Pomeroy
Surajit Ray
TM Cover
TR Golub
TS Furey
V Vinaya
VN Vapnik
X Zhang
Y Saeys
Y Wang
Y Wang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The widely used k top scoring pair (k-TSP) algorithm is a simple yet powerful parameter-free classifier. It owes its success in many cancer microarray datasets to an effective feature selection algorithm that is based on relative expression ordering of gene pairs. However, its general robustness does not extend to some difficult datasets, such as those involving cancer outcome prediction, which may be due to the relatively simple voting scheme used by the classifier. We believe that the performance can be enhanced by separating its effective feature selection component and combining it with a powerful classifier such as the support vector machine (SVM). More generally the top scoring pairs generated by the k-TSP ranking algorithm can be used as a dimensionally reduced subspace for other machine learning classifiers. Results: We developed an approach integrating the k-TSP ranking algorithm (TSP) with other machine learning methods, allowing combination of the computationally efficient, multivariate feature ranking of k-TSP with multivariate classifiers such as SVM. We evaluated this hybrid scheme (k-TSP+SVM) in a range of simulated datasets with known data structures. As compared with other feature selection methods, such as a univariate method similar to Fisher's discriminant criterion (Fisher), or a recursive feature elimination embedded in SVM (RFE), TSP is increasingly more effective than the other two methods as the informative genes become progressively more correlated, which is demonstrated both in terms of the classification performance and the ability to recover true informative genes. We also applied this hybrid scheme to four cancer prognosis datasets, in which k-TSP+SVM outperforms k-TSP classifier in all datasets, and achieves either comparable or superior performance to that using SVM alone. In concurrence with what is observed in simulation, TSP appears to be a better feature selector than Fisher and RFE in some of the cancer datasets. Conclusions: The k-TSP ranking algorithm can be used as a computationally efficient, multivariate filter method for feature selection in machine learning. SVM in combination with k-TSP ranking algorithm outperforms k-TSP and SVM alone in simulated datasets and in some cancer prognosis datasets. Simulation studies suggest that as a feature selector, it is better tuned to certain data characteristics, i.e. correlations among informative genes, which is potentially interesting as an alternative feature ranking method in pathway analysis.
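
As a rough illustration of feature ranking by relative expression ordering of gene pairs, the sketch below scores each pair (i, j) by how differently the event "gene i is expressed below gene j" occurs in the two classes, then turns the best pairs into binary features that could be passed to another classifier. The scoring formula is the commonly described top-scoring-pair criterion and is an assumption here, since the abstract does not spell it out. The function names (`tsp_scores`, `top_k_pairs`, `pair_features`) are hypothetical, and the usual restriction to disjoint gene pairs is omitted for brevity.

```python
from itertools import combinations


def tsp_scores(samples, labels):
    """Score each gene pair by |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|.

    samples: list of expression rows (one list of floats per sample).
    labels:  list of 0/1 class labels, one per sample.
    Assumed scoring rule; not taken from the abstract itself.
    """
    by_class = {0: [], 1: []}
    for row, label in zip(samples, labels):
        by_class[label].append(row)

    n_genes = len(samples[0])
    scores = {}
    for i, j in combinations(range(n_genes), 2):
        freqs = []
        for cls in (0, 1):
            rows = by_class[cls]
            # Fraction of samples in this class where gene i is below gene j.
            freqs.append(sum(r[i] < r[j] for r in rows) / len(rows))
        scores[(i, j)] = abs(freqs[0] - freqs[1])
    return scores


def top_k_pairs(samples, labels, k):
    """Return the k highest-scoring pairs (ties broken by pair index)."""
    scores = tsp_scores(samples, labels)
    return sorted(scores, key=lambda pair: (-scores[pair], pair))[:k]


def pair_features(row, pairs):
    """Encode one sample as binary indicators of x_i < x_j for each pair."""
    return [1 if row[i] < row[j] else 0 for i, j in pairs]
```

The binary vectors returned by `pair_features` form the reduced feature space; in a hybrid scheme of the kind the abstract describes, they would be the input to a downstream classifier such as an SVM.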

CiteSeerX

Crossref

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Enlighten

On reliable discovery of molecular signatures

Author: A Heorl
B Efron
C Cortes
D Singh
F Li
I Guyon
I Guyon
J Bogaerts
J Schäfer
Jesper Tegnér
Johan Björkegren
JP Ioannidis
L Devroye
L Ein-Dor
L Ein-Dor
LJ van't Veer
M Campo Dell'Orto
ME Tipping
R Nilsson
R Nilsson
R Nilsson
Roland Nilsson
S Michiels
S Mika
TM Frayling
TR Golub
U Alon
VN Vapnik
Y Benjamini
Y Wang
Y Yu
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Molecular signatures are sets of genes, proteins, genetic variants or other variables that can be used as markers for a particular phenotype. Reliable signature discovery methods could yield valuable insight into cell biology and mechanisms of human disease. However, it is currently not clear how to control error rates such as the false discovery rate (FDR) in signature discovery. Moreover, signatures for cancer gene expression have been shown to be unstable, that is, difficult to replicate in independent studies, casting doubts on their reliability. Results: We demonstrate that with modern prediction methods, signatures that yield accurate predictions may still have a high FDR. Further, we show that even signatures with low FDR may fail to replicate in independent studies due to limited statistical power. Thus, neither stability nor predictive accuracy are relevant when FDR control is the primary goal. We therefore develop a general statistical hypothesis testing framework that for the first time provides FDR control for signature discovery. Our method is demonstrated to be correct in simulation studies. When applied to five cancer data sets, the method was able to discover molecular signatures with 5% FDR in three cases, while two data sets yielded no significant findings. Conclusion: Our approach enables reliable discovery of molecular signatures from genome-wide data with current sample sizes. The statistical framework developed herein is potentially applicable to a wide range of prediction problems in bioinformatics.

Publikationer från Linköpings universitet

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line

A boosting method for maximizing the partial area under the ROC curve

Author: A Ben-Dor
BG Lugosi
D Bamber
G Tutz
J Friedman
J Neyman
L Hadjiiski
LE Dodd
LJ van't Veer
M Dettling
MJ Pencina
MS Pepe
MS Pepe
MS Pepe
MS Pepe
MW McIntosh
N Murata
NR Cook
O Komori
Osamu Komori
P Bühlmann
P Zhao
S Eguchi
S Ma
SG Baker
Shinto Eguchi
T Cai
TT Golub
Y Freund
Y Qi
Z Wang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The receiver operating characteristic (ROC) curve is a fundamental tool to assess the discriminant performance for not only a single marker but also a score function combining multiple markers. The area under the ROC curve (AUC) for a score function measures the intrinsic ability for the score function to discriminate between the controls and cases. Recently, the partial AUC (pAUC) has received more attention than the AUC, because it can focus on a suitable range of the false positive rate according to various clinical situations. However, existing pAUC-based methods only handle a few markers and do not take nonlinear combination of markers into consideration. Results: We have developed a new statistical method that focuses on the pAUC based on a boosting technique. The markers are combined componentially for maximizing the pAUC in the boosting algorithm using natural cubic splines or decision stumps (single-level decision trees), according to the values of markers (continuous or discrete). We show that the resulting score plots are useful for understanding how each marker is associated with the outcome variable. We compare the performance of the proposed boosting method with those of other existing methods, and demonstrate the utility using real data sets. As a result, we have much better discrimination performances in the sense of the pAUC in both simulation studies and real data analysis. Conclusions: The proposed method addresses how to combine the markers after a pAUC-based filtering procedure in high dimensional setting. Hence, it provides a consistent way of analyzing data based on the pAUC from marker selection to marker combination for discrimination problems. The method can capture not only linear but also nonlinear association between the outcome variable and the markers; such nonlinearity is known to be necessary in general for the maximization of the pAUC. The method also puts importance on the accuracy of classification performance as well as interpretability of the association, by offering simple and smooth resultant score plots for each marker.
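
To make the quantity being maximized concrete, the following minimal sketch computes an empirical, unnormalised pAUC over the false positive rate range [0, max_fpr] for a single score. It does not implement the boosting method from the abstract. The function name `partial_auc` is hypothetical, and the estimator assumes `max_fpr` times the number of controls is close to a whole number; otherwise the range is rounded down to the nearest control.

```python
def partial_auc(case_scores, control_scores, max_fpr):
    """Empirical partial AUC over FPR in [0, max_fpr] (unnormalised).

    Only the highest-scoring controls, i.e. those that generate the
    false positives in the chosen FPR range, are compared with the cases.
    With max_fpr = 1.0 this reduces to the ordinary empirical AUC.
    """
    n_cases = len(case_scores)
    n_controls = len(control_scores)

    # Number of controls that fall inside the FPR range (rounded down).
    k = int(max_fpr * n_controls + 1e-9)
    top_controls = sorted(control_scores, reverse=True)[:k]

    wins = 0.0
    for control in top_controls:
        for case in case_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half, as in the Mann-Whitney statistic
    return wins / (n_cases * n_controls)
```

Because the value is unnormalised, its maximum is `max_fpr` rather than 1, so a perfectly separating score evaluated at `max_fpr = 0.5` yields 0.5.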

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central