
    Evaluating classification accuracy for modern learning approaches

    Full text link
    Peer Reviewed
    https://deepblue.lib.umich.edu/bitstream/2027.42/149333/1/sim8103_am.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/149333/2/sim8103.pd

    A Unifying Framework for Evaluating the Predictive Power of Genetic Variants Based on the Level of Heritability Explained

    Get PDF
    An increasing number of genetic variants have been identified for many complex diseases. However, it is controversial whether risk prediction based on genomic profiles will be useful clinically. Appropriate statistical measures to evaluate the performance of genetic risk prediction models are required. Previous studies have mainly focused on the use of the area under the receiver operating characteristic (ROC) curve, or AUC, to judge the predictive value of genetic tests. However, the AUC has its limitations and should be complemented by other measures. In this study, we develop a novel unifying statistical framework that connects a large variety of predictive indices. We show that, given the overall disease probability and the level of variance in total liability (or heritability) explained by the genetic variants, we can estimate analytically a large variety of prediction metrics, for example the AUC, the mean risk difference between cases and non-cases, the net reclassification improvement (the ability to reclassify people into high- and low-risk categories), the proportion of cases explained by a specific percentile of the population at highest risk, the variance of predicted risks, and the risk at any percentile. We also demonstrate how to construct graphs to visualize the performance of risk models, such as the ROC curve, the density of risks, and the predictiveness curve (disease risk plotted against risk percentile). The results from simulations match our theoretical estimates very well. Finally, we apply the methodology to nine complex diseases, evaluating the predictive power of genetic tests based on known susceptibility variants for each trait.
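
    To make the liability-threshold relationship between heritability explained, disease prevalence, and the AUC concrete, the following is a minimal simulation sketch (not the authors' analytic framework); the sample size, h2 values, and prevalences are illustrative assumptions.

    import numpy as np
    from scipy.stats import norm
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)

    def simulated_auc(h2, prevalence, n=200_000):
        """AUC of a genetic score explaining h2 of liability, under a liability-threshold model."""
        genetic = rng.normal(0.0, np.sqrt(h2), n)          # liability captured by the variants
        residual = rng.normal(0.0, np.sqrt(1.0 - h2), n)   # remaining (unexplained) liability
        liability = genetic + residual                     # total liability ~ N(0, 1)
        threshold = norm.ppf(1.0 - prevalence)             # cases lie above this threshold
        case = (liability > threshold).astype(int)
        return roc_auc_score(case, genetic)                # discrimination of the genetic score alone

    for h2, k in [(0.1, 0.01), (0.3, 0.01), (0.5, 0.10)]:
        print(f"h2={h2:.1f}, prevalence={k:.2f} -> AUC ~ {simulated_auc(h2, k):.3f}")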

    A boosting method for maximizing the partial area under the ROC curve

    Get PDF
    Background: The receiver operating characteristic (ROC) curve is a fundamental tool for assessing the discriminant performance not only of a single marker but also of a score function combining multiple markers. The area under the ROC curve (AUC) for a score function measures the intrinsic ability of the score function to discriminate between controls and cases. Recently, the partial AUC (pAUC) has received more attention than the AUC, because it allows a suitable range of the false positive rate to be targeted according to the clinical situation. However, existing pAUC-based methods handle only a few markers and do not take nonlinear combinations of markers into consideration.

    Results: We have developed a new statistical method that focuses on the pAUC based on a boosting technique. The markers are combined component-wise to maximize the pAUC in the boosting algorithm, using natural cubic splines or decision stumps (single-level decision trees) according to whether the marker values are continuous or discrete. We show that the resulting score plots are useful for understanding how each marker is associated with the outcome variable. We compare the performance of the proposed boosting method with that of other existing methods and demonstrate its utility on real data sets. The proposed method achieves substantially better discrimination performance, in the sense of the pAUC, in both simulation studies and real data analyses.

    Conclusions: The proposed method addresses how to combine markers after a pAUC-based filtering procedure in a high-dimensional setting. It therefore provides a consistent way of analyzing data based on the pAUC, from marker selection to marker combination, for discrimination problems. The method can capture not only linear but also nonlinear associations between the outcome variable and the markers; such nonlinearity is generally necessary for maximizing the pAUC. The method also emphasizes both the accuracy of classification performance and the interpretability of the association, by offering simple and smooth score plots for each marker.
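
    As a minimal illustration of the quantity being maximized (not of the boosting algorithm itself), the sketch below computes a partial AUC over a restricted false-positive-rate range; the simulated score and the 0-0.2 FPR window are assumptions made for the example.

    import numpy as np
    from sklearn.metrics import roc_curve, auc

    def partial_auc(y_true, score, fpr_max=0.2):
        """Area under the ROC curve restricted to FPR in [0, fpr_max] (unnormalized)."""
        fpr, tpr, _ = roc_curve(y_true, score)
        grid = np.linspace(0.0, fpr_max, 200)     # restricted FPR grid
        tpr_interp = np.interp(grid, fpr, tpr)    # ROC curve evaluated on that grid
        return auc(grid, tpr_interp)              # trapezoidal area over the window

    rng = np.random.default_rng(1)
    y = rng.integers(0, 2, 500)                   # case/control labels
    score = 0.8 * y + rng.normal(0.0, 1.0, 500)   # a noisy combined marker score
    print("pAUC over FPR <= 0.2:", round(partial_auc(y, score), 4))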

    Chapter 12: Systematic Review of Prognostic Tests

    Get PDF
    A number of new biological markers are being studied as predictors of disease or adverse medical events among those who already have a disease. Systematic reviews of this growing literature can help determine whether the available evidence supports use of a new biomarker as a prognostic test that can more accurately place patients into different prognostic groups, to improve treatment decisions and the accuracy of outcome predictions. Exemplary reviews of prognostic tests are not widely available, and the methods used to review diagnostic tests do not necessarily address the most important questions about prognostic tests, which are used to predict the time-dependent likelihood of future patient outcomes. We provide suggestions for those interested in conducting systematic reviews of a prognostic test. The proposed use of the prognostic test should serve as the framework for a systematic review and help define the key questions. The outcome probabilities or level of risk and other characteristics of prognostic groups are the most salient statistics for review and perhaps meta-analysis. Reclassification tables can help determine how a prognostic test affects the classification of patients into different prognostic groups, and hence their treatment. Review of studies of the association between a potential prognostic test and patient outcomes would have little impact other than to determine whether further development as a prognostic test might be warranted.
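
    A reclassification table of the kind recommended above can be tabulated directly; the sketch below uses hypothetical risk models, outcome data, and risk-category cut-points, and adds a simple net reclassification improvement (NRI) for illustration.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(2)
    n = 1000
    event = rng.integers(0, 2, n)                                        # observed outcome
    risk_old = np.clip(0.30 * event + rng.normal(0.30, 0.15, n), 0, 1)   # risk without the new test
    risk_new = np.clip(0.45 * event + rng.normal(0.25, 0.15, n), 0, 1)   # risk with the new test

    bins = [0.0, 0.2, 0.5, 1.0]                                          # hypothetical cut-points
    labels = ["low", "intermediate", "high"]
    cat_old = pd.cut(risk_old, bins, labels=labels, include_lowest=True)
    cat_new = pd.cut(risk_new, bins, labels=labels, include_lowest=True)

    # Reclassification tables, stratified by outcome
    for outcome, name in [(1, "events"), (0, "non-events")]:
        mask = event == outcome
        print(f"\nReclassification among {name}:")
        print(pd.crosstab(cat_old[mask], cat_new[mask], rownames=["old"], colnames=["new"]))

    # NRI: upward reclassification should occur for events, downward for non-events
    up, down = cat_new.codes > cat_old.codes, cat_new.codes < cat_old.codes
    nri = (up[event == 1].mean() - down[event == 1].mean()) + \
          (down[event == 0].mean() - up[event == 0].mean())
    print("\nNRI:", round(nri, 3))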

    A novel application of quantile regression for identification of biomarkers exemplified by equine cartilage microarray data

    Get PDF
    Background: Identification of biomarkers among thousands of genes arrayed for disease classification has been the subject of considerable research in recent years. These studies have focused on disease classification, comparing experimental groups of affected and normal patients. Related experiments can be done to identify tissue-restricted biomarkers: genes with a high level of expression in one tissue compared to other tissue types in the body.

    Results: In this study, cartilage was compared with ten other body tissues using a two-color array experimental design. Thirty-seven probe sets were identified as cartilage biomarkers. Of these, 13 (35%) have existing annotation associated with cartilage, including several well-established cartilage biomarkers. These genes comprise a useful database from which novel targets for cartilage biology research can be selected. We determined cartilage-specific Z-scores based on the observed M and classified genes with Z-scores ≥ 1.96 in all ten cartilage/tissue comparisons as cartilage-specific genes.

    Conclusion: Quantile regression is a promising method for the analysis of two-color array experiments that compare multiple samples in the absence of biological replicates, which limits the error that can be quantified. We used a nonparametric approach to reveal the relationship between percentiles of M and A, where M is log2(R/G) and A is 0.5 log2(RG), with R representing the gene expression level in cartilage and G the gene expression level in one of the other ten tissues. We then performed linear quantile regression to identify genes with a cartilage-restricted pattern of expression.
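
    The sketch below fits a linear quantile regression of M on A with statsmodels on simulated two-color intensities; the quantile level (0.975) and the rule for flagging candidate genes are illustrative assumptions rather than the paper's exact procedure.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n_genes = 5000
    R = rng.lognormal(6.0, 1.0, n_genes)      # expression in cartilage (red channel)
    G = rng.lognormal(6.0, 1.0, n_genes)      # expression in a comparison tissue (green channel)
    M = np.log2(R / G)                        # log ratio
    A = 0.5 * np.log2(R * G)                  # average log intensity

    X = sm.add_constant(A)                    # intercept + linear term in A
    fit = sm.QuantReg(M, X).fit(q=0.975)      # upper quantile of M given A
    upper = fit.predict(X)
    candidates = np.flatnonzero(M > upper)    # genes lying above the fitted quantile line
    print(f"{candidates.size} candidate cartilage-restricted probe sets")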

    Optimizing the diagnostic power with gastric emptying scintigraphy at multiple time points

    Get PDF
    Background: Gastric emptying scintigraphy (GES) at intervals over 4 hours after a standardized radio-labeled meal is commonly regarded as the gold standard for diagnosing gastroparesis. The objectives of this study were: 1) to investigate the best time point and the best combination of multiple time points for diagnosing gastroparesis with repeated GES measures, and 2) to contrast and cross-validate Fisher's Linear Discriminant Analysis (LDA), a rank-based Distribution-Free (DF) approach, and the Classification And Regression Tree (CART) model.

    Methods: A total of 320 patients with GES measures at 1, 2, 3, and 4 hours (h) after a standard meal, using a standardized method, were retrospectively collected. The area under the Receiver Operating Characteristic (ROC) curve and the rate of false classification through jackknife cross-validation were used for model comparison.

    Results: Due to strong correlation and an abnormal data distribution, no substantial improvement in diagnostic power was found with the best linear combination from the LDA approach, even with data transformation. With the DF method, the linear combination of the 4-h and 3-h measures increased the Area Under the Curve (AUC) and decreased the number of false classifications (0.87; 15.0%) relative to the individual time points (0.83, 0.82; 15.6%, 25.3% for 4 h and 3 h, respectively) at a higher sensitivity level (sensitivity = 0.9). The CART model using the four hourly GES measurements along with patient age was the most accurate diagnostic tool (AUC = 0.88, false classification = 13.8%). Patients with a 4-h gastric retention value >10% were 5 times more likely to have gastroparesis (179/207 = 86.5%) than those with ≤10% (18/113 = 15.9%).

    Conclusions: In a mixed group of patients either referred with suspected gastroparesis or investigated for other reasons, the CART model is more robust than the LDA and DF approaches, can accommodate covariate effects, and can be generalized for cross-institutional applications, but it may be unstable if the sample size is limited.
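
    As a rough sketch of the CART strategy described above (not the study's code), the example below fits a decision tree to simulated hourly retention values plus age and reports a cross-validated AUC; the simulated data, labelling rule, and tree settings are assumptions.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(4)
    n = 320
    retention = np.clip(rng.normal([70, 45, 25, 10], 15, (n, 4)), 0, 100)  # % retained at 1-4 h
    age = rng.normal(45, 15, n)
    X = np.column_stack([retention, age])
    # hypothetical labels: roughly the ">10% retention at 4 h" rule, plus noise
    y = ((retention[:, 3] > 10) ^ (rng.random(n) < 0.15)).astype(int)

    cart = DecisionTreeClassifier(max_depth=3, min_samples_leaf=20, random_state=0)
    auc_cv = cross_val_score(cart, X, y, cv=10, scoring="roc_auc")
    print(f"cross-validated AUC: {auc_cv.mean():.2f} +/- {auc_cv.std():.2f}")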

    Statistical methods to correct for verification bias in diagnostic studies are inadequate when there are few false negatives: a simulation study

    Get PDF
    Background: A common feature of diagnostic research is that results for a diagnostic gold standard are available primarily for patients who are positive for the test under investigation. Data from such studies are subject to what has been termed "verification bias". We evaluated statistical methods for verification bias correction when there are few false negatives.

    Methods: A simulation study was conducted of a screening study subject to verification bias. We compared estimates of the area under the curve (AUC) corrected for verification bias, varying both the rate and the mechanism of verification.

    Results: In a single simulated data set, varying the number of false negatives from 0 to 4 led to verification-bias-corrected AUCs ranging from 0.550 to 0.852. Excess variation associated with low numbers of false negatives was confirmed in simulation studies and by analyses of published studies that incorporated verification bias correction. The 2.5th–97.5th centile range constituted as much as 60% of the possible range of AUCs for some simulations.

    Conclusion: Screening programs are designed such that there are few false negatives. Standard statistical methods for verification bias correction are inadequate in this circumstance.
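
    A small sketch of the problem, under an assumed disease model and assumed verification probabilities rather than the paper's simulation design: only a fraction of test-negatives is verified, the AUC is corrected here by inverse-probability weighting, and the corrected estimate varies widely because verified false negatives are rare.

    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(5)

    def corrected_auc(n=2000):
        score = rng.normal(0.0, 1.0, n)                              # screening test result
        disease = (rng.random(n) < 1 / (1 + np.exp(3 - 2 * score))).astype(int)
        screen_pos = score > 0.5
        p_verify = np.where(screen_pos, 1.0, 0.10)                   # few test-negatives verified
        verified = rng.random(n) < p_verify
        weights = 1.0 / p_verify[verified]                           # inverse-probability weights
        return roc_auc_score(disease[verified], score[verified], sample_weight=weights)

    aucs = np.array([corrected_auc() for _ in range(200)])
    lo, hi = np.percentile(aucs, [2.5, 97.5])
    print(f"corrected AUC: median {np.median(aucs):.3f}, 2.5th-97.5th centile {lo:.3f}-{hi:.3f}")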

    Adherence to self-administered tuberculosis treatment in a high HIV-prevalence setting: a cross-sectional survey in Homa Bay, Kenya.

    Get PDF
    Good adherence to treatment is crucial for tuberculosis (TB) control. The efficiency and feasibility of directly observed therapy (DOT) under routine program conditions have been questioned. As an alternative, Médecins sans Frontières introduced self-administered therapy (SAT) in several TB programs. We aimed to measure adherence to TB treatment among patients receiving TB chemotherapy with a fixed-dose combination (FDC) under SAT at the Homa Bay district hospital (Kenya). A second objective was to compare agreement on adherence between different assessment tools.