7 research outputs found

    Triage of high-risk HPV-positive women in population-based screening by miRNA expression analysis in cervical scrapes; a feasibility study

    Background: Primary testing for high-risk HPV (hrHPV) is increasingly implemented in cervical cancer screening programs. Many hrHPV-positive women, however, harbor clinically irrelevant infections, demanding additional disease markers to prevent over-referral and over-treatment. Most promising biomarkers reflect molecular events relevant to the disease process that can be measured objectively in small amounts of clinical material, such as miRNAs. We previously identified eight miRNAs with altered expression in cervical precancer and cancer due to either methylation-mediated silencing or chromosomal alterations. In this study, we evaluated the clinical value of these eight miRNAs on cervical scrapes to triage hrHPV-positive women in cervical screening. Results: Expression levels of the eight candidate miRNAs in cervical tissue samples (n =

    Factors affecting the accuracy of a class prediction model in gene expression data

    BACKGROUND: Class prediction models have been shown to perform variably in clinical gene expression datasets. Previous evaluation studies, mostly done in the field of cancer, showed that the accuracy of class prediction models differs from dataset to dataset and depends on the type of classification function. While much is known about the characteristics of classification functions, little has been done to determine which characteristics of gene expression data have an impact on the performance of a classifier. This study aims to empirically identify data characteristics that affect the predictive accuracy of classification models, outside of the field of cancer. RESULTS: Datasets from twenty-five studies meeting predefined inclusion and exclusion criteria were downloaded. Nine classification functions were chosen, falling within the categories: discriminant analyses or Bayes classifiers, tree-based methods, regularization and shrinkage methods, and nearest-neighbor methods. Consequently, nine class prediction models were built for each dataset using the same procedure, and their performances were evaluated by calculating their accuracies. The characteristics of each experiment were recorded (i.e., observed disease, medical question, tissue/cell types and sample size), together with characteristics of the gene expression data, namely the number of differentially expressed genes, the fold changes and the within-class correlations. Their effects on the accuracy of a class prediction model were statistically assessed by random-effects logistic regression. The number of differentially expressed genes and the average fold change had a significant impact on the accuracy of a classification model and individually explained up to 72% and 57% of the variation in prediction accuracy, respectively. Multivariable random-effects logistic regression with forward selection yielded the two aforementioned study factors and the within-class correlation as factors affecting the accuracy of classification functions, explaining 91.5% of the between-study variation. CONCLUSIONS: We evaluated study- and data-related factors that might explain the varying performances of classification functions in non-cancerous datasets. Our results showed that the number of differentially expressed genes, the fold change, and the correlation in gene expression data significantly affect the accuracy of class prediction models.
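
    The three data characteristics named in the abstract above (number of differentially expressed genes, fold change, within-class correlation) can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not say how each quantity was computed, so the sketch assumes log2-scale expression values, two classes labelled 0 and 1, a hypothetical fold-change cutoff (`fc_threshold`) as the definition of "differentially expressed", and the mean pairwise Pearson correlation between genes within each class as the within-class correlation. The function name `data_characteristics` and the toy data are likewise illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists; None if either is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def data_characteristics(expr, labels, fc_threshold=1.0):
    """Summarise a two-class expression dataset.

    expr   : list of genes, each a list of log2 expression values per sample
    labels : class label (0 or 1) for each sample
    fc_threshold : ASSUMED cutoff on |log2 fold change| for calling a gene
                   differentially expressed (not taken from the abstract)
    """
    idx0 = [i for i, lab in enumerate(labels) if lab == 0]
    idx1 = [i for i, lab in enumerate(labels) if lab == 1]

    # Absolute difference of class means on the log2 scale, per gene.
    abs_log_fc = [
        abs(mean([g[i] for i in idx1]) - mean([g[i] for i in idx0]))
        for g in expr
    ]
    n_de = sum(fc >= fc_threshold for fc in abs_log_fc)

    # Pairwise gene-gene correlations, computed separately inside each class.
    corrs = []
    for idx in (idx0, idx1):
        for g, h in combinations(expr, 2):
            r = pearson([g[i] for i in idx], [h[i] for i in idx])
            if r is not None:  # skip genes that are constant within a class
                corrs.append(r)

    return {
        "n_de": n_de,
        "mean_abs_log_fc": mean(abs_log_fc),
        "within_class_corr": mean(corrs) if corrs else None,
    }


# Toy data: 3 genes x 6 samples (first three samples class 0, last three class 1).
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 3.0, 5.0, 7.0],
    [5.0, 4.0, 3.0, 5.0, 4.0, 3.0],
]
labels = [0, 0, 0, 1, 1, 1]
result = data_characteristics(expr, labels)
print(result)
```

    In a study like the one described, summaries of this kind would be computed once per dataset and then used as predictors of classifier accuracy. A test-based definition of differential expression could be substituted for the fold-change cutoff without changing the overall structure.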