Genetic screening for gynecological cancer: where are we heading?
The landscape of cancer genetics in gynecological oncology is rapidly changing. The traditional family history-based approach has limitations and misses >50% of mutation carriers, and is now being replaced by population-based approaches. The case for changing the clinical paradigm from family history-based to population-based BRCA1/BRCA2 testing in Ashkenazi Jews is supported by data demonstrating that population-based BRCA1/BRCA2 testing does not cause psychological harm and is cost effective. This article covers various genetic testing strategies for gynecological cancers, including population-based approaches, panel and direct-to-consumer testing, as well as the need for innovative approaches to genetic counseling. Advances in genetic testing technology and computational analytics have facilitated an integrated systems medicine approach, providing increasing potential for population-based genetic testing, risk stratification, and cancer prevention. Genomic information, along with biological and computational tools, will be used to deliver predictive, preventive, personalized and participatory (P4) and precision medicine in the future.
A Comparison of Marginal Likelihood Computation Methods
In a Bayesian analysis, different models can be compared on the basis of the expected or marginal likelihood they attain. Many methods have been devised to compute the marginal likelihood, but simplicity is not the strongest point of most methods. At the same time, the precision of methods is often questionable. In this paper several methods are presented in a common framework. The explanation of the differences is followed by an application, in which the precision of the methods is tested on a simple regression model where a comparison with analytical results is possible.
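As an illustration of the kind of comparison the paper describes, the following minimal Python sketch (not the paper's code) checks two simple Monte Carlo marginal likelihood estimators against the analytical answer for a conjugate normal-mean model; the data, prior and sample sizes are invented for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Conjugate setup: y_i ~ N(mu, sigma^2) with sigma known, prior mu ~ N(mu0, tau0^2).
sigma, mu0, tau0 = 1.0, 0.0, 2.0
y = rng.normal(0.5, sigma, size=50)
n = len(y)

# Analytical marginal likelihood: y ~ N(mu0 * 1, sigma^2 * I + tau0^2 * J).
cov = sigma**2 * np.eye(n) + tau0**2 * np.ones((n, n))
log_ml_exact = stats.multivariate_normal.logpdf(y, mean=np.full(n, mu0), cov=cov)

M = 50_000

# Estimator 1: arithmetic mean of the likelihood over draws from the prior.
mu_prior = rng.normal(mu0, tau0, size=M)
loglik_prior = stats.norm.logpdf(y[:, None], mu_prior[None, :], sigma).sum(axis=0)
log_ml_prior = np.logaddexp.reduce(loglik_prior) - np.log(M)

# Estimator 2: harmonic mean of the likelihood over draws from the posterior
# (simple but notoriously unstable).
tau_n2 = 1.0 / (1.0 / tau0**2 + n / sigma**2)          # posterior variance of mu
mu_n = tau_n2 * (mu0 / tau0**2 + y.sum() / sigma**2)   # posterior mean of mu
mu_post = rng.normal(mu_n, np.sqrt(tau_n2), size=M)
loglik_post = stats.norm.logpdf(y[:, None], mu_post[None, :], sigma).sum(axis=0)
log_ml_harm = -(np.logaddexp.reduce(-loglik_post) - np.log(M))

print(f"exact log marginal likelihood : {log_ml_exact:.3f}")
print(f"prior arithmetic-mean estimate: {log_ml_prior:.3f}")
print(f"harmonic-mean estimate        : {log_ml_harm:.3f}")
```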
Data splitting as a countermeasure against hypothesis fishing: with a case study of predictors for low back pain
There is growing concern in the scientific community that many published scientific findings may represent spurious patterns that are not reproducible in independent data sets. A reason for this is that significance levels or confidence intervals are often applied to secondary variables or sub-samples within the trial, in addition to the primary hypotheses (multiple hypotheses). This problem is likely to be extensive for population-based surveys, in which epidemiological hypotheses are derived after seeing the data set (hypothesis fishing). We recommend a data-splitting procedure to counteract this methodological problem, in which one part of the data set is used for identifying hypotheses and the other is used for hypothesis testing. The procedure is similar to two-stage analysis of microarray data. We illustrate the process using a real data set on predictors of low back pain at 14-year follow-up in a population initially free of low back pain. "Widespreadness" of pain (pain reported in several places other than the low back) was a statistically significant predictor, while smoking was not, despite its strong association with low back pain in the first half of the data set. We argue that the application of data splitting, in which an independent party handles the data set, will achieve for epidemiological surveys what pre-registration has done for clinical studies.
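A minimal sketch of the two-stage procedure on simulated survey data (not the study's data or code): candidate predictors are screened on an exploration half, and only the selected hypotheses are tested, with correction, on the confirmation half. All variable names, effect sizes and thresholds are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical survey: 1000 respondents, 20 candidate predictors, binary outcome.
# Only predictor 0 has a real effect; the rest are noise.
n, p = 1000, 20
X = rng.normal(size=(n, p))
logit = -1.0 + 0.8 * X[:, 0]
y = rng.random(n) < 1 / (1 + np.exp(-logit))

# Split once, before any analysis: first half for hypothesis generation,
# second half for confirmatory testing.
idx = rng.permutation(n)
explore, confirm = idx[: n // 2], idx[n // 2 :]

# Stage 1 (exploration): screen predictors with a liberal threshold.
screen_p = np.array([
    stats.ttest_ind(X[explore][y[explore], j], X[explore][~y[explore], j]).pvalue
    for j in range(p)
])
selected = np.flatnonzero(screen_p < 0.05)

# Stage 2 (confirmation): test only the pre-selected hypotheses on fresh data,
# with a Bonferroni correction over that short list.
for j in selected:
    pval = stats.ttest_ind(X[confirm][y[confirm], j], X[confirm][~y[confirm], j]).pvalue
    print(f"predictor {j}: confirmation p = {pval:.4f} "
          f"(significant at {0.05 / len(selected):.4f})")
```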
Mapping quantitative trait loci in line cross with repeat records
Background: Phenotypes with repeat records from one individual or multiple individuals are often encountered when mapping QTL in line crosses. The current mapping method for a trait with repeat records simply replaces the phenotype with the average of the repeat records. This simple treatment does not fully use the information in the replicates and ignores the impact of permanent environmental effects on the accuracy of the estimated QTL. Results: We propose to map QTL with a repeatability model that analyses the repeat records directly rather than the mean phenotype, improving the efficiency of QTL detection by making full use of the data and accounting for permanent environmental effects. A maximum likelihood method implemented via the expectation-maximization (EM) algorithm is used to estimate the parameters of the repeatability model. The superiority of mapping with the repeatability model over simple analysis of the mean phenotype was demonstrated in a series of simulations. Conclusion: Our results suggest that the proposed method is a powerful alternative to existing methods. By means of the repeatability model, using the repeat records on individuals can improve the efficiency of QTL detection in line crosses.
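The following Python sketch illustrates the contrast the abstract draws, on invented line-cross-style data: a repeatability model fitted to all repeat records (here via REML in statsmodels, not the authors' EM implementation) versus a simple regression on the mean phenotype. Column names, effect sizes and record counts are assumptions for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Hypothetical data: 200 individuals, 4 repeat records each.
# y_ij = mu + a * qtl_i + p_i + e_ij, where p_i is a permanent environmental effect.
n_ind, n_rep = 200, 4
qtl = rng.integers(0, 3, n_ind)      # candidate-locus genotype coded 0/1/2
perm = rng.normal(0, 1.0, n_ind)     # permanent environmental effect
records = pd.DataFrame({
    "animal": np.repeat(np.arange(n_ind), n_rep),
    "qtl": np.repeat(qtl, n_rep),
    "y": np.repeat(0.3 * qtl + perm, n_rep) + rng.normal(0, 1.0, n_ind * n_rep),
})

# Repeatability model: analyse all records, with a random intercept per individual
# absorbing the permanent environmental effect.
repeat_fit = smf.mixedlm("y ~ qtl", records, groups=records["animal"]).fit()
print("repeatability model :", repeat_fit.params["qtl"], repeat_fit.pvalues["qtl"])

# Simple alternative criticised in the abstract: collapse to the mean phenotype.
means = records.groupby("animal", as_index=False).agg(y=("y", "mean"),
                                                      qtl=("qtl", "first"))
mean_fit = smf.ols("y ~ qtl", means).fit()
print("mean-phenotype model:", mean_fit.params["qtl"], mean_fit.pvalues["qtl"])
```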
Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers
Background: Decision curve analysis is a novel method for evaluating diagnostic tests, prediction models and molecular markers. It combines the mathematical simplicity of accuracy measures, such as sensitivity and specificity, with the clinical applicability of decision analytic approaches. Most critically, decision curve analysis can be applied directly to a data set, and does not require the sort of external data on costs, benefits and preferences typically required by traditional decision analytic techniques. Methods: In this paper we present several extensions to decision curve analysis, including correction for overfit, confidence intervals, application to censored data (including competing risks) and calculation of decision curves directly from predicted probabilities. All of these extensions are based on straightforward methods that have previously been described in the literature for analogous statistical techniques. Results: Simulation studies showed that repeated 10-fold cross-validation provided the best method for correcting a decision curve for overfit. The method for applying decision curves to censored data had little bias and excellent coverage; for competing risks, decision curves were appropriately affected by the incidence of the competing risk and the association between the competing risk and the predictor of interest. Calculation of decision curves directly from predicted probabilities led to a smoothing of the decision curve. Conclusion: Decision curve analysis can be easily extended to many of the applications common to performance measures for prediction models. Software to implement decision curve analysis is provided.
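For orientation, here is a minimal sketch of the basic net-benefit calculation that decision curves are built from, without the overfit correction or censored-data extensions described in the paper; the simulated patients, model and thresholds are purely illustrative.

```python
import numpy as np

def net_benefit(y, p_hat, thresholds):
    """Net benefit of treating patients whose predicted risk exceeds each threshold."""
    y = np.asarray(y, dtype=bool)
    n = len(y)
    nb = []
    for pt in thresholds:
        treat = np.asarray(p_hat) >= pt
        tp = np.sum(treat & y)       # treated patients who have the event
        fp = np.sum(treat & ~y)      # treated patients who do not
        nb.append(tp / n - fp / n * pt / (1 - pt))
    return np.array(nb)

# Hypothetical example: predicted probabilities and outcomes for 500 patients.
rng = np.random.default_rng(3)
p_true = rng.beta(2, 5, 500)
y = rng.random(500) < p_true
p_model = np.clip(p_true + rng.normal(0, 0.05, 500), 0.01, 0.99)  # a decent model

thresholds = np.linspace(0.05, 0.5, 10)
nb_model = net_benefit(y, p_model, thresholds)
nb_all = net_benefit(y, np.ones(500), thresholds)   # "treat everyone" strategy
for pt, a, b in zip(thresholds, nb_model, nb_all):
    print(f"pt={pt:.2f}  model={a:.3f}  treat-all={b:.3f}  treat-none=0.000")
```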
The in- or exclusion of non-breast cancer related death and contralateral breast cancer significantly affects estimated outcome probability in early breast cancer
A wide variety of definitions of recurrent disease and survival is used in analyses of outcome in patients with early breast cancer. Explicit definitions with details on both endpoints and censoring are provided in less than half of published studies. We evaluated the effects of various definitions of survival and recurrent disease on estimated outcome in a prospectively determined cohort of 463 patients with primary breast cancer. Outcome estimates were determined both by the Kaplan–Meier method and by a competing risk method. Inclusion or exclusion of contralateral breast cancer or non-disease-related death in the definition of recurrent disease or survival significantly affects the estimated outcome probability. The magnitude of this effect was dependent on patient, tumour and treatment characteristics. Knowledge of the contribution of non-disease-related death or contralateral breast cancer to the estimated recurrent disease rate and overall death rate is indispensable for a correct interpretation and comparison of outcome analyses.
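The following sketch illustrates the point on simulated follow-up data (not the study cohort): the complement of a Kaplan–Meier estimate that censors competing events overstates the cumulative incidence of recurrence relative to a competing-risk (Aalen–Johansen) estimate. Event rates and the 10-year horizon are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical follow-up data: time to first event and event type per patient.
# 0 = censored, 1 = breast-cancer recurrence, 2 = competing event
# (contralateral breast cancer or non-disease-related death).
n = 463
t_rec  = rng.exponential(12.0, n)    # years to recurrence
t_comp = rng.exponential(20.0, n)    # years to competing event
t_cens = rng.uniform(1.0, 15.0, n)   # administrative censoring
time = np.minimum.reduce([t_rec, t_comp, t_cens])
event = np.select([t_rec == time, t_comp == time], [1, 2], default=0)

def one_minus_km(time, event, horizon):
    """1 - Kaplan-Meier for recurrence, treating competing events as censoring."""
    surv = 1.0
    for t in np.sort(np.unique(time[(event == 1) & (time <= horizon)])):
        at_risk = np.sum(time >= t)
        d = np.sum((time == t) & (event == 1))
        surv *= 1 - d / at_risk
    return 1 - surv

def cumulative_incidence(time, event, horizon):
    """Aalen-Johansen cumulative incidence of recurrence with competing events."""
    ci, surv = 0.0, 1.0
    for t in np.sort(np.unique(time[(event > 0) & (time <= horizon)])):
        at_risk = np.sum(time >= t)
        d1 = np.sum((time == t) & (event == 1))
        d_any = np.sum((time == t) & (event > 0))
        ci += surv * d1 / at_risk      # contribution of recurrences at time t
        surv *= 1 - d_any / at_risk    # overall event-free survival just after t
    return ci

print("10-year recurrence, 1 - KM      :", round(one_minus_km(time, event, 10), 3))
print("10-year recurrence, competing CI:", round(cumulative_incidence(time, event, 10), 3))
```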
Calibrating the Performance of SNP Arrays for Whole-Genome Association Studies
To facilitate whole-genome association studies (WGAS), several high-density SNP genotyping arrays have been developed. Genetic coverage and statistical power are the primary benchmark metrics in evaluating the performance of SNP arrays. Ideally, such evaluations would be done on a SNP set and a cohort of individuals that are both independently sampled from the original SNPs and individuals used in developing the arrays. Without an independent test set, previous estimates of genetic coverage and statistical power may be subject to an overfitting bias. Additionally, the SNP arrays' statistical power in WGAS has not been systematically assessed on real traits. One robust setting for doing so is to evaluate statistical power on thousands of traits measured from a single set of individuals. In this study, 359 newly sampled Americans of European descent were genotyped using both Affymetrix 500K (Affx500K) and Illumina 650Y (Ilmn650K) SNP arrays. From these data, we were able to obtain estimates of genetic coverage, which are robust to overfitting, by constructing an independent test set from among these genotypes and individuals. Furthermore, we collected liver tissue RNA from the participants and profiled these samples on a comprehensive gene expression microarray. The RNA levels were used as a large-scale set of quantitative traits to calibrate the relative statistical power of the commercial arrays. Our genetic coverage estimates are lower than previous reports, providing evidence that previous estimates may be inflated due to overfitting. The Ilmn650K platform showed reasonable power (50% or greater) to detect SNPs associated with quantitative traits when the signal-to-noise ratio (SNR) is greater than or equal to 0.5 and the causal SNP's minor allele frequency (MAF) is greater than or equal to 20% (N = 359). In testing each of the more than 40,000 gene expression traits for association to each of the SNPs on the Ilmn650K and Affx500K arrays, we found that the Ilmn650K yielded 15% more discoveries than the Affx500K at the same false discovery rate (FDR) level.
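As a rough illustration of how power depends on SNR and MAF at this sample size, the sketch below simulates direct single-SNP tests on a quantitative trait. It does not reproduce the paper's LD-based coverage analysis, and the significance threshold and SNR definition (genetic variance over noise variance) are assumptions made for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

def power_sim(n=359, maf=0.20, snr=0.5, alpha=5e-8, n_sim=1000):
    """Simulated power to detect an additive SNP effect on a quantitative trait.

    SNR is treated as var(genetic effect) / var(noise); alpha is a
    genome-wide-style threshold. Both choices are illustrative assumptions.
    """
    hits = 0
    for _ in range(n_sim):
        g = rng.binomial(2, maf, n)                    # additive genotype 0/1/2
        beta = np.sqrt(snr / (2 * maf * (1 - maf)))    # scales genetic variance to snr
        y = beta * g + rng.normal(0, 1.0, n)           # noise variance fixed at 1
        _, p = stats.pearsonr(g, y)
        hits += p < alpha
    return hits / n_sim

for snr in (0.05, 0.10, 0.50):
    print(f"SNR={snr:.2f}, MAF=0.20, N=359: power ~ {power_sim(snr=snr):.2f}")
```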