28 research outputs found

    Synthesizing Data Sources to Develop and Update Risk Models

    Building risk models from multiple sources of data allows researchers to incorporate the best available information on key model parameters. In this thesis, we develop and apply methodology for optimally combining information from multiple data sources in two main contexts. In the first, motivated by the need for building subtype-specific absolute risk models for breast cancer, we develop and apply methodology for combining information from analytic cohort or case-control studies and from population-based registries. We address the statistical challenges involved in handling different types of missing information in this context. We derive variance estimators for the risk predictions produced by such models, accounting for different sources of uncertainty. We apply the methods to two large consortia in order to build absolute risk models for overall breast cancer and for subtypes of breast cancer defined by estrogen receptor status. We show how the absolute risk models can be used to project distributions of breast cancer risk for the US population and to evaluate the potential impact of population-wide modification of breast cancer risk factors. In the second problem, we consider how to effectively incorporate external information when building new or updated risk models, again with the goal of combining data sources to produce models that are more efficient and more representative of the underlying population. In particular, we explore a regression calibration approach, utilizing a method from the sample-survey literature that is traditionally used to increase the efficiency of parameter estimation from a given survey by leveraging information from external data sources. We examine the performance of the estimator in a context that has not previously been studied, where the sample and the external data are representative of different populations. We derive theoretical conditions under which the calibrated estimator produces meaningful estimates, which are calibrated to the external population, and corroborate our analytic results with numerical simulations. Our work also identified weaknesses in the methodology and promising avenues of further research in this important area.

    The Non-Coding Transcriptome of Prostate Cancer: Implications for Clinical Practice


    Rejoinder


    iCARE: An R package to build, validate and apply absolute risk models.

    This report describes an R package, called the Individualized Coherent Absolute Risk Estimator (iCARE) tool, that allows researchers to build and evaluate models for absolute risk and apply them to estimate an individual's risk of developing disease during a specified time interval based on a set of user-defined input parameters. An attractive feature of the software is that it gives users flexibility to update models rapidly based on new knowledge of risk factors and to tailor models to different populations by specifying three input arguments: a model for relative risk, an age-specific disease incidence rate and the distribution of risk factors for the population of interest. The tool can handle missing information on risk factors for individuals for whom risks are to be predicted, using a coherent approach in which all estimates are derived from a single model after appropriate model averaging. The software allows single nucleotide polymorphisms (SNPs) to be incorporated into the model using published odds ratios and allele frequencies. The validation component of the software implements methods for evaluating model calibration, discrimination and risk stratification based on independent validation datasets. We provide an illustration of the utility of iCARE for building, validating and applying absolute risk models using breast cancer as an example.
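
As a rough illustration of how the three inputs named above (a relative risk model, age-specific incidence rates and a reference risk-factor distribution) might combine into an absolute risk, here is a minimal Python sketch. It is not iCARE's code or API: the function name `absolute_risk`, the one-year piecewise-constant hazards, the rescaling of marginal incidence by the mean relative risk in the reference population, and the optional competing-mortality argument are all simplifying assumptions made for this example. Missing risk factors and SNPs are not handled.

```python
import math


def absolute_risk(age_start, age_end, log_rr, z, incidence, ref_population,
                  mortality=None):
    """Hypothetical sketch: risk of developing disease between two ages.

    log_rr         -- log relative-risk coefficients (the relative risk model)
    z              -- the individual's risk-factor values, same order as log_rr
    incidence      -- dict mapping age -> marginal yearly incidence rate
    ref_population -- list of risk-factor vectors sampled from the population
    mortality      -- optional dict mapping age -> competing mortality rate
                      (an assumption of this sketch, not taken from the abstract)
    """
    def rel_risk(values):
        return math.exp(sum(b * x for b, x in zip(log_rr, values)))

    # Average relative risk in the reference population, used to turn the
    # marginal incidence into a baseline hazard (simplifying assumption).
    mean_rr = sum(rel_risk(row) for row in ref_population) / len(ref_population)
    rr = rel_risk(z)

    risk = 0.0
    event_free = 1.0  # probability of being alive and disease-free so far
    for age in range(age_start, age_end):
        hazard = incidence[age] / mean_rr * rr
        competing = mortality.get(age, 0.0) if mortality else 0.0
        total = hazard + competing
        if total > 0:
            # Probability that the first event in this year is the disease.
            p_disease = hazard / total * (1.0 - math.exp(-total))
            risk += event_free * p_disease
        event_free *= math.exp(-total)
    return risk


# Toy inputs, invented purely for illustration.
toy_incidence = {age: 0.01 for age in range(50, 60)}
toy_reference = [[0.0], [1.0]]
print(absolute_risk(50, 60, [0.5], [1.0], toy_incidence, toy_reference))
```

Swapping in a different incidence table or reference sample changes the population the estimate is tailored to without touching the relative risk coefficients, which is the kind of flexibility the abstract describes.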