72 research outputs found

    Incorporating Historical Models with Adaptive Bayesian Updates

    Get PDF
    This paper considers Bayesian approaches for incorporating information from a historical model into a current analysis when the historical model includes only a subset of the covariates currently of interest. The statistical challenge is two-fold. First, the parameters in the nested historical model are not generally equal to their counterparts in the larger current model, either in value or in interpretation. Second, because the historical information will not be equally informative for all parameters in the current analysis, additional regularization may be required beyond that provided by the historical information. We propose several novel extensions of the so-called power prior that adaptively combine a prior based upon the historical information with a variance-reducing prior that shrinks parameter values toward zero. The ideas are directly motivated by our work building mortality risk prediction models for pediatric patients receiving extracorporeal membrane oxygenation (ECMO). We have developed a model on a registry-based cohort of ECMO patients and now seek to expand this model with additional biometric measurements, not available in the registry, collected on a small auxiliary cohort. Our adaptive priors are able to leverage the efficiency of the original model and identify novel mortality risk factors. We support this with a simulation study, which demonstrates the potential for efficiency gains in estimation under a variety of scenarios.
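
    As context for the adaptive extensions described above, the conventional power prior (which this work builds on) raises the historical-data likelihood to a fractional power and combines it with an initial prior; a standard formulation is

        \pi(\theta \mid D_0, a_0) \propto L(\theta \mid D_0)^{a_0} \, \pi_0(\theta), \qquad a_0 \in [0, 1],

    where D_0 denotes the historical data, \pi_0(\theta) an initial prior, and a_0 controls how heavily the historical information is weighted (a_0 = 0 discards it, a_0 = 1 pools it fully). The notation here is generic and not taken from the paper itself.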

    Inferring a consensus problem list using penalized multistage models for ordered data

    Get PDF
    A patient's medical problem list describes his or her current health status and aids in the coordination and transfer of care between providers, among other things. Because a problem list is generated once and then subsequently modified or updated, the provider effect is not usually observable. That is, to what extent does a patient's problem list in the electronic medical record actually reflect a consensus communication of that patient's current health status? To that end, we report on and analyze a unique interview-based design in which multiple medical providers independently generate problem lists for each of three patient case abstracts of varying clinical difficulty. Due to the uniqueness of both our data and the scientific objectives of our analysis, we apply and extend so-called multistage models for ordered lists and equip the models with variable selection penalties to induce sparsity. Each problem has a corresponding non-negative parameter estimate, interpreted as a relative log-odds ratio, with larger values suggesting greater importance and zero values suggesting unimportant problems. We use these fitted penalized models to quantify and report the extent of consensus. For the three case abstracts, the proportions of problems with model-estimated non-zero log-odds ratios were 10/28, 16/47, and 13/30. Physicians exhibited consensus on the highest-ranked problems in the first and last case abstracts, but agreement quickly deteriorated; in contrast, physicians broadly disagreed on the relevant problems for the middle and most difficult case abstract.
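
    As a rough sketch of the class of models referenced above (one common multistage formulation for ranked lists, with assumed notation, not necessarily the exact specification used in the paper): if a provider ranks problems r_1, r_2, \ldots and S_k denotes the set of problems still unranked at stage k, the stage-k choice probability can be written

        P(\text{select } r_k \mid S_k) = \frac{\exp(\theta_{r_k})}{\sum_{j \in S_k} \exp(\theta_j)},

    with the likelihood of a full list being the product over stages, and a lasso-type penalty \lambda \sum_j \theta_j (for \theta_j \ge 0) shrinking the parameters of unimportant problems exactly to zero.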

    Shrinkage Priors for Isotonic Probability Vectors and Binary Data Modeling

    Get PDF
    This paper outlines a new class of shrinkage priors for Bayesian isotonic regression modeling of a binary outcome against a predictor, where the probability of the outcome is assumed to be monotonically non-decreasing in the predictor. The predictor is categorized into a large number of groups, and the set of differences between outcome probabilities in consecutive categories is equipped with a multivariate prior having support over the set of simplexes. The Dirichlet distribution, which can be derived from a normalized cumulative sum of gamma-distributed random variables, is a natural choice of prior, but using mathematical and simulation-based arguments, we show that the resulting posterior can be numerically unstable, even under simple data configurations. We propose an alternative prior motivated by horseshoe-type shrinkage that is numerically more stable. We show that this horseshoe-based prior is not subject to the numerical instability seen in the Dirichlet/gamma-based prior and that its posterior can estimate the underlying true curve more efficiently than the Dirichlet-based prior. We demonstrate the use of this prior in a model predicting the occurrence of radiation-induced lung toxicity in lung cancer patients as a function of dose delivered to normal lung tissue.
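
    A minimal sketch of the construction described above (notation assumed here, not taken from the paper): write the ordered probabilities as cumulative sums of non-negative increments, p_k = \sum_{j \le k} \delta_j, with the increment vector \delta constrained to a simplex. A Dirichlet prior on \delta can be generated by normalizing independent gamma draws,

        \delta_j = g_j / \textstyle\sum_i g_i, \qquad g_j \sim \text{Gamma}(\alpha_j, 1),

    which is the normalized-cumulative-sum-of-gammas representation mentioned in the abstract; the proposed horseshoe-type prior replaces this construction with heavier-tailed shrinkage on the increments to avoid the numerical instability noted above.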

    Default Priors for the Intercept Parameter in Logistic Regressions

    Get PDF
    In logistic regression, separation refers to the situation in which a linear combination of predictors perfectly discriminates the binary outcome. Because finite-valued maximum likelihood parameter estimates do not exist under separation, Bayesian regressions with informative shrinkage of the regression coefficients offer a suitable alternative. Little focus has been given to whether and how to shrink the intercept parameter. Based upon classical studies of separation, we argue that efficiency in estimating the regression coefficients may vary with the choice of intercept prior. We adapt alternative prior distributions for the intercept that downweight implausibly extreme regions of the parameter space, rendering the model less sensitive to separation. Through simulation and the analysis of exemplar datasets, we quantify differences across priors, stratified by established statistics measuring the degree of separation. Relative to diffuse priors, our recommendations generally result in more efficient estimation of the regression coefficients themselves when the data are nearly separated. They are equally efficient in non-separated datasets, making them suitable for default use. Modest differences were observed with respect to out-of-sample discrimination. Our work also highlights the interplay between priors for the intercept and the regression coefficients: numerical results are more sensitive to the choice of intercept prior when using a weakly informative prior on the regression coefficients than when using an informative shrinkage prior.
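
    To illustrate the separation phenomenon defined above, the following toy Python sketch (hypothetical data, not from the paper) shows that when a predictor perfectly discriminates the outcome, the logistic log-likelihood keeps increasing as the slope grows, so no finite maximum likelihood estimate exists:

        import numpy as np

        # Hypothetical perfectly separated data: y = 1 exactly when x > 0.
        x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
        y = np.array([0, 0, 0, 1, 1, 1])

        def loglik(beta0, beta1):
            # Numerically stable logistic log-likelihood.
            eta = beta0 + beta1 * x
            return np.sum(y * eta - np.logaddexp(0.0, eta))

        for b in [1.0, 5.0, 25.0, 125.0]:
            print(b, loglik(0.0, b))  # increases toward 0, its supremum, as b grows

    Informative shrinkage priors on the coefficients, and the intercept priors studied in the paper, restore a finite and well-behaved posterior in this situation.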

    A modular framework for early-phase seamless oncology trials

    Get PDF
    Background: As our understanding of the etiology and mechanisms of cancer becomes more sophisticated and the number of therapeutic options increases, phase I oncology trials today have multiple primary objectives. Many such designs are now 'seamless', meaning that the trial estimates both the maximum tolerated dose and the efficacy at this dose level. Sponsors often proceed with further study only with this additional efficacy evidence. However, with this increasing complexity in trial design, it becomes challenging to articulate fundamental operating characteristics of these trials, such as (i) the probability that the design will identify an acceptable, i.e. safe and efficacious, dose level, or (ii) the average number of patients who will be assigned to an acceptable dose level. Methods: In this manuscript, we propose a new modular framework for designing and evaluating seamless oncology trials. Each module comprises either a dose assignment step or a dose-response evaluation, and multiple such modules can be implemented sequentially. We develop modules from existing phase I/II designs as well as a novel module for evaluating dose-response using a Bayesian isotonic regression scheme. Results: We also present a freely available R package called seamlesssim that numerically estimates, by means of simulation, the operating characteristics of these modular trials. Conclusions: Together, this design framework and its accompanying simulator allow the clinical trialist to compare multiple candidate designs, more rigorously assess performance, better justify sample sizes, and ultimately select a higher quality design.
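
    The kind of operating-characteristic calculation described above can be sketched with a toy Monte Carlo simulation (plain Python with assumed, illustrative dose-response values; this is not the seamlesssim package or any specific design from the paper):

        import numpy as np

        rng = np.random.default_rng(1)

        # Assumed true toxicity and response probabilities at four dose levels.
        true_tox = np.array([0.05, 0.15, 0.30, 0.50])
        true_eff = np.array([0.10, 0.30, 0.45, 0.55])
        acceptable = (true_tox <= 0.30) & (true_eff >= 0.30)  # safe and efficacious

        def one_trial(n_per_dose=10):
            # Observed toxicity/response rates with n_per_dose patients per dose,
            # then pick the highest dose passing both thresholds (a toy rule).
            tox_hat = rng.binomial(n_per_dose, true_tox) / n_per_dose
            eff_hat = rng.binomial(n_per_dose, true_eff) / n_per_dose
            ok = np.flatnonzero((tox_hat <= 0.30) & (eff_hat >= 0.30))
            return ok[-1] if ok.size else None

        picks = [one_trial() for _ in range(5000)]
        p_ok = np.mean([p is not None and acceptable[p] for p in picks])
        print("Estimated P(select an acceptable dose):", round(p_ok, 2))

    Repeating such simulations across candidate designs is the sort of comparison the framework and simulator are intended to make routine.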

    A simulation study of diagnostics for bias in non-probability samples

    Get PDF
    A non-probability sampling mechanism is likely to bias estimates of parameters with respect to a target population of interest. This bias poses a unique challenge when selection is 'non-ignorable', i.e. dependent upon the unobserved outcome of interest, since it is then undetectable and thus cannot be ameliorated. We extend a simulation study by Nishimura et al. [International Statistical Review, 84, 43–62 (2016)], adding a recently published statistic, the so-called 'standardized measure of unadjusted bias', which explicitly quantifies the extent of bias under the assumption that a specified amount of non-ignorable selection exists. Our findings suggest that this new sensitivity diagnostic is more strongly correlated with, and more predictive of, the true, unknown extent of selection bias than other diagnostics, even when the underlying assumed level of non-ignorability is incorrect.
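
    As a toy illustration of non-ignorable selection (a hypothetical setup, unrelated to the specific simulation design extended in the paper), the following sketch shows how a sample mean is biased when the inclusion probability depends on the outcome itself:

        import numpy as np

        rng = np.random.default_rng(0)

        # Population outcome and a selection mechanism that favors large y.
        y = rng.normal(loc=0.0, scale=1.0, size=100_000)
        p_select = 1.0 / (1.0 + np.exp(-(-1.0 + 1.5 * y)))  # depends on y itself
        selected = rng.random(y.size) < p_select

        print("population mean:", round(y.mean(), 3))
        print("non-probability sample mean:", round(y[selected].mean(), 3))

    Because selection depends on the unobserved outcome, no adjustment based on observed covariates can remove this bias, which is why sensitivity-style diagnostics such as the standardized measure of unadjusted bias are useful.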

    Tests for Gene-Environment Interactions and Joint Effects with Exposure Misclassification

    Get PDF
    The number of methods for genome-wide testing of gene-environment interactions (GEI) continues to increase with the hope of discovering new genetic risk factors and obtaining insight into the disease-gene-environment relationship. The relative performance of these methods, based on family-wise type I error rate and power, depends on underlying disease-gene-environment associations, estimates of which may be biased in the presence of exposure misclassification. This simulation study expands on a previously published simulation study of methods for detecting GEI by evaluating the impact of exposure misclassification. We consider seven single-step and modular screening methods for identifying GEI at a genome-wide level and seven joint tests for genetic association and GEI, for which the goal is to discover new genetic susceptibility loci by leveraging GEI when present. In terms of statistical power, modular methods that screen based on the marginal disease-gene relationship are more robust to exposure misclassification. Joint tests that include main/marginal effects of a gene display a similar robustness, confirming results from earlier studies. Our results offer an increased understanding of the strengths and limitations of methods for genome-wide searches for GEI and of joint tests in the presence of exposure misclassification. KEY WORDS: case-control; genome-wide association; gene discovery; gene-environment independence; modular methods; multiple testing; screening test; weighted hypothesis test. Abbreviations: CC, case-control; CC(EXP), CC in the exposed subgroup; CO, case-only; CT, cocktail; DF, degree of freedom; D-G, disease-gene; EB, empirical Bayes; EB(EXP), EB in the exposed subgroup; EDGxE, joint marginal/association screening; FWER, family-wise error rate; G-E, gene-environment; GEI, gene-environment interaction; GEWIS, Gene Environment Wide Interaction Study; H2, hybrid two-step; LR, likelihood ratio; MA, marginal; OR, odds ratio; SE, sensitivity; SP, specificity; TS, two-step gene-environment screening.
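
    The misclassification mechanism studied above can be made concrete with a small sketch (hypothetical parameters, written in Python rather than taken from the study's own simulation code): a binary exposure is observed through an error-prone surrogate governed by sensitivity (SE) and specificity (SP):

        import numpy as np

        rng = np.random.default_rng(42)

        def misclassify(e_true, sensitivity=0.8, specificity=0.9):
            # Return an error-prone 0/1 surrogate of a true 0/1 exposure vector.
            observed = np.where(e_true == 1,
                                rng.random(e_true.size) < sensitivity,   # exposed kept with prob SE
                                rng.random(e_true.size) >= specificity)  # unexposed flipped with prob 1-SP
            return observed.astype(int)

        e_true = rng.binomial(1, 0.3, size=50_000)
        e_obs = misclassify(e_true)
        print("true exposure prevalence:", round(e_true.mean(), 3))
        print("observed exposure prevalence:", round(e_obs.mean(), 3))

    Substituting e_obs for e_true in the GEI or joint tests typically attenuates the estimated interaction, which is the mechanism behind the power comparisons summarized in the abstract.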

    Indices of non-ignorable selection bias for proportions estimated from non-probability samples

    Full text link
    Peer Reviewed
    https://deepblue.lib.umich.edu/bitstream/2027.42/151805/1/rssc12371_am.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/151805/2/rssc12371.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/151805/3/rssc12371-sup-0001-SupInfo.pd

    Propensity score‐based diagnostics for categorical response regression models

    Full text link
    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/102113/1/sim5940.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/102113/2/sim5940-sup-0001-supplementary.pd