
    A hierarchical Bayesian model for inference of copy number variants and their association to gene expression

    A number of statistical models have been successfully developed for the analysis of high-throughput data from a single source, but few methods are available for integrating data from different sources. Here we focus on integrating gene expression levels with comparative genomic hybridization (CGH) array measurements collected on the same subjects. We specify a measurement error model that relates the gene expression levels to latent copy number states which, in turn, are related to the observed surrogate CGH measurements via a hidden Markov model. We employ selection priors that exploit the dependencies across adjacent copy number states and investigate MCMC stochastic search techniques for posterior inference. Our approach results in a unified modeling framework for simultaneously inferring copy number variants (CNV) and identifying their significant associations with mRNA transcript abundance. We show performance on simulated data and illustrate an application to data from a genomic study on human cancer cell lines. Comment: Published at http://dx.doi.org/10.1214/13-AOAS705 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
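    The abstract does not give the model equations; the following is a minimal LaTeX sketch of the two-layer structure it describes, with all notation assumed rather than taken from the paper:

        % Expression of gene g in subject i, linked to a latent copy number xi_{is}
        y_{ig} = \mu_g + \beta_g \, \xi_{is} + \varepsilon_{ig}, \qquad \varepsilon_{ig} \sim N(0, \sigma_g^2)
        % Observed CGH log-ratio for region s is a noisy surrogate of the state
        x_{is} \mid \xi_{is} = k \;\sim\; N(\nu_k, \tau_k^2)
        % States follow a hidden Markov chain along the genome
        P(\xi_{i,s+1} = l \mid \xi_{is} = k) = a_{kl}, \qquad k, l \in \{\text{loss}, \text{neutral}, \text{gain}\}

    A spike-and-slab selection prior on each \beta_g, with dependence across adjacent copy number states, would then determine which CNV-transcript associations are declared significant.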

    Modeling Belief in Dynamic Systems, Part II: Revision and Update

    The study of belief change has been an active area in philosophy and AI. In recent years two special cases of belief change, belief revision and belief update, have been studied in detail. In a companion paper (Friedman & Halpern, 1997), we introduce a new framework to model belief change. This framework combines temporal and epistemic modalities with a notion of plausibility, allowing us to examine the change of beliefs over time. In this paper, we show how belief revision and belief update can be captured in our framework. This allows us to compare the assumptions made by each method, and to better understand the principles underlying them. In particular, it shows that Katsuno and Mendelzon's notion of belief update (Katsuno & Mendelzon, 1991a) depends on several strong assumptions that may limit its applicability in artificial intelligence. Finally, our analysis allows us to identify a notion of minimal change that underlies a broad range of belief change operations, including revision and update. Comment: See http://www.jair.org/ for other files accompanying this article.
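    To make the contrast concrete, the standard Katsuno-Mendelzon-style semantics for the two operations (textbook notation, not reproduced from this paper) can be written as:

        % Revision: one preorder \le_\psi faithful to the belief set as a whole
        \mathrm{Mod}(\psi \circ \mu) = \min\bigl(\mathrm{Mod}(\mu), \le_\psi\bigr)
        % Update: a preorder \le_w per world, applied pointwise and unioned
        \mathrm{Mod}(\psi \diamond \mu) = \bigcup_{w \in \mathrm{Mod}(\psi)} \min\bigl(\mathrm{Mod}(\mu), \le_w\bigr)

    Revision moves to the \mu-worlds closest to the current belief set taken as a whole; update moves every candidate world separately to its own closest \mu-worlds. This pointwise notion of minimal change is the kind of strong assumption whose applicability the paper examines.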

    Time to reject the privileging of economic theory over empirical evidence? A Reply to Lawson (2009)

    The present financial and economic crisis has revealed a systemic failure of academic economics and emphasized the need to re-think how to model economic phenomena. Lawson (2009) seems concerned that critics of standard models will now fill academic journals with contributions that make the same methodological mistakes, albeit in slightly different guise. In particular, he is rather sceptical about the use of mathematical statistical models, such as the CVAR approach, as a way of learning about economic mechanisms. In this paper I discuss whether this is a relevant claim and argue that it is likely to be based on a misunderstanding of what a proper statistical analysis is and can offer. In particular, I argue that the strong evidence of (near) unit roots and (structural) breaks in economic variables suggests that standard economic models need to be modified or changed to incorporate these strong features of the data. Furthermore, I argue that a strong empirical methodology that allows data to speak freely about economic mechanisms, such as the CVAR, would ensure that important information in the data is not overlooked when needed. Adequately applied, such models would provide us with an early warning system signalling that the economy is moving seriously out of equilibrium.
    Keywords: economic crisis; Dahlem report; CVAR approach; Theory-first; Reality-first; Imperfect Knowledge Expectations; non-stationary data
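    As a concrete illustration of the unit-root evidence the author appeals to, here is a minimal Python sketch (using statsmodels; the simulated series is a stand-in for an actual macroeconomic variable):

        import numpy as np
        from statsmodels.tsa.stattools import adfuller

        # Stand-in for a macro series such as log real GDP; replace with real data.
        rng = np.random.default_rng(0)
        y = np.cumsum(rng.normal(size=200))  # a random walk, i.e. an exact unit root

        # Augmented Dickey-Fuller test: H0 = the series has a unit root.
        stat, pvalue, usedlag, nobs, crit, _ = adfuller(y, regression="c", autolag="AIC")
        print(f"ADF statistic = {stat:.2f}, p-value = {pvalue:.3f}")
        # A large p-value means the unit root cannot be rejected; fitting such a
        # series with a stationary specification discards exactly the feature that
        # the CVAR approach is designed to represent via cointegration relations.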

    Joint state-parameter estimation of a nonlinear stochastic energy balance model from sparse noisy data

    While nonlinear stochastic partial differential equations arise naturally in spatiotemporal modeling, inference for such systems often faces two major challenges: sparse noisy data and ill-posedness of the inverse problem of parameter estimation. To overcome these challenges, we introduce a strongly regularized posterior by normalizing the likelihood and by imposing physical constraints through priors on the parameters and states. We investigate joint parameter-state estimation under the regularized posterior in a physically motivated nonlinear stochastic energy balance model (SEBM) for paleoclimate reconstruction. The high-dimensional posterior is sampled by a particle Gibbs sampler that combines MCMC with an optimal particle filter exploiting the structure of the SEBM. In tests using either Gaussian or uniform priors based on the physical range of the parameters, the regularized posteriors overcome the ill-posedness and lead to samples within physical ranges, quantifying the uncertainty in estimation. Due to the ill-posedness and the regularization, the posterior of the parameters exhibits relatively large uncertainty; consequently, the maximum of the posterior, which is the minimizer in a variational approach, can vary substantially. In contrast, the posterior of the states generally concentrates near the truth, substantially filtering out observation noise and reducing uncertainty in the unconstrained SEBM.
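    The abstract does not state the regularized posterior explicitly; one hedged reading of "normalizing the likelihood and imposing physical constraints through priors", in assumed notation, is:

        % n = number of observations; u = latent SEBM states; y = sparse noisy data
        \pi_R(\theta, u \mid y) \;\propto\; p(y \mid u, \theta)^{1/n} \, p(u \mid \theta) \, \pi_0(\theta)

    with \pi_0 a Gaussian or uniform prior truncated to the physically admissible range, so the (tempered) likelihood cannot pull samples outside it. The particle Gibbs sampler then alternates between drawing the states u given \theta with the optimal particle filter and updating \theta given u by MCMC.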

    Missing Data and Variable Selection Methods for Cure Models in Cancer Research

    In survival analysis, a common assumption is that all subjects will eventually experience the event of interest given long enough follow-up time. However, there are many settings in which this assumption does not hold. For example, suppose we are interested in studying cancer recurrence. If the treatment eradicated the cancer for some patients, then there will be a subset of the population that will never experience a recurrence. We call these subjects “cured.” The Cox proportional hazards (CPH) mixture cure model and a generalization, the multistate cure model, can be used to model time-to-event outcomes in the cure setting. In this dissertation, we will address issues of missing data, variable selection, and parameter estimation for these models. We will also explore issues of missing covariate and outcome data for a more general class of models, of which cure models are a particular case. In Chapter II, we propose several chained equations methods for imputing missing covariates under the CPH mixture cure model, and we compare the novel approaches with existing chained equations methods for imputing survival data without a cured fraction. In Chapter III, we develop sequential imputation methods for a general class of models with latent and partially latent variables (of which cure models are an example). In particular, we consider the setting where covariate/outcome missingness depends on the latent variable, which is a missing not at random mechanism. In Chapter IV, we develop an EM algorithm for fitting the multistate cure model. The existing method for fitting this model requires custom software and can be slow to converge. In contrast, the proposed method can be easily implemented using standard software and typically converges quickly. We further propose a Monte Carlo EM algorithm for fitting the multistate cure model in the presence of covariate missingness and/or unequal censoring of the outcomes. In Chapter V, we propose a generalization of the multistate cure model to incorporate subjects with persistent disease. This model has many parameters, and variable selection/shrinkage methods are needed to aid in estimation. We compare the performance of existing variable selection/shrinkage methods in estimating model parameters for a study of head and neck cancer. In Chapter VI, we develop Bayesian methods for performing variable selection when we have order restrictions for model parameters. In particular, we consider the setting in which we have interactions with one or more order-restricted variables. A simulation study demonstrates promising properties of the proposed selection method. PhD dissertation, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/144010/1/lbeesley_1.pd
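    For context, the CPH mixture cure model the dissertation builds on has the standard two-part form (standard notation, not taken from the dissertation itself):

        % pi(z): probability of being cured, usually a logistic regression in z
        % S_u(t|x): proportional hazards survival function for the uncured
        S_{\text{pop}}(t \mid x, z) = \pi(z) + \bigl(1 - \pi(z)\bigr) \, S_u(t \mid x),
        \qquad S_u(t \mid x) = S_0(t)^{\exp(x^\top \beta)}

    Because the cure indicator is latent for censored subjects, fitting naturally proceeds by EM-type algorithms, and missingness that depends on that latent variable is missing not at random, which is exactly the complication Chapters II-IV address.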

    Ambiguity and credence quality: Implications for technology adoption

    Use of fertilizer and hybrid seed remains low in much of Sub-Saharan Africa. A possible contributor to low adoption is that farmers are uncertain about the quality of agricultural inputs available to them. While previous studies have shown that risk and uncertainty preferences are relevant to the decision to adopt a technology, existing research assumes that farmers have homogeneous beliefs about the quality of available inputs. I test this assumption using an incentivized Becker-DeGroot-Marschak auction in Tanzania and examine how farmer beliefs about mineral fertilizer quality in local markets influence their willingness-to-pay. I find that farmers are willing to pay 46% more for fertilizer that was laboratory tested and found to be pure than for untested fertilizer. Farmers who believe that more of the fertilizer for sale in their local market is low in quality are willing to pay a higher premium for laboratory-tested pure quality fertilizer, compared to untested fertilizer. Yet these results present something of a puzzle, given that three rounds of testing of fertilizer for sale in regional markets over five years have demonstrated that the nutrient content of fertilizer for sale in these contexts is consistently at or near advertised levels. Farmers appear to believe that low-quality fertilizer is far more prevalent in proximate markets than it actually is. How have farmers’ incorrect beliefs persisted in equilibrium? I posit two interconnected mechanisms. First, misattribution: yields are stochastic due to weather and other factors, and when a yield in a particular year is unusually low, farmers misattribute noise as indicative of low-quality fertilizer. Second, farmers experience both risk (uncertainty about whether a bag of fertilizer is bad) and ambiguity (uncertainty about the likelihood a bag of fertilizer is bad), and thus hold multiple priors. I develop a Bayesian learning model that incorporates both misattribution and multiple priors and show that in equilibrium beliefs do not converge to the truth. Supporting the model's findings, I use farmer survey data from Uganda to establish that historic precipitation variability relates to farmers’ fertilizer quality belief distributions. I use the learning model to simulate several policy interventions, and show that subsidies, information campaigns, and plot-specific fertilizer recommendations improve beliefs, but do not cause beliefs to fully converge to the truth. Instead, policy makers should consider programs that address the misattribution problem.
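    A toy Python simulation of the misattribution mechanism (all numbers hypothetical, not estimates from the paper) shows how a Beta posterior on the share of bad bags settles far above the truth when bad seasons are blamed on the fertilizer:

        import numpy as np

        rng = np.random.default_rng(1)
        true_bad_share = 0.02      # hypothetical: fertilizer is almost always good
        a, b = 1.0, 1.0            # Beta(1, 1) prior on the share of bad bags

        for _ in range(200):       # 200 hypothetical seasons
            bag_is_bad = rng.random() < true_bad_share
            # Yield is mostly weather noise; the bag shifts the mean only modestly.
            season_yield = rng.normal(loc=0.7 if bag_is_bad else 1.0, scale=0.3)
            # Misattribution: every sufficiently bad season is blamed on the bag.
            blamed_bad = season_yield < 0.7
            a += 1.0 if blamed_bad else 0.0
            b += 0.0 if blamed_bad else 1.0

        print(f"true share of bad bags: {true_bad_share:.2f}")
        print(f"posterior mean belief : {a / (a + b):.2f}")  # ~0.17, biased upward

    Because low yields caused by weather are recorded as evidence of bad fertilizer, the posterior concentrates on a wrong value rather than converging to the truth, which is the flavor of the paper's equilibrium result.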

    Should we screen for the sexually-transmitted infection Mycoplasma genitalium? Evidence synthesis using a transmission-dynamic model

    There is increasing concern about Mycoplasma genitalium as a cause of urethritis, cervicitis, pelvic inflammatory disease (PID), infertility and ectopic pregnancy. Commercial nucleic acid amplification tests (NAATs) are becoming available, and their use in screening for M. genitalium has been advocated, but M. genitalium's natural history is poorly understood, making screening's effectiveness unclear. We used a transmission-dynamic compartmental model to synthesise evidence from surveillance data and epidemiological and behavioural studies to better understand M. genitalium's natural history, and then examined the effects of implementing NAAT testing. Introducing NAAT testing initially increases diagnoses, by finding a larger proportion of infections; subsequently the diagnosis rate falls, due to reduced incidence. Testing only symptomatic patients finds relatively little infection in women, as a large proportion is asymptomatic. Testing both symptomatic and asymptomatic patients has a much larger impact and reduces the cumulative incidence of PID due to M. genitalium in women by 31.1% (95% range: 13.0%-52.0%) over 20 years. However, there is important uncertainty in M. genitalium's natural history parameters, leading to uncertainty in the absolute reduction in PID and sequelae. Empirical work is required to improve understanding of key aspects of M. genitalium's natural history before it will be possible to determine the effectiveness of screening.
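    The paper's transmission-dynamic model is far richer, but a minimal Python sketch of the core machinery, an SIS compartmental model in which NAAT testing adds an extra clearance route (all parameter values hypothetical), looks like this:

        import numpy as np
        from scipy.integrate import solve_ivp

        beta = 0.6    # hypothetical transmission rate (per day)
        gamma = 0.2   # hypothetical natural clearance rate (per day)
        tau = 0.1     # hypothetical testing-and-treatment rate (per day)

        def sis(t, y):
            s, i = y                       # susceptible and infected fractions
            new_inf = beta * s * i
            clearance = (gamma + tau) * i  # testing shortens the infectious period
            return [-new_inf + clearance, new_inf - clearance]

        sol = solve_ivp(sis, (0.0, 365.0), [0.98, 0.02], t_eval=np.linspace(0, 365, 8))
        print(np.round(sol.y[1], 3))  # infected fraction over time

    Raising tau (e.g. by testing asymptomatic as well as symptomatic patients) lowers the endemic prevalence, which is the mechanism behind the modeled fall in PID incidence; the paper's uncertainty about natural history corresponds roughly to uncertainty in gamma and in the compartment structure itself.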