5,944 research outputs found

    Modeling dependent gene expression

    In this paper we propose a Bayesian approach for inference about dependence in high-throughput gene expression. Our goals are to use prior knowledge about pathways to anchor inference about dependence among genes; to account for this dependence while making inferences about differences in mean expression across phenotypes; and to explore differences in the dependence itself across phenotypes. Useful features of the proposed approach are a model-based parsimonious representation of expression as an ordinal outcome, a novel and flexible representation of prior information on the nature of dependencies, and the use of a coherent probability model over both the structure and strength of the dependencies of interest. We evaluate our approach through simulations and in the analysis of data on expression of genes in the Complement and Coagulation Cascade pathway in ovarian cancer. Comment: Published at http://dx.doi.org/10.1214/11-AOAS525 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Generalized Species Sampling Priors with Latent Beta reinforcements

    Many popular Bayesian nonparametric priors can be characterized in terms of exchangeable species sampling sequences. However, in some applications, exchangeability may not be appropriate. We introduce a novel and probabilistically coherent family of non-exchangeable species sampling sequences characterized by a tractable predictive probability function with weights driven by a sequence of independent Beta random variables. We compare their theoretical clustering properties with those of the Dirichlet process and the two-parameter Poisson-Dirichlet process. Unlike existing work, the proposed construction provides a complete characterization of the joint process. We then propose the use of such a process as a prior distribution in a hierarchical Bayes modeling framework, and we describe a Markov Chain Monte Carlo sampler for posterior inference. We evaluate the performance of the prior and the robustness of the resulting inference in a simulation study, providing a comparison with popular Dirichlet process mixtures and Hidden Markov Models. Finally, we develop an application to the detection of chromosomal aberrations in breast cancer by leveraging array CGH data. Comment: Correspondence: Edoardo M. Airoldi, Federico Bassetti, Michele Guindani, Fabrizio Leisen. To appear in the Journal of the American Statistical Association
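For orientation, the exchangeable baseline that the abstract compares against can be sketched with the Dirichlet process predictive rule (the Chinese restaurant process). This is only an illustration of that baseline, not the paper's Beta-driven non-exchangeable construction; the function name `sample_crp` and the concentration parameter `alpha` are assumptions introduced here.

```python
import random


def sample_crp(n, alpha, seed=0):
    """Draw cluster labels for n items from a Chinese restaurant process.

    Assumed notation: alpha > 0 is the Dirichlet process concentration.
    Item i (0-based) joins existing cluster k with probability
    counts[k] / (i + alpha) and opens a new cluster with probability
    alpha / (i + alpha).
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items currently in cluster k
    for i in range(n):
        u = rng.random() * (i + alpha)
        acc = 0.0
        chosen = len(counts)  # default: open a new cluster
        for k, c in enumerate(counts):
            acc += c
            if u < acc:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(1)
        else:
            counts[chosen] += 1
        labels.append(chosen)
    return labels, counts
```

Because the weights depend only on cluster sizes, the resulting sequence is exchangeable; the abstract's proposal replaces these weights with ones driven by independent Beta random variables.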

    Action classification using a discriminative non-parametric hidden Markov model

    We classify human actions occurring in videos, using the skeletal joint positions extracted from a depth image sequence as features. Each action class is represented by a non-parametric Hidden Markov Model (NP-HMM) and the model parameters are learnt in a discriminative way. Specifically, we use a Bayesian framework based on the Hierarchical Dirichlet Process (HDP) to automatically infer the cardinality of hidden states and formulate a discriminative function based on the distance between Gaussian distributions to improve classification performance. We use elliptical slice sampling to efficiently sample parameters from the complex posterior distribution induced by our discriminative likelihood function. We illustrate our classification results for action class models trained using this technique

    Diffusive hidden Markov model characterization of DNA looping dynamics in tethered particle experiments

    In many biochemical processes, proteins bound to DNA at distant sites are brought into close proximity by loops in the underlying DNA. For example, the function of some gene-regulatory proteins depends on such DNA looping interactions. We present a new technique for characterizing the kinetics of loop formation in vitro, as observed using the tethered particle method, and apply it to experimental data on looping induced by lambda repressor. Our method uses a modified (diffusive) hidden Markov analysis that directly incorporates the Brownian motion of the observed tethered bead. We compare looping lifetimes found with our method (which we find are consistent over a range of sampling frequencies) to those obtained via the traditional threshold-crossing analysis (which can vary depending on how the raw data are filtered in the time domain). Our method does not involve any time filtering and can detect sudden changes in looping behavior. For example, we show how our method can identify transitions between long-lived, kinetically distinct states that would otherwise be difficult to discern
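The traditional threshold-crossing analysis mentioned in the abstract can be sketched as follows, to show why its dwell times depend on the time-domain filter. This is a toy illustration, not the paper's diffusive hidden Markov method; the function names, the boxcar filter, and the convention that a filtered value below the threshold means "looped" are assumptions made here.

```python
def moving_average(signal, window):
    """Smooth a trace with a trailing boxcar filter of the given length."""
    out = []
    running = 0.0
    for i, x in enumerate(signal):
        running += x
        if i >= window:
            running -= signal[i - window]
        out.append(running / min(i + 1, window))
    return out


def dwell_times(signal, threshold, window=1):
    """Return (state, duration) runs of the thresholded, filtered trace.

    state is True while the filtered value is below the threshold
    (assumed here to mean a looped tether with reduced bead excursion).
    """
    runs = []
    for value in moving_average(signal, window):
        state = value < threshold
        if runs and runs[-1][0] == state:
            runs[-1] = (state, runs[-1][1] + 1)
        else:
            runs.append((state, 1))
    return runs


# Hypothetical bead-excursion trace: unlooped, looped, unlooped.
trace = [1.0] * 3 + [0.2] * 4 + [1.0] * 2
unfiltered = dwell_times(trace, 0.6)
filtered = dwell_times(trace, 0.6, window=3)
```

Comparing `unfiltered` with `filtered` shows the transition times shifting once the trace is smoothed, which is the filter dependence the abstract attributes to threshold-crossing analysis.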

    You can't always sketch what you want: Understanding Sensemaking in Visual Query Systems

    Visual query systems (VQSs) empower users to interactively search for line charts with desired visual patterns, typically specified using intuitive sketch-based interfaces. Despite decades of past work on VQSs, these efforts have not translated to adoption in practice, possibly because VQSs are largely evaluated in unrealistic lab-based settings. To remedy this gap in adoption, we collaborated with experts from three diverse domains (astronomy, genetics, and material science) via a year-long user-centered design process to develop a VQS that supports their workflow and analytical needs, and evaluate how VQSs can be used in practice. Our study results reveal that ad-hoc sketch-only querying is not as commonly used as prior work suggests, since analysts are often unable to precisely express their patterns of interest. In addition, we characterize three essential sensemaking processes supported by our enhanced VQS. We discover that participants employ all three processes, but in different proportions, depending on the analytical needs in each domain. Our findings suggest that all three sensemaking processes must be integrated in order to make future VQSs useful for a wide range of analytical inquiries. Comment: Accepted for presentation at IEEE VAST 2019, to be held October 20-25 in Vancouver, Canada. Paper will also be published in a special issue of IEEE Transactions on Visualization and Computer Graphics (TVCG). IEEE VIS (InfoVis/VAST/SciVis) 2019. ACM 2012 CCS: Human-centered computing, Visualization, Visualization design and evaluation method

    A statistical analysis of memory CD8 T cell differentiation: An application of a hierarchical state space model to a short time course microarray experiment

    CD8 T cells are specialized immune cells that play an important role in the regulation of antiviral immune response and the generation of protective immunity. In this paper we investigate the differentiation of memory CD8 T cells in the immune response using a short time course microarray experiment. Structurally, this experiment is similar to many in that it involves measurements taken on independent samples, in one biological group, at a small number of irregularly spaced time points, and exhibiting patterns of temporal nonstationarity. To analyze this CD8 T-cell experiment, we develop a hierarchical state space model so that we can: (1) detect temporally differentially expressed genes, (2) identify the direction of successive changes over time, and (3) assess the magnitude of successive changes over time. We incorporate hidden Markov models into our model to utilize the information embedded in the time series and set up the proposed hierarchical state space model in an empirical Bayes framework to utilize the population information from the large-scale data. Analysis of the CD8 T-cell experiment using the proposed model results in biologically meaningful findings. Temporal patterns involved in the differentiation of memory CD8 T cells are summarized separately and performance of the proposed model is illustrated in a simulation study. Comment: Published at http://dx.doi.org/10.1214/07-AOAS118 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Conjugate Bayes for probit regression via unified skew-normal distributions

    Regression models for dichotomous data are ubiquitous in statistics. Besides being useful for inference on binary responses, these methods serve also as building blocks in more complex formulations, such as density regression, nonparametric classification and graphical models. Within the Bayesian framework, inference proceeds by updating the priors for the coefficients, typically set to be Gaussians, with the likelihood induced by probit or logit regressions for the responses. In this updating, the apparent absence of a tractable posterior has motivated a variety of computational methods, including Markov Chain Monte Carlo routines and algorithms which approximate the posterior. Despite being routinely implemented, Markov Chain Monte Carlo strategies face mixing or time-inefficiency issues in large p and small n studies, whereas approximate routines fail to capture the skewness typically observed in the posterior. This article proves that the posterior distribution for the probit coefficients has a unified skew-normal kernel, under Gaussian priors. Such a novel result allows efficient Bayesian inference for a wide class of applications, especially in large p and small-to-moderate n studies where state-of-the-art computational methods face notable issues. These advances are outlined in a genetic study, and further motivate the development of a wider class of conjugate priors for probit models along with methods to obtain independent and identically distributed samples from the unified skew-normal posterior
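The updating step described in this abstract can be written out explicitly. The notation below is an assumption introduced for illustration (binary responses $y_i$, covariate vectors $x_i$, Gaussian prior mean $\xi$ and covariance $\Omega$); the abstract itself does not fix symbols. The posterior kernel is a Gaussian density multiplied by a product of Gaussian distribution functions, which is the general shape of a unified skew-normal kernel.

```latex
% Assumed notation: y_i \in \{0,1\}, x_i \in \mathbb{R}^p, \beta \sim N_p(\xi, \Omega).
\begin{aligned}
p(y \mid \beta) &= \prod_{i=1}^{n} \Phi(x_i^{\top}\beta)^{y_i}
                   \bigl[1 - \Phi(x_i^{\top}\beta)\bigr]^{1 - y_i}
                 = \prod_{i=1}^{n} \Phi\bigl((2y_i - 1)\, x_i^{\top}\beta\bigr), \\
p(\beta \mid y) &\propto \phi_p(\beta - \xi;\, \Omega)\,
                   \prod_{i=1}^{n} \Phi\bigl((2y_i - 1)\, x_i^{\top}\beta\bigr).
\end{aligned}
```

The second equality in the first line uses $1 - \Phi(t) = \Phi(-t)$, so each observation contributes one Gaussian distribution function whose sign is set by $2y_i - 1$.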