5,944 research outputs found
Modeling dependent gene expression
In this paper we propose a Bayesian approach for inference about dependence
of high throughput gene expression. Our goals are to use prior knowledge about
pathways to anchor inference about dependence among genes; to account for this
dependence while making inferences about differences in mean expression across
phenotypes; and to explore differences in the dependence itself across
phenotypes. Useful features of the proposed approach are a model-based
parsimonious representation of expression as an ordinal outcome, a novel and
flexible representation of prior information on the nature of dependencies, and
the use of a coherent probability model over both the structure and strength of
the dependencies of interest. We evaluate our approach through simulations and
in the analysis of data on expression of genes in the Complement and
Coagulation Cascade pathway in ovarian cancer.Comment: Published in at http://dx.doi.org/10.1214/11-AOAS525 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Generalized Species Sampling Priors with Latent Beta reinforcements
Many popular Bayesian nonparametric priors can be characterized in terms of
exchangeable species sampling sequences. However, in some applications,
exchangeability may not be appropriate. We introduce a {novel and
probabilistically coherent family of non-exchangeable species sampling
sequences characterized by a tractable predictive probability function with
weights driven by a sequence of independent Beta random variables. We compare
their theoretical clustering properties with those of the Dirichlet Process and
the two parameters Poisson-Dirichlet process. The proposed construction
provides a complete characterization of the joint process, differently from
existing work. We then propose the use of such process as prior distribution in
a hierarchical Bayes modeling framework, and we describe a Markov Chain Monte
Carlo sampler for posterior inference. We evaluate the performance of the prior
and the robustness of the resulting inference in a simulation study, providing
a comparison with popular Dirichlet Processes mixtures and Hidden Markov
Models. Finally, we develop an application to the detection of chromosomal
aberrations in breast cancer by leveraging array CGH data.Comment: For correspondence purposes, Edoardo M. Airoldi's email is
[email protected]; Federico Bassetti's email is
[email protected]; Michele Guindani's email is
[email protected] ; Fabrizo Leisen's email is
[email protected]. To appear in the Journal of the American
Statistical Associatio
Action classification using a discriminative non-parametric hidden Markov model
We classify human actions occurring in videos, using the skeletal joint positions extracted from a depth image sequence as features. Each action class is represented by a non-parametric Hidden Markov Model (NP-HMM) and the model parameters are learnt in a discriminative way. Specifically, we use a Bayesian framework based on Hierarchical Dirichlet Process (HDP) to automatically infer the cardinality of hidden states and formulate a discriminative function based on distance between Gaussian distributions to improve classification performance. We use elliptical slice sampling to efficiently sample parameters from the complex posterior distribution induced by our discriminative likelihood function. We illustrate our classification results for action class models trained using this technique
Diffusive hidden Markov model characterization of DNA looping dynamics in tethered particle experiments
In many biochemical processes, proteins bound to DNA at distant sites are
brought into close proximity by loops in the underlying DNA. For example, the
function of some gene-regulatory proteins depends on such DNA looping
interactions. We present a new technique for characterizing the kinetics of
loop formation in vitro, as observed using the tethered particle method, and
apply it to experimental data on looping induced by lambda repressor. Our
method uses a modified (diffusive) hidden Markov analysis that directly
incorporates the Brownian motion of the observed tethered bead. We compare
looping lifetimes found with our method (which we find are consistent over a
range of sampling frequencies) to those obtained via the traditional
threshold-crossing analysis (which can vary depending on how the raw data are
filtered in the time domain). Our method does not involve any time filtering
and can detect sudden changes in looping behavior. For example, we show how our
method can identify transitions between long-lived, kinetically distinct states
that would otherwise be difficult to discern
You can't always sketch what you want: Understanding Sensemaking in Visual Query Systems
Visual query systems (VQSs) empower users to interactively search for line
charts with desired visual patterns, typically specified using intuitive
sketch-based interfaces. Despite decades of past work on VQSs, these efforts
have not translated to adoption in practice, possibly because VQSs are largely
evaluated in unrealistic lab-based settings. To remedy this gap in adoption, we
collaborated with experts from three diverse domains---astronomy, genetics, and
material science---via a year-long user-centered design process to develop a
VQS that supports their workflow and analytical needs, and evaluate how VQSs
can be used in practice. Our study results reveal that ad-hoc sketch-only
querying is not as commonly used as prior work suggests, since analysts are
often unable to precisely express their patterns of interest. In addition, we
characterize three essential sensemaking processes supported by our enhanced
VQS. We discover that participants employ all three processes, but in different
proportions, depending on the analytical needs in each domain. Our findings
suggest that all three sensemaking processes must be integrated in order to
make future VQSs useful for a wide range of analytical inquiries.Comment: Accepted for presentation at IEEE VAST 2019, to be held October 20-25
in Vancouver, Canada. Paper will also be published in a special issue of IEEE
Transactions on Visualization and Computer Graphics (TVCG) IEEE VIS
(InfoVis/VAST/SciVis) 2019 ACM 2012 CCS - Human-centered computing,
Visualization, Visualization design and evaluation method
A statistical analysis of memory CD8 T cell differentiation: An application of a hierarchical state space model to a short time course microarray experiment
CD8 T cells are specialized immune cells that play an important role in the
regulation of antiviral immune response and the generation of protective
immunity. In this paper we investigate the differentiation of memory CD8 T
cells in the immune response using a short time course microarray experiment.
Structurally, this experiment is similar to many in that it involves
measurements taken on independent samples, in one biological group, at a small
number of irregularly spaced time points, and exhibiting patterns of temporal
nonstationarity. To analyze this CD8 T-cell experiment, we develop a
hierarchical state space model so that we can: (1) detect temporally
differentially expressed genes, (2) identify the direction of successive
changes over time, and (3) assess the magnitude of successive changes over
time. We incorporate hidden Markov models into our model to utilize the
information embedded in the time series and set up the proposed hierarchical
state space model in an empirical Bayes framework to utilize the population
information from the large-scale data. Analysis of the CD8 T-cell experiment
using the proposed model results in biologically meaningful findings. Temporal
patterns involved in the differentiation of memory CD8 T cells are summarized
separately and performance of the proposed model is illustrated in a simulation
study.Comment: Published in at http://dx.doi.org/10.1214/07-AOAS118 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Conjugate Bayes for probit regression via unified skew-normal distributions
Regression models for dichotomous data are ubiquitous in statistics. Besides
being useful for inference on binary responses, these methods serve also as
building blocks in more complex formulations, such as density regression,
nonparametric classification and graphical models. Within the Bayesian
framework, inference proceeds by updating the priors for the coefficients,
typically set to be Gaussians, with the likelihood induced by probit or logit
regressions for the responses. In this updating, the apparent absence of a
tractable posterior has motivated a variety of computational methods, including
Markov Chain Monte Carlo routines and algorithms which approximate the
posterior. Despite being routinely implemented, Markov Chain Monte Carlo
strategies face mixing or time-inefficiency issues in large p and small n
studies, whereas approximate routines fail to capture the skewness typically
observed in the posterior. This article proves that the posterior distribution
for the probit coefficients has a unified skew-normal kernel, under Gaussian
priors. Such a novel result allows efficient Bayesian inference for a wide
class of applications, especially in large p and small-to-moderate n studies
where state-of-the-art computational methods face notable issues. These
advances are outlined in a genetic study, and further motivate the development
of a wider class of conjugate priors for probit models along with methods to
obtain independent and identically distributed samples from the unified
skew-normal posterior
- …