Graphical Models for Inference Under Outcome-Dependent Sampling
We consider situations where data have been collected such that the sampling
depends on the outcome of interest and possibly further covariates, as for
instance in case-control studies. Graphical models represent assumptions about
the conditional independencies among the variables. By including a node for the
sampling indicator, assumptions about sampling processes can be made explicit.
We demonstrate how to read off such graphs whether consistent estimation of the
association between exposure and outcome is possible. Moreover, we give
sufficient graphical conditions for testing and estimating the causal effect of
exposure on outcome. The practical use is illustrated with a number of
examples. Comment: Published at http://dx.doi.org/10.1214/10-STS340 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
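As a small illustration of why some associations remain estimable under outcome-dependent sampling, the sketch below uses a textbook case-control setting rather than the paper's graphical criteria. The population counts and the sampling fraction for controls are made-up numbers chosen for the example. Sampling depends only on the outcome, so the odds ratio is unchanged while the risk among the exposed is badly distorted.

```python
from fractions import Fraction


def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Odds ratio of a 2x2 exposure-by-outcome table."""
    return (Fraction(exp_cases) * unexp_controls) / (Fraction(exp_controls) * unexp_cases)


# Hypothetical population counts (assumption, for illustration only).
population = dict(exp_cases=200, exp_controls=9800, unexp_cases=100, unexp_controls=9900)

# Outcome-dependent sampling: keep every case, keep 1 in 50 controls.
# Expected counts are used so the example is deterministic.
control_fraction = Fraction(1, 50)
sample = dict(
    exp_cases=population["exp_cases"],
    exp_controls=population["exp_controls"] * control_fraction,
    unexp_cases=population["unexp_cases"],
    unexp_controls=population["unexp_controls"] * control_fraction,
)

or_population = odds_ratio(**population)
or_sample = odds_ratio(**sample)

# Risk of the outcome among the exposed, in the population and in the sample.
risk_population = Fraction(population["exp_cases"],
                           population["exp_cases"] + population["exp_controls"])
risk_sample = sample["exp_cases"] / (sample["exp_cases"] + sample["exp_controls"])

print("odds ratio, population:", or_population)
print("odds ratio, sample:    ", or_sample)
print("risk among exposed, population:", float(risk_population))
print("risk among exposed, sample:    ", float(risk_sample))
```

The sampling fraction cancels in the odds ratio because it multiplies both control cells, which is the simplest instance of an association measure that survives selection on the outcome.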
Connectionist Inference Models
The performance of symbolic inference tasks has long been a challenge to connectionists. In this paper, we present an extended survey of this area. Existing connectionist inference systems are reviewed, with particular reference to how they perform variable binding and rule-based reasoning, and whether they involve distributed or localist representations. The benefits and disadvantages of different representations and systems are outlined, and conclusions drawn regarding the capabilities of connectionist inference systems when compared with symbolic inference systems or when used for cognitive modeling.
A Potential Tale of Two by Two Tables from Completely Randomized Experiments
Causal inference in completely randomized treatment-control studies with
binary outcomes is discussed from Fisherian, Neymanian and Bayesian
perspectives, using the potential outcomes framework. A randomization-based
justification of Fisher's exact test is provided. Arguing that the crucial
assumption of constant causal effect is often unrealistic, and holds only for
extreme cases, some new asymptotic and Bayesian inferential procedures are
proposed. The proposed procedures exploit the intrinsic non-additivity of
unit-level causal effects, can be applied to linear and non-linear estimands,
and dominate the existing methods, as verified theoretically and also through
simulation studies.
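To make the randomization argument concrete, here is a minimal sketch of the standard one-sided Fisher exact test, not the new procedures proposed in the paper. Under the sharp null of no effect for any unit, the outcomes are fixed and only the treatment assignment is random, so the number of successes in the treated group follows a hypergeometric distribution. The function name and the example table are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def fisher_exact_one_sided(treated_success, treated_failure,
                           control_success, control_failure):
    """Exact P(X >= observed treated successes) under the sharp null.

    With all margins fixed, every assignment of n_treated units out of
    the total is equally likely, which gives hypergeometric probabilities.
    """
    n_treated = treated_success + treated_failure
    n_control = control_success + control_failure
    n_success = treated_success + control_success
    n_assignments = comb(n_treated + n_control, n_success)
    largest = min(n_treated, n_success)
    return sum(
        Fraction(comb(n_treated, x) * comb(n_control, n_success - x), n_assignments)
        for x in range(treated_success, largest + 1)
    )


# Illustrative table: 3 of 4 treated units succeed, 1 of 4 control units succeeds.
p_value = fisher_exact_one_sided(3, 1, 1, 3)
print(p_value, float(p_value))
```

Exact fractions are used so the p-value is not affected by floating-point rounding. For larger tables a library routine such as `scipy.stats.fisher_exact` would be the more practical choice.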
Bayesian model search and multilevel inference for SNP association studies
Technological advances in genotyping have given rise to hypothesis-based
association studies of increasing scope. As a result, the scientific hypotheses
addressed by these studies have become more complex and more difficult to
address using existing analytic methodologies. Obstacles to analysis include
inference in the face of multiple comparisons, complications arising from
correlations among the SNPs (single nucleotide polymorphisms), choice of their
genetic parametrization and missing data. In this paper we present an efficient
Bayesian model search strategy that searches over the space of genetic markers
and their genetic parametrization. The resulting method for Multilevel
Inference of SNP Associations, MISA, allows computation of multilevel posterior
probabilities and Bayes factors at the global, gene and SNP level, with the
prior distribution on SNP inclusion in the model providing an intrinsic
multiplicity correction. We use simulated data sets to characterize MISA's
statistical power, and show that MISA has higher power to detect association
than standard procedures. Using data from the North Carolina Ovarian Cancer
Study (NCOCS), MISA identifies variants that were not identified by standard
methods and have been externally "validated" in independent studies. We
examine sensitivity of the NCOCS results to prior choice and method for
imputing missing data. MISA is available in an R package on CRAN. Comment: Published at http://dx.doi.org/10.1214/09-AOAS322 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
A hierarchical Bayesian model for inference of copy number variants and their association to gene expression
A number of statistical models have been successfully developed for the
analysis of high-throughput data from a single source, but few methods are
available for integrating data from different sources. Here we focus on
integrating gene expression levels with comparative genomic hybridization (CGH)
array measurements collected on the same subjects. We specify a measurement
error model that relates the gene expression levels to latent copy number
states which, in turn, are related to the observed surrogate CGH measurements
via a hidden Markov model. We employ selection priors that exploit the
dependencies across adjacent copy number states and investigate MCMC stochastic
search techniques for posterior inference. Our approach results in a unified
modeling framework for simultaneously inferring copy number variants (CNV) and
identifying their significant associations with mRNA transcript abundance. We
show performance on simulated data and illustrate an application to data from a
genomic study on human cancer cell lines. Comment: Published at http://dx.doi.org/10.1214/13-AOAS705 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)