29,807 research outputs found
A sparse conditional Gaussian graphical model for analysis of genetical genomics data
Genetical genomics experiments have now been routinely conducted to measure
both the genetic markers and gene expression data on the same subjects. The
gene expression levels are often treated as quantitative traits and are subject
to standard genetic analysis in order to identify the gene expression
quantitative loci (eQTL). However, the genetic architecture for many gene
expressions may be complex, and poorly estimated genetic architecture may
compromise the inferences of the dependency structures of the genes at the
transcriptional level. In this paper we introduce a sparse conditional Gaussian
graphical model for studying the conditional independent relationships among a
set of gene expressions adjusting for possible genetic effects where the gene
expressions are modeled with seemingly unrelated regressions. We present an
efficient coordinate descent algorithm to obtain the penalized estimation of
both the regression coefficients and the sparse concentration matrix. The
corresponding graph can be used to determine the conditional independence among
a group of genes while adjusting for shared genetic effects. Simulation
experiments and asymptotic convergence rates and sparsistency are used to
justify our proposed methods. By sparsistency, we mean the property that all
parameters that are zero are actually estimated as zero with probability
tending to one. We apply our methods to the analysis of a yeast eQTL data set
and demonstrate that the conditional Gaussian graphical model leads to a more
interpretable gene network than a standard Gaussian graphical model based on
gene expression data alone.Comment: Published in at http://dx.doi.org/10.1214/11-AOAS494 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
netgwas: An R Package for Network-Based Genome-Wide Association Studies
Graphical models are powerful tools for modeling and making statistical
inferences regarding complex associations among variables in multivariate data.
In this paper we introduce the R package netgwas, which is designed based on
undirected graphical models to accomplish three important and interrelated
goals in genetics: constructing linkage map, reconstructing linkage
disequilibrium (LD) networks from multi-loci genotype data, and detecting
high-dimensional genotype-phenotype networks. The netgwas package deals with
species with any chromosome copy number in a unified way, unlike other
software. It implements recent improvements in both linkage map construction
(Behrouzi and Wit, 2018), and reconstructing conditional independence network
for non-Gaussian continuous data, discrete data, and mixed
discrete-and-continuous data (Behrouzi and Wit, 2017). Such datasets routinely
occur in genetics and genomics such as genotype data, and genotype-phenotype
data. We demonstrate the value of our package functionality by applying it to
various multivariate example datasets taken from the literature. We show, in
particular, that our package allows a more realistic analysis of data, as it
adjusts for the effect of all other variables while performing pairwise
associations. This feature controls for spurious associations between variables
that can arise from classical multiple testing approach. This paper includes a
brief overview of the statistical methods which have been implemented in the
package. The main body of the paper explains how to use the package. The
package uses a parallelization strategy on multi-core processors to speed-up
computations for large datasets. In addition, it contains several functions for
simulation and visualization. The netgwas package is freely available at
https://cran.r-project.org/web/packages/netgwasComment: 32 pages, 9 figures; due to the limitation "The abstract field cannot
be longer than 1,920 characters", the abstract appearing here is slightly
shorter than that in the PDF fil
Generalized Network Psychometrics: Combining Network and Latent Variable Models
We introduce the network model as a formal psychometric model,
conceptualizing the covariance between psychometric indicators as resulting
from pairwise interactions between observable variables in a network structure.
This contrasts with standard psychometric models, in which the covariance
between test items arises from the influence of one or more common latent
variables. Here, we present two generalizations of the network model that
encompass latent variable structures, establishing network modeling as parts of
the more general framework of Structural Equation Modeling (SEM). In the first
generalization, we model the covariance structure of latent variables as a
network. We term this framework Latent Network Modeling (LNM) and show that,
with LNM, a unique structure of conditional independence relationships between
latent variables can be obtained in an explorative manner. In the second
generalization, the residual variance-covariance structure of indicators is
modeled as a network. We term this generalization Residual Network Modeling
(RNM) and show that, within this framework, identifiable models can be obtained
in which local independence is structurally violated. These generalizations
allow for a general modeling framework that can be used to fit, and compare,
SEM models, network models, and the RNM and LNM generalizations. This
methodology has been implemented in the free-to-use software package lvnet,
which contains confirmatory model testing as well as two exploratory search
algorithms: stepwise search algorithms for low-dimensional datasets and
penalized maximum likelihood estimation for larger datasets. We show in
simulation studies that these search algorithms performs adequately in
identifying the structure of the relevant residual or latent networks. We
further demonstrate the utility of these generalizations in an empirical
example on a personality inventory dataset.Comment: Published in Psychometrik
- …