4,859 research outputs found
Application of new probabilistic graphical models in the genetic regulatory networks studies
This paper introduces two new probabilistic graphical models for
reconstruction of genetic regulatory networks using DNA microarray data. One is
an Independence Graph (IG) model with either a forward or a backward search
algorithm and the other one is a Gaussian Network (GN) model with a novel
greedy search method. The performances of both models were evaluated on four
MAPK pathways in yeast and three simulated data sets. Generally, an IG model
provides a sparse graph but a GN model produces a dense graph where more
information about gene-gene interactions is preserved. Additionally, we found
two key limitations in the prediction of genetic regulatory networks using DNA
microarray data: first, the sample size may be insufficient, and second, the
complexity of network structures may not be captured without additional
data at the protein level. These limitations are present in all prediction
methods that use only DNA microarray data.
Comment: 38 pages, 3 figures
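As a rough illustration of the independence-graph idea in the abstract above, the sketch below draws an edge between two genes when their partial correlation, given all other genes, exceeds a cutoff. This is a simplified stand-in, not the paper's forward or backward search; the function names, the 0.1 threshold, and the simulated three-gene chain are all assumptions made for the example.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation of each pair of columns given all other columns."""
    # Invert the sample covariance to get the precision matrix.
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def independence_graph(data, threshold=0.1):
    """Boolean adjacency matrix: edge where |partial correlation| > threshold.

    The fixed threshold is a hypothetical shortcut; a real analysis would
    use a model search or a significance test instead.
    """
    adjacency = np.abs(partial_correlations(data)) > threshold
    np.fill_diagonal(adjacency, False)
    return adjacency

# Simulated expression levels for a chain g1 -> g2 -> g3 (assumed toy data).
rng = np.random.default_rng(0)
n_samples = 5000
g1 = rng.normal(size=n_samples)
g2 = 0.8 * g1 + rng.normal(size=n_samples)
g3 = 0.8 * g2 + rng.normal(size=n_samples)
expression = np.column_stack([g1, g2, g3])

graph = independence_graph(expression)
print(graph.astype(int))
```

In this toy chain g1 and g3 are correlated but conditionally independent given g2, so the sketch is expected to keep the g1-g2 and g2-g3 edges and drop g1-g3, which is the kind of sparsity the abstract attributes to the IG model.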
Partially linear additive quantile regression in ultra-high dimension
We consider a flexible semiparametric quantile regression model for analyzing
high dimensional heterogeneous data. This model has several appealing features:
(1) By considering different conditional quantiles, we may obtain a more
complete picture of the conditional distribution of a response variable given
high dimensional covariates. (2) The sparsity level is allowed to be different
at different quantile levels. (3) The partially linear additive structure
accommodates nonlinearity and circumvents the curse of dimensionality. (4) It
is naturally robust to heavy-tailed distributions. In this paper, we
approximate the nonlinear components using B-spline basis functions. We first
study estimation under this model when the nonzero components are known in
advance and the number of covariates in the linear part diverges. We then
investigate a nonconvex penalized estimator for simultaneous variable selection
and estimation. We derive its oracle property for a general class of nonconvex
penalty functions in the presence of ultra-high dimensional covariates under
relaxed conditions. To tackle the challenges of a nonsmooth loss function, a
nonconvex penalty function, and nonlinear components, we combine
a recently developed convex-differencing method with modern empirical process
techniques. Monte Carlo simulations and an application to a microarray study
demonstrate the effectiveness of the proposed method. We also discuss how the
method for a single quantile of interest can be extended to simultaneous
variable selection and estimation at multiple quantiles.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1367 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
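The nonsmooth loss mentioned in the quantile regression abstract above is the standard check (pinball) loss, rho_tau(u) = u * (tau - 1{u < 0}). The sketch below only shows that minimizing this loss over a constant recovers the sample quantile, using heavy-tailed data; it is not the paper's penalized B-spline estimator, and the names `check_loss`, `candidates`, and `best`, along with the simulated data, are assumptions made for the example.

```python
import numpy as np

def check_loss(residuals, tau):
    """Sum of the quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    residuals = np.asarray(residuals, dtype=float)
    return np.sum(residuals * (tau - (residuals < 0)))

# Heavy-tailed toy response: Student t with 2 degrees of freedom (assumed data).
rng = np.random.default_rng(1)
y = rng.standard_t(df=2, size=501)
tau = 0.25

# The check loss is piecewise linear in the constant c, so a minimizer
# can always be found among the observed values.
candidates = np.sort(y)
losses = [check_loss(y - c, tau) for c in candidates]
best = candidates[int(np.argmin(losses))]

print(best, np.quantile(y, tau))
```

Because the loss grows only linearly in the residual, a few extreme observations move the fitted quantile far less than they would move a least-squares mean, which is the robustness property listed as feature (4).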