Testing significance of features by lassoed principal components
We consider the problem of testing the significance of features in
high-dimensional settings. In particular, we test for differentially-expressed
genes in a microarray experiment. We wish to identify genes that are associated
with some type of outcome, such as survival time or cancer type. We propose a
new procedure, called Lassoed Principal Components (LPC), that builds upon
existing methods and can provide a sizable improvement. For instance, in the
case of two-class data, a standard (albeit simple) approach might be to compute
a two-sample t-statistic for each gene. The LPC method involves projecting
these conventional gene scores onto the eigenvectors of the gene expression
data covariance matrix and then applying an L1 penalty in order to de-noise
the resulting projections. We present a theoretical framework under which LPC
is the logical choice for identifying significant genes, and we show that LPC
can provide a marked reduction in false discovery rates over the conventional
methods on both real and simulated data. Moreover, this flexible procedure can
be applied to a variety of types of data and can be used to improve many
existing methods for the identification of significant features.
Comment: Published at http://dx.doi.org/10.1214/08-AOAS182 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
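The projection-and-shrinkage idea in the abstract above can be illustrated with a minimal sketch. It assumes the eigenvectors are already computed and orthonormal, in which case an L1-penalized least-squares fit of the scores reduces to soft-thresholding each projection coefficient. The names `soft_threshold`, `lasso_projection`, `basis`, and `lam` are hypothetical, and the abstract does not say how many eigenvectors are used or how the penalty is chosen, so this is not the authors' implementation.

```python
import math

def soft_threshold(value, lam):
    """Shrink value toward zero by lam, setting small values to zero."""
    return math.copysign(max(abs(value) - lam, 0.0), value)

def lasso_projection(scores, basis, lam):
    """Project scores onto orthonormal basis vectors, shrink, and rebuild."""
    # Coefficient of the scores along each basis vector (inner product).
    coefs = [sum(s * b for s, b in zip(scores, vec)) for vec in basis]
    # With an orthonormal basis, the L1-penalized least-squares solution
    # is obtained by soft-thresholding each coefficient separately.
    shrunk = [soft_threshold(c, lam) for c in coefs]
    # Denoised scores: the shrunken linear combination of basis vectors.
    return [sum(c * vec[i] for c, vec in zip(shrunk, basis))
            for i in range(len(scores))]

# Toy example: four gene scores and two hypothetical orthonormal vectors.
basis = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
scores = [3.0, 1.0, 3.2, 0.8]
denoised = lasso_projection(scores, basis, lam=0.5)
```

In this toy case the noisy scores are pulled toward the pattern shared across genes; a larger `lam` would zero out weaker components entirely.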
Tree-Values: selective inference for regression trees
We consider conducting inference on the output of the Classification and
Regression Tree (CART) [Breiman et al., 1984] algorithm. A naive approach to
inference that does not account for the fact that the tree was estimated from
the data will not achieve standard guarantees, such as Type 1 error rate
control and nominal coverage. Thus, we propose a selective inference framework
for conducting inference on a fitted CART tree. In a nutshell, we condition on
the fact that the tree was estimated from the data. We propose a test for the
difference in the mean response between a pair of terminal nodes that controls
the selective Type 1 error rate, and a confidence interval for the mean
response within a single terminal node that attains the nominal selective
coverage. Efficient algorithms for computing the necessary conditioning sets
are provided. We apply these methods in simulation and to a dataset involving
the association between portion control interventions and caloric intake.
w_{\infty} Algebras, Conformal Mechanics, and Black Holes
We discuss BPS solitons in gauged N=2, D=4 supergravity. The
solitons represent extremal black holes interpolating between different vacua
of anti-de Sitter spaces. The isometry superalgebras are determined and the
motion of a superparticle in the extremal black hole background is studied and
confronted with superconformal mechanics. We show that the Virasoro symmetry of
conformal mechanics, which describes the dynamics of the superparticle near the
horizon of the extremal black hole under consideration, extends to a symmetry
under the w_{\infty} algebra of area-preserving diffeomorphisms. We find that
a Virasoro subalgebra of w_{\infty} can be associated to the Virasoro algebra
of the asymptotic symmetries of AdS_2. In this way spacetime diffeomorphisms
of AdS_2 translate into diffeomorphisms in phase space: our system offers an
explicit realization of the AdS_2/CFT_1 correspondence. Using the
dimensionally reduced action, the central charge is computed. Finally, we also
present generalizations of superconformal mechanics which are invariant under
N=1 and N=2 superextensions of w_{\infty}.
Comment: Latex, 23 pages, minor errors corrected, references added; final
version to appear in Class. Quant. Gra
Fermions and noncommutative emergent gravity II: Curved branes in extra dimensions
We study fermions coupled to Yang-Mills matrix models from the point of view
of emergent gravity. The matrix model Dirac operator provides an appropriate
coupling for fermions to the effective gravitational metric for general branes
with nontrivial embedding, albeit with a non-standard spin connection. This
generalizes previous results for 4-dimensional matrix models. Integrating out
the fermions in a nontrivial geometrical background indeed induces the
Einstein-Hilbert action of the effective metric, as well as additional terms
which couple the Poisson tensor to the Riemann tensor, and a dilaton-like term.
Comment: 34 pages; minor change
On the assessment of statistical significance of three-dimensional colocalization of sets of genomic elements
A growing body of experimental evidence supports the hypothesis that the 3D structure of chromatin in the nucleus is closely linked to important functional processes, including DNA replication and gene regulation. In support of this hypothesis, several research groups have examined sets of functionally associated genomic loci, with the aim of determining whether those loci are statistically significantly colocalized. This work presents a critical assessment of two previously reported analyses, both of which used genome-wide DNA–DNA interaction data from the yeast Saccharomyces cerevisiae, and both of which rely upon a simple notion of the statistical significance of colocalization. We show that these previous analyses rely upon a faulty assumption, and we propose a correct non-parametric resampling approach to the same problem. Applying this approach to the same data set does not support the hypothesis that transcriptionally coregulated genes tend to colocalize, but strongly supports the colocalization of centromeres, and provides some evidence of colocalization of origins of early DNA replication, chromosomal breakpoints and transfer RNAs
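As a rough illustration of what a non-parametric resampling test for colocalization can look like, the sketch below compares the mean pairwise distance within a set of loci against randomly drawn sets of the same size. It is a generic resampling test under simplifying assumptions, not necessarily the scheme the authors propose: the abstract gives no details, the analyses it describes use DNA-DNA interaction data rather than coordinates, and the 3D points, function names, and parameters here are all hypothetical.

```python
import itertools
import math
import random

def mean_pairwise_distance(points):
    """Average Euclidean distance over all pairs of points."""
    pairs = list(itertools.combinations(points, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def resampling_p_value(all_points, subset_idx, n_resamples=500, seed=0):
    """Estimate how often random same-size subsets are at least as compact."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance([all_points[i] for i in subset_idx])
    hits = 0
    for _ in range(n_resamples):
        sample = rng.sample(range(len(all_points)), len(subset_idx))
        if mean_pairwise_distance([all_points[i] for i in sample]) <= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)

# Hypothetical toy data: five tightly clustered loci plus 45 scattered ones.
rng = random.Random(1)
cluster = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
           (0.0, 0.0, 0.1), (0.1, 0.1, 0.1)]
background = [tuple(rng.uniform(0, 10) for _ in range(3)) for _ in range(45)]
loci = cluster + background
p_clustered = resampling_p_value(loci, subset_idx=[0, 1, 2, 3, 4])
```

Because the null distribution is built from the data itself, no independence assumption between pairs of loci is needed, which is the usual motivation for resampling over a closed-form test.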