Conditional Spectral Analysis of Replicated Multiple Time Series with Application to Nocturnal Physiology
This article considers the problem of analyzing associations between power
spectra of multiple time series and cross-sectional outcomes when data are
observed from multiple subjects. The motivating application comes from sleep
medicine, where researchers are able to non-invasively record physiological
time series signals during sleep. The frequency patterns of these signals,
which can be quantified through the power spectrum, contain interpretable
information about biological processes. An important problem in sleep research
is drawing connections between power spectra of time series signals and
clinical characteristics; these connections are key to understanding biological
pathways through which sleep affects, and can be treated to improve, health.
Such analyses are challenging as they must overcome the complicated structure
of a power spectrum from multiple time series as a complex positive-definite
matrix-valued function. This article proposes a new approach to such analyses
based on a tensor-product spline model of Cholesky components of
outcome-dependent power spectra. The approach flexibly models power spectra as
nonparametric functions of frequency and outcome while preserving geometric
constraints. Formulated in a fully Bayesian framework, a Whittle likelihood
based Markov chain Monte Carlo (MCMC) algorithm is developed for automated
model fitting and for conducting inference on associations between outcomes and
spectral measures. The method is used to analyze data from a study of sleep in
older adults and uncovers new insights into how stress and arousal are
connected to the amount of time one spends in bed.
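As a rough illustration of the Whittle likelihood mentioned above, the following is a minimal sketch for a single (univariate) time series, which is a deliberate simplification of the multivariate, outcome-dependent setting the abstract describes. The function names `periodogram` and `whittle_log_likelihood` are hypothetical, and the normalization (periodogram divided by the series length, so that white noise with variance s has flat spectrum s) is an assumption rather than the article's convention.

```python
import cmath
import math


def periodogram(x):
    """Return (frequency, periodogram) pairs at Fourier frequencies in (0, pi).

    Assumed normalization: |DFT|^2 / n, after removing the sample mean.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    result = []
    for k in range(1, (n - 1) // 2 + 1):
        omega = 2.0 * math.pi * k / n
        dft = sum(v * cmath.exp(-1j * omega * t) for t, v in enumerate(centered))
        result.append((omega, abs(dft) ** 2 / n))
    return result


def whittle_log_likelihood(x, spectrum):
    """Whittle approximation to the log-likelihood, up to an additive constant.

    `spectrum` is any callable mapping a frequency to a positive spectral value.
    """
    return -sum(
        math.log(spectrum(omega)) + i_k / spectrum(omega)
        for omega, i_k in periodogram(x)
    )


# Usage: one full cosine cycle over 8 points, scored under a flat unit spectrum.
series = [math.cos(2.0 * math.pi * t / 8) for t in range(8)]
flat_score = whittle_log_likelihood(series, lambda omega: 1.0)
```

In an MCMC setting, a quantity of this form would be evaluated repeatedly for candidate spectra; the sketch only shows the likelihood evaluation itself, not the sampler or the spline model.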
Quantum Equilibrium and the Role of Operators as Observables in Quantum Theory
Bohmian mechanics is the most naively obvious embedding imaginable of
Schr\"odinger's equation into a completely coherent physical theory. It
describes a world in which particles move in a highly non-Newtonian sort of
way, one which may at first appear to have little to do with the spectrum of
predictions of quantum mechanics. It turns out, however, that as a consequence
of the defining dynamical equations of Bohmian mechanics, when a system has
wave function $\psi$ its configuration is typically random, with probability
density given by $|\psi|^2$, the quantum equilibrium distribution. It
also turns out that the entire quantum formalism, operators as observables and
all the rest, naturally emerges in Bohmian mechanics from the analysis of
``measurements.'' This analysis reveals the status of operators as observables
in the description of quantum phenomena, and facilitates a clear view of the
range of applicability of the usual quantum mechanical formulas. Comment: 77 pages
Mixed membership stochastic blockmodels
Observations consisting of measurements on relationships for pairs of objects
arise in many settings, such as protein interaction and gene regulatory
networks, collections of author-recipient email, and social networks. Analyzing
such data with probabilistic models can be delicate because the simple
exchangeability assumptions underlying many boilerplate models no longer hold.
In this paper, we describe a latent variable model of such data called the
mixed membership stochastic blockmodel. This model extends blockmodels for
relational data to ones which capture mixed membership latent relational
structure, thus providing an object-specific low-dimensional representation. We
develop a general variational inference algorithm for fast approximate
posterior inference. We explore applications to social and protein interaction
networks. Comment: 46 pages, 14 figures, 3 tables
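To make the idea of mixed membership concrete, here is a minimal sketch of how a network could be simulated from a model of this kind: each node draws a membership vector from a Dirichlet distribution, each ordered pair draws a sender block and a receiver block from the two nodes' membership vectors, and the link is a Bernoulli draw with probability taken from a block-by-block matrix. The function names, parameter names, and default seed are hypothetical and chosen only for illustration; the sketch covers simulation only, not the variational inference algorithm.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_category(probs, rng):
    """Draw a single category index with the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_mmsb(num_nodes, alpha, block_matrix, seed=0):
    """Simulate memberships and a directed adjacency matrix (no self-links)."""
    rng = random.Random(seed)
    memberships = [sample_dirichlet(alpha, rng) for _ in range(num_nodes)]
    adjacency = [[0] * num_nodes for _ in range(num_nodes)]
    for p in range(num_nodes):
        for q in range(num_nodes):
            if p == q:
                continue
            # Each node picks a block for this particular interaction.
            sender_block = sample_category(memberships[p], rng)
            receiver_block = sample_category(memberships[q], rng)
            link_prob = block_matrix[sender_block][receiver_block]
            adjacency[p][q] = 1 if rng.random() < link_prob else 0
    return memberships, adjacency


# Usage: two blocks with dense within-block and sparse between-block links.
memberships, adjacency = sample_mmsb(
    num_nodes=6, alpha=[0.5, 0.5], block_matrix=[[0.9, 0.1], [0.1, 0.9]]
)
```

Because the block is redrawn for every pair, a node can behave as a member of different blocks in different interactions, which is what distinguishes this from a blockmodel with a single fixed label per node.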
Spectral gene set enrichment (SGSE)
Motivation: Gene set testing is typically performed in a supervised context
to quantify the association between groups of genes and a clinical phenotype.
In many cases, however, a gene set-based interpretation of genomic data is
desired in the absence of a phenotype variable. Although methods exist for
unsupervised gene set testing, they predominantly compute enrichment relative
to clusters of the genomic variables with performance strongly dependent on the
clustering algorithm and number of clusters. Results: We propose a novel
method, spectral gene set enrichment (SGSE), for unsupervised competitive
testing of the association between gene sets and empirical data sources. SGSE
first computes the statistical association between gene sets and principal
components (PCs) using our principal component gene set enrichment (PCGSE)
method. The overall statistical association between each gene set and the
spectral structure of the data is then computed by combining the PC-level
p-values using the weighted Z-method with weights set to the PC variance scaled
by Tracy-Widom test p-values. Using simulated data, we show that the SGSE
algorithm can accurately recover spectral features from noisy data. To
illustrate the utility of our method on real data, we demonstrate the superior
performance of the SGSE method relative to standard cluster-based techniques
for testing the association between MSigDB gene sets and the variance structure
of microarray gene expression data. Availability:
http://cran.r-project.org/web/packages/PCGSE/index.html Contact:
[email protected] or [email protected]
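The step that combines PC-level p-values with the weighted Z-method can be sketched as follows. This is a generic illustration of the weighted Z-method (also known as the weighted Stouffer method) for one-sided p-values, not the PCGSE package's code; the function name `weighted_z_combine` is hypothetical, and the weights in the usage line are made-up numbers standing in for the scaled PC variances the abstract describes.

```python
from math import sqrt
from statistics import NormalDist


def weighted_z_combine(p_values, weights):
    """Combine one-sided p-values using the weighted Z-method.

    Each p-value must lie strictly between 0 and 1.
    """
    if not p_values or len(p_values) != len(weights):
        raise ValueError("p_values and weights must be non-empty and equal length")
    std_normal = NormalDist()
    # Convert each p-value to a Z-score: small p gives large positive Z.
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    combined_z = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    # Convert the combined Z-score back to a one-sided p-value.
    return 1.0 - std_normal.cdf(combined_z)


# Usage: three per-component p-values with illustrative (assumed) weights.
combined_p = weighted_z_combine([0.01, 0.20, 0.60], [0.5, 0.3, 0.2])
```

Components with larger weights dominate the combined statistic, so a gene set that is strongly associated with a high-weight component can remain significant even when its association with low-weight components is weak.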
Multinomial Inverse Regression for Text Analysis
Text data, including speeches, stories, and other document forms, are often
connected to sentiment variables that are of interest for research in
marketing, economics, and elsewhere. Such data are also very high dimensional and
difficult to incorporate into statistical analyses. This article introduces a
straightforward framework of sentiment-preserving dimension reduction for text
data. Multinomial inverse regression is introduced as a general tool for
simplifying predictor sets that can be represented as draws from a multinomial
distribution, and we show that logistic regression of phrase counts onto
document annotations can be used to obtain low dimension document
representations that are rich in sentiment information. To facilitate this
modeling, a novel estimation technique is developed for multinomial logistic
regression with very high-dimension response. In particular, independent
Laplace priors with unknown variance are assigned to each regression
coefficient, and we detail an efficient routine for maximization of the joint
posterior over coefficients and their prior scale. This "gamma-lasso" scheme
yields stable and effective estimation for general high-dimension logistic
regression, and we argue that it will be superior to current methods in many
settings. Guidelines for prior specification are provided, algorithm
convergence is detailed, and estimator properties are outlined from the
perspective of the literature on non-concave likelihood penalization. Related
work on sentiment analysis from statistics, econometrics, and machine learning
is surveyed and connected. Finally, the methods are applied in two detailed
examples and we provide out-of-sample prediction studies to illustrate their
effectiveness. Comment: Published in the Journal of the American Statistical Association 108,
2013, with discussion (rejoinder is here: http://arxiv.org/abs/1304.4200).
Software is available in the textir package for
- …