Covariate dimension reduction for survival data via the Gaussian process latent variable model
The analysis of high dimensional survival data is challenging, primarily due
to the problem of overfitting, which occurs when spurious relationships are
inferred from training data and subsequently fail to hold in test data. Here we
propose a novel method of extracting a low dimensional representation of
covariates in survival data by combining the popular Gaussian Process Latent
Variable Model (GPLVM) with a Weibull Proportional Hazards Model (WPHM). The
combined model offers a flexible non-linear probabilistic method of detecting
and extracting any intrinsic low dimensional structure from high dimensional
data. By reducing the covariate dimension we aim to diminish the risk of
overfitting and increase the robustness and accuracy with which we infer
relationships between covariates and survival outcomes. In addition, we can
simultaneously combine information from multiple data sources by expressing
multiple datasets in terms of the same low dimensional space. We present
results from several simulation studies that illustrate a reduction in
overfitting and an increase in predictive performance, as well as successful
detection of intrinsic dimensionality. We provide evidence that it is
advantageous to combine dimensionality reduction with survival outcomes rather
than performing unsupervised dimensionality reduction on its own. Finally, we
use our model to analyse experimental gene expression data and detect and
extract a low dimensional representation that allows us to distinguish high and
low risk groups with greater accuracy than regression on the
original high dimensional data.
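As a rough illustration of the survival component named in the abstract, the sketch below writes out a Weibull proportional hazards model on a low dimensional covariate representation. The abstract does not state a parametrisation, so the shape/scale form used here, the function names, and the symbols `z` (latent covariates) and `beta` (regression weights) are all assumptions made for illustration, not the authors' implementation. The GPLVM part is not shown.

```python
import numpy as np

def weibull_ph_hazard(t, z, beta, shape, scale):
    """Hazard h(t | z) = (shape/scale) * (t/scale)**(shape-1) * exp(z @ beta).

    Assumed parametrisation; z holds the low dimensional covariates.
    """
    baseline = (shape / scale) * (t / scale) ** (shape - 1.0)
    return baseline * np.exp(z @ beta)

def weibull_ph_survival(t, z, beta, shape, scale):
    """Survival S(t | z) = exp(-(t/scale)**shape * exp(z @ beta))."""
    return np.exp(-((t / scale) ** shape) * np.exp(z @ beta))

def weibull_ph_loglik(t, event, z, beta, shape, scale):
    """Right-censored log-likelihood: sum(event * log h + log S).

    event is 1 for an observed failure and 0 for a censored time.
    """
    log_h = np.log(weibull_ph_hazard(t, z, beta, shape, scale))
    log_s = np.log(weibull_ph_survival(t, z, beta, shape, scale))
    return np.sum(event * log_h + log_s)
```

With `shape = 1` the baseline hazard is constant, so the model reduces to an exponential proportional hazards model, which is a convenient sanity check.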
Principal Component Analysis as a Tool for Characterizing Black Hole Images and Variability
We explore the use of principal component analysis (PCA) to characterize
high-fidelity simulations and interferometric observations of the millimeter
emission that originates near the horizons of accreting black holes. We show
mathematically that the Fourier transforms of eigenimages derived from PCA
applied to an ensemble of images in the spatial domain are identical to the
eigenvectors of PCA applied to the ensemble of the Fourier transforms of the
images, which suggests that this approach may be applied to modeling the sparse
interferometric Fourier visibilities produced by an array such as the Event
Horizon Telescope (EHT). We also show that the simulations in the spatial
domain themselves can be compactly represented with a PCA-derived basis of
eigenimages, allowing for detailed comparisons between variable observations and
time-dependent models, as well as for detection of outliers or rare events
within a time series of images. Furthermore, we demonstrate that the spectrum
of PCA eigenvalues is a diagnostic of the power spectrum of the structure and,
hence, of the underlying physical processes in the simulated and observed
images.
Comment: 16 pages, 17 figures, submitted to Ap
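The Fourier-domain equivalence stated in the abstract can be checked numerically. The sketch below is a minimal illustration on random arrays standing in for images, not the simulations or data from the paper. It assumes the unitary FFT normalisation (`norm="ortho"`); with an unnormalised transform the eigenvalues would differ by a constant factor. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, height, width = 20, 8, 8

# Stand-in ensemble of images (random data), centred on the ensemble mean.
images = rng.normal(size=(n_images, height, width))
images -= images.mean(axis=0)

# PCA in the spatial domain via SVD: rows of vh are the eigenimages.
A = images.reshape(n_images, -1)
_, s_spatial, vh = np.linalg.svd(A, full_matrices=False)
eigenimages = vh.reshape(-1, height, width)

# The same ensemble in the Fourier domain (unitary 2D FFT of each image).
F = np.fft.fft2(images, norm="ortho").reshape(n_images, -1)
s_fourier = np.linalg.svd(F, compute_uv=False)

# Covariance of the Fourier-domain ensemble: sum_i f_i f_i^H.
cov_fourier = F.T @ F.conj()

# Fourier transforms of the spatial eigenimages, one per row.
ft_eigenimages = np.fft.fft2(eigenimages, norm="ortho").reshape(len(eigenimages), -1)

# Each transformed eigenimage should satisfy C w = s^2 w.
residual = cov_fourier @ ft_eigenimages.T - ft_eigenimages.T * s_spatial**2
print("max eigenvector residual:", np.abs(residual).max())
print("spectra match:", np.allclose(s_spatial, s_fourier))
```

The check works because a unitary transform leaves inner products between images unchanged, so the two ensembles share the same eigenvalue spectrum and their principal components are related by that same transform.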