Guide to Spectral Proper Orthogonal Decomposition
This paper discusses the spectral proper orthogonal decomposition and its use in identifying modes, or structures, in flow data. A specific algorithm based on estimating the cross-spectral density tensor with Welch’s method is presented, and guidance is provided on selecting data sampling parameters and understanding tradeoffs among them in terms of bias, variability, aliasing, and leakage. Practical implementation issues, including dealing with large datasets, are discussed and illustrated with examples involving experimental and computational turbulent flow data.
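As a rough illustration of the Welch-based estimate described in this abstract, the following is a minimal sketch and not the paper's reference implementation. It assumes uniformly sampled snapshots stored as a real array of shape (n_t, n_x), a Hann window, unit quadrature weights, and no spectral scaling; the function name `spod` and its parameters are hypothetical.

```python
import numpy as np

def spod(q, n_fft, n_overlap, dt=1.0):
    """Sketch of SPOD via Welch-style block averaging (assumed setup, see text).

    q: real array (n_t, n_x), one snapshot per row.
    Returns frequencies, energies (n_freq, n_modes), modes (n_freq, n_x, n_modes).
    """
    q = np.asarray(q, dtype=float)
    q = q - q.mean(axis=0)                      # remove the temporal mean
    n_t, n_x = q.shape
    step = n_fft - n_overlap
    n_blk = (n_t - n_overlap) // step           # number of overlapping blocks
    n_modes = min(n_blk, n_x)                   # rank of the estimate is bounded by both
    window = np.hanning(n_fft)                  # Hann window to reduce leakage

    # Stack blocks as (n_blk, n_fft, n_x), window them, and FFT along time.
    blocks = np.stack([q[i * step:i * step + n_fft] for i in range(n_blk)])
    q_hat = np.fft.rfft(blocks * window[None, :, None], axis=1)
    freqs = np.fft.rfftfreq(n_fft, d=dt)

    energies = np.empty((freqs.size, n_modes))
    modes = np.empty((freqs.size, n_x, n_modes), dtype=complex)
    for k in range(freqs.size):
        x = q_hat[:, k, :].T                    # (n_x, n_blk) realizations at frequency k
        # Method of snapshots: eigen-decompose the small (n_blk x n_blk) matrix
        # instead of the (n_x x n_x) cross-spectral density estimate x x^H / n_blk.
        lam, theta = np.linalg.eigh(x.conj().T @ x / n_blk)
        order = np.argsort(lam)[::-1][:n_modes]
        lam, theta = lam[order], theta[:, order]
        energies[k] = lam
        # Map back to spatial modes; the guard only avoids division by zero.
        modes[k] = x @ theta / np.sqrt(np.maximum(lam, 1e-30) * n_blk)
    return freqs, energies, modes
```

In this sketch a longer `n_fft` gives finer frequency resolution but fewer blocks to average over, which is one form of the bias and variability tradeoff the abstract mentions.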
Spectral Sequence Motif Discovery
Sequence discovery tools play a central role in several fields of
computational biology. In the framework of Transcription Factor binding
studies, motif finding algorithms of increasingly high performance are required
to process the big datasets produced by new high-throughput sequencing
technologies. Most existing algorithms are computationally demanding and often
cannot support the large size of new experimental data. We present a new motif
discovery algorithm that is built on a recent machine learning technique,
referred to as Method of Moments. Based on spectral decompositions, this method
is robust under model misspecification and is not prone to locally optimal
solutions. We obtain an algorithm that is extremely fast and designed for the
analysis of big sequencing data. In a few minutes, we can process datasets of
hundreds of thousands of sequences and extract motif profiles that match those
computed by various state-of-the-art algorithms.
Comment: 20 pages, 3 figures, 1 table
Recursive Compressed Sensing
We introduce a recursive algorithm for performing compressed sensing on
streaming data. The approach consists of a) recursive encoding, where we sample
the input stream via overlapping windowing and make use of the previous
measurement in obtaining the next one, and b) recursive decoding, where the
signal estimate from the previous window is utilized in order to achieve faster
convergence in an iterative optimization scheme applied to decode the new one.
To remove estimation bias, a two-step estimation procedure is proposed
comprising support set detection and signal amplitude estimation. Estimation
accuracy is enhanced by a non-linear voting method and averaging estimates over
multiple windows. We analyze the computational complexity and estimation error,
and show that the normalized error variance asymptotically goes to zero for
sublinear sparsity. Our simulation results show a speed-up of an order of
magnitude over traditional CS, while obtaining significantly lower
reconstruction error under mild conditions on the signal magnitudes and the
noise level.
Comment: Submitted to IEEE Transactions on Information Theory
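The two-step idea in this abstract (detect the support set first, then estimate amplitudes on it) can be sketched as follows. This is an illustration under assumptions, not the authors' algorithm: it uses a plain LASSO solved by ISTA for the first pass, a fixed magnitude threshold for support detection, and ordinary least squares for the amplitudes. The names `ista_lasso` and `two_step_estimate` and the parameters `lam` and `threshold` are hypothetical.

```python
import numpy as np

def ista_lasso(a, y, lam, n_iter=500):
    """Assumed first pass: minimize 0.5*||a x - y||^2 + lam*||x||_1 with ISTA."""
    step = 1.0 / np.linalg.norm(a, 2) ** 2      # 1 / Lipschitz constant of the gradient
    x = np.zeros(a.shape[1])
    for _ in range(n_iter):
        z = x - step * a.T @ (a @ x - y)        # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return x

def two_step_estimate(a, y, lam, threshold):
    """Step 1: detect the support. Step 2: refit amplitudes by least squares."""
    x_lasso = ista_lasso(a, y, lam)
    support = np.flatnonzero(np.abs(x_lasso) > threshold)
    x_debiased = np.zeros_like(x_lasso)
    if support.size:
        # Least squares restricted to the detected support removes the
        # shrinkage that the l1 penalty applies to the nonzero amplitudes.
        coef, *_ = np.linalg.lstsq(a[:, support], y, rcond=None)
        x_debiased[support] = coef
    return x_lasso, x_debiased, support
```

The recursive parts of the method (overlapping windows, warm-starting from the previous window's estimate, voting across windows) are not shown here; a warm start would amount to initializing `x` in `ista_lasso` from the shifted previous estimate instead of zeros.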
On statistical approaches to generate Level 3 products from satellite remote sensing retrievals
Satellite remote sensing of trace gases such as carbon dioxide (CO2) has
increased our ability to observe and understand Earth's climate. However, these
remote sensing data, specifically Level 2 retrievals, tend to be irregular in
space and time, and hence, spatio-temporal prediction is required to infer
values at any location and time point. Such inferences are not only required to
answer important questions about our climate, but they are also needed for
validating the satellite instrument, since Level 2 retrievals are generally not
co-located with ground-based remote sensing instruments. Here, we discuss
statistical approaches to construct Level 3 products from Level 2 retrievals,
placing particular emphasis on the strengths and potential pitfalls when using
statistical prediction in this context. Following this discussion, we use a
spatio-temporal statistical modelling framework known as fixed rank kriging
(FRK) to obtain global predictions and prediction standard errors of
column-averaged carbon dioxide based on Version 7r and Version 8r retrievals
from the Orbiting Carbon Observatory-2 (OCO-2) satellite. The FRK predictions
allow us to validate statistically the Level 2 retrievals globally even though
the data are at locations and at time points that do not coincide with
validation data. Importantly, the validation takes into account the prediction
uncertainty, which is dependent both on the temporally-varying density of
observations around the ground-based measurement sites and on the
spatio-temporal high-frequency components of the trace gas field that are not
explicitly modelled. Here, for validation of remotely-sensed CO2 data, we
use observations from the Total Carbon Column Observing Network. We demonstrate
that the resulting FRK product based on Version 8r compares better with TCCON
data than that based on Version 7r.
Comment: 28 pages, 10 figures, 4 tables