The effect of missing data on robust Bayesian spectral analysis
This is the author accepted manuscript. The final version is available from the publisher via the DOI in this record.
Published in: Machine Learning for Signal Processing (MLSP), 2013 IEEE International Workshop on
Date of Conference: 22-25 Sept. 2013
We investigate the effects of missing observations on the
robust Bayesian model for spectral analysis introduced by
Christmas [2013]. The model assumes Student-t distributed
noise and uses an automatic relevance determination prior on
the precisions of the amplitudes of the component sinusoids,
and it is not obvious what effect these will have when some of
the otherwise temporally uniformly sampled data are missing.
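As a rough illustration of the setting, the sketch below fits a single sinusoid of known frequency to uniformly sampled data from which some samples have been removed. It is not the model of Christmas [2013]: ordinary least squares is swapped in for the Student-t noise model and the automatic relevance determination prior, and the frequency, amplitudes and missing-data pattern are all made-up values. It only shows that the fit is computed over the observed time indices.

```python
import math

def fit_sinusoid(times, values, omega):
    """Least-squares fit of y = a*cos(omega*t) + b*sin(omega*t).

    Only the (time, value) pairs passed in are used, so missing samples
    are handled simply by leaving them out of the sums.
    """
    scc = scs = sss = syc = sys_ = 0.0
    for t, y in zip(times, values):
        c = math.cos(omega * t)
        s = math.sin(omega * t)
        scc += c * c
        scs += c * s
        sss += s * s
        syc += y * c
        sys_ += y * s
    det = scc * sss - scs * scs
    if abs(det) < 1e-12:
        raise ValueError("observed samples do not determine both amplitudes")
    # Solve the 2x2 normal equations by Cramer's rule.
    a = (syc * sss - sys_ * scs) / det
    b = (sys_ * scc - syc * scs) / det
    return a, b

# Hypothetical example: 100 uniform samples, noise-free, with a gap
# (indices 40..59) and every seventh sample missing.
omega = 2 * math.pi * 0.05
true_a, true_b = 2.0, -1.0
obs_t = [t for t in range(100) if not (40 <= t < 60) and t % 7 != 0]
obs_y = [true_a * math.cos(omega * t) + true_b * math.sin(omega * t) for t in obs_t]

a_hat, b_hat = fit_sinusoid(obs_t, obs_y, omega)
```

With noise-free data the amplitudes are recovered despite the gap; the abstract's question is how a robust Bayesian model behaves in the harder, noisy case.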
Noise and nonlinearities in high-throughput data
High-throughput data analyses are becoming common in biology, communications,
economics and sociology. The vast amounts of data are usually represented in
the form of matrices and can be considered as knowledge networks. Spectra-based
approaches have proved useful in extracting hidden information within such
networks and for estimating missing data, but these methods are based
essentially on linear assumptions. The physical models of matching, when
applicable, often suggest non-linear mechanisms, that may sometimes be
identified as noise. The use of non-linear models in data analysis, however,
may require the introduction of many parameters, which lowers the statistical
weight of the model. According to the quality of data, a simpler linear
analysis may be more convenient than more complex approaches.
In this paper, we show how a simple non-parametric Bayesian model may be used
to explore the role of non-linearities and noise in synthetic and experimental
data sets.
Comment: 12 pages, 3 figures
Accounting for Calibration Uncertainties in X-ray Analysis: Effective Areas in Spectral Fitting
While considerable advances have been made in accounting for statistical
uncertainties in astronomical analyses, systematic instrumental uncertainties
have been generally ignored. This can be crucial to a proper interpretation of
analysis results because instrumental calibration uncertainty is a form of
systematic uncertainty. Ignoring it can underestimate error bars and introduce
bias into the fitted values of model parameters. Accounting for such
uncertainties currently requires extensive case-specific simulations if using
existing analysis packages. Here we present general statistical methods that
incorporate calibration uncertainties into spectral analysis of high-energy
data. We first present a method based on multiple imputation that can be
applied with any fitting method, but is necessarily approximate. We then
describe a more exact Bayesian approach that works in conjunction with a Markov
chain Monte Carlo based fitting. We explore methods for improving computational
efficiency, and in particular detail a method of summarizing calibration
uncertainties with a principal component analysis of samples of plausible
calibration files. This method is implemented using recently codified Chandra
effective area uncertainties for low-resolution spectral analysis and is
verified using both simulated and actual Chandra data. Our procedure for
incorporating effective area uncertainty is easily generalized to other types
of calibration uncertainties.
Comment: 61 pages double spaced, 8 figures, accepted for publication in Ap
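The multiple-imputation idea can be sketched as follows: fit the spectrum once per sampled calibration file, then pool the fits. The abstract does not state the paper's combining rule, so this sketch assumes the standard Rubin's rules for multiple imputation; the function name and the numbers are hypothetical.

```python
from statistics import mean, variance

def combine_imputations(estimates, variances):
    """Pool M per-calibration fits of one parameter with Rubin's rules.

    estimates: fitted parameter value under each sampled calibration file
    variances: squared statistical error reported by each fit
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two (estimate, variance) pairs")
    pooled = mean(estimates)          # combined point estimate
    within = mean(variances)          # average statistical variance
    between = variance(estimates)     # spread across calibration draws (M-1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical fits of one spectral parameter under 4 calibration draws.
est = [1.0, 1.2, 0.8, 1.0]
var = [0.04, 0.04, 0.04, 0.04]
pooled, total = combine_imputations(est, var)
```

The total variance exceeds the average per-fit variance whenever the fits disagree across calibration draws, which is how ignoring calibration uncertainty leads to underestimated error bars.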
DART-ID increases single-cell proteome coverage.
Analysis by liquid chromatography and tandem mass spectrometry (LC-MS/MS) can identify and quantify thousands of proteins in microgram-level samples, such as those comprised of thousands of cells. This process, however, remains challenging for smaller samples, such as the proteomes of single mammalian cells, because reduced protein levels reduce the number of confidently sequenced peptides. To alleviate this reduction, we developed Data-driven Alignment of Retention Times for IDentification (DART-ID). DART-ID implements principled Bayesian frameworks for global retention time (RT) alignment and for incorporating RT estimates towards improved confidence estimates of peptide-spectrum-matches. When applied to bulk or to single-cell samples, DART-ID increased the number of data points by 30-50% at 1% FDR, and thus decreased missing data. Benchmarks indicate excellent quantification of peptides upgraded by DART-ID and support their utility for quantitative analysis, such as identifying cell types and cell-type specific proteins. The additional datapoints provided by DART-ID boost the statistical power and double the number of proteins identified as differentially abundant in monocytes and T-cells. DART-ID can be applied to diverse experimental designs and is freely available at http://dart-id.slavovlab.net
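The idea of folding retention-time evidence into the confidence of a peptide-spectrum match can be illustrated with a plain application of Bayes' rule. This is a minimal sketch, not DART-ID's implementation: the Gaussian density for correct matches, the uniform density for incorrect ones, and all numeric values are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def update_pep(pep, rt_density_if_correct, rt_density_if_incorrect):
    """Posterior error probability of a match after observing its RT.

    pep is the prior probability that the match is incorrect; the two
    densities are the likelihoods of the observed RT under each case.
    """
    num = pep * rt_density_if_incorrect
    den = num + (1.0 - pep) * rt_density_if_correct
    if den == 0.0:
        return pep  # RT carries no usable evidence
    return num / den

# Hypothetical values, in minutes.
predicted_rt, observed_rt, rt_sd = 35.0, 35.2, 0.5
run_length = 60.0
p_correct = normal_pdf(observed_rt, predicted_rt, rt_sd)
p_incorrect = 1.0 / run_length  # incorrect matches assumed uniform over the run
updated = update_pep(0.5, p_correct, p_incorrect)
```

A match whose observed RT lies close to the aligned prediction ends up with a lower error probability, so more matches can pass a fixed FDR threshold.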
Recognition of Harmonic Sounds in Polyphonic Audio using a Missing Feature Approach: Extended Report
A method based on local spectral features and missing feature techniques
is proposed for the recognition of harmonic sounds in mixture
signals. A mask estimation algorithm is proposed for identifying
spectral regions that contain reliable information for each sound
source and then bounded marginalization is employed to treat the
feature vector elements that are determined as unreliable. The proposed
method is tested on musical instrument sounds due to the
extensive availability of data, but it can be applied to other sounds
(e.g. animal sounds, environmental sounds) whenever these are harmonic.
In simulations the proposed method clearly outperformed a
baseline method for mixture signals.
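Bounded marginalization can be sketched as follows, assuming a diagonal Gaussian model per sound class over non-negative spectral features. Reliable features contribute the model density at the observed value; unreliable features contribute the probability mass between zero and the observed value, since the source's own energy should not exceed that of the mixture. The class models and feature values are hypothetical, and the report's actual models and mask estimation are not reproduced here.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def log_likelihood(obs, mask, means, sigmas):
    """Log-likelihood of one feature vector under a diagonal Gaussian.

    mask[i] is True where feature i is judged reliable. Unreliable
    features are integrated over the interval [0, obs[i]].
    """
    total = 0.0
    for x, reliable, mu, s in zip(obs, mask, means, sigmas):
        if reliable:
            p = normal_pdf(x, mu, s)
        else:
            p = normal_cdf(x, mu, s) - normal_cdf(0.0, mu, s)
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total

# Hypothetical example: the second feature is masked by another source.
obs = [5.0, 9.0]
mask = [True, False]
score_a = log_likelihood(obs, mask, means=[5.0, 3.0], sigmas=[1.0, 1.0])
score_b = log_likelihood(obs, mask, means=[8.0, 3.0], sigmas=[1.0, 1.0])
```

Here class A wins because it matches the reliable feature, while the inflated second feature penalizes neither class.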
SZ contribution to characterize the shape of galaxy cluster haloes
We present the ongoing activity to characterize the geometrical properties of the gas and dark matter haloes using multi-wavelength observations of galaxy clusters. The role of the SZ signal in describing the gas distribution is discussed for the pilot case of the CLASH object MACS J1206.2-0847.