16 research outputs found

    Parametric Power Spectral Density Analysis of Noise from Instrumentation in MALDI TOF Mass Spectrometry

    Noise in mass spectrometry can interfere with identification of the biochemical substances in the sample. For example, the electric motors and circuits inside the mass spectrometer or in nearby equipment generate random noise that may distort the true shape of mass spectra. This paper presents a stochastic signal processing approach to analyzing noise from electrical noise sources (i.e., noise from instrumentation) in MALDI TOF mass spectrometry. Noise from instrumentation was hypothesized to be a mixture of thermal noise, 1/f noise, and electric or magnetic interference in the instrument. Parametric power spectral density estimation was conducted to derive the power distribution of noise from instrumentation with respect to frequency. As expected, the experimental results show that noise from instrumentation contains 1/f noise and prominent periodic components in addition to thermal noise. These periodic components imply that the mass spectrometers used in this study may not be completely shielded from internal or external electrical noise sources. However, according to a simulation study of human plasma mass spectra, noise from instrumentation does not seem to affect mass spectra significantly. In conclusion, the analysis of noise from instrumentation using stochastic signal processing provides an intuitive perspective on how to quantify noise in mass spectrometry through spectral modeling.

    Mass spectrometry data mining for cancer detection

    Early detection of cancer is crucial for successful intervention strategies. Mass spectrometry-based high-throughput proteomics is recognized as a major breakthrough in cancer detection. Many machine learning methods have been used to construct classifiers based on mass spectrometry data for discriminating between cancer stages, yet the classifiers so constructed generally lack biological interpretability. To better assist clinical uses, a key step is to discover "biomarker signature profiles", i.e., combinations of a small number of protein biomarkers that strongly discriminate between cancer states. This dissertation introduces two innovative algorithms to automatically search for a signature and to construct a high-performance signature-based classifier for cancer discrimination tasks based on mass spectrometry data, such as data acquired by MALDI or SELDI techniques. Our first algorithm assumes that homogeneous groups of mass spectra can be modeled by (unknown) Gibbs distributions to generate an optimal signature and an associated signature-based classifier by robust log-likelihood analysis; our second algorithm uses a stochastic optimization algorithm to search for two lists of biomarkers, and then constructs a signature-based classifier. To support these two algorithms theoretically, this dissertation also studies the empirical probability distributions of mass spectrometry data and implements the actual fitting of Markov random fields to these high-dimensional distributions. We have validated our two signature discovery algorithms on several mass spectrometry datasets related to ovarian cancer and colorectal cancer patient groups. For these cancer discrimination tasks, our algorithms have yielded better classification performance than existing machine learning algorithms and, in addition, have generated more interpretable explicit signatures.

    Automated peak identification for time-of-flight mass spectroscopy

    The high-throughput capabilities of protein mass fingerprint measurements have made mass spectrometry one of the standard tools for proteomic research, such as biomarker discovery. However, the analysis of large raw data sets produced by time-of-flight (TOF) spectrometers creates a bottleneck in the discovery process. One specific challenge is the preprocessing and identification of mass peaks corresponding to important biological molecules. The accuracy of mass assignment is another limitation when comparing mass fingerprints with databases. We have developed an automated peak picking algorithm based on a maximum likelihood approach that effectively and efficiently detects peaks in a time-of-flight secondary ion mass spectrum. This approach produces maximum likelihood estimates of peak positions and amplitudes, and simultaneously develops estimates of the uncertainties in each of these quantities. We demonstrate that a Poisson process is involved in time-of-flight secondary ion mass spectrometry (TOF-SIMS), and the algorithm takes the character of the Poisson noise into account. Though this peak picking algorithm was initially developed for TOF-SIMS spectra, it can be extended to other types of TOF spectra provided the correct noise characteristics are considered. We have developed a peak alignment procedure that aligns peaks in different spectra. This is a crucial step for multivariate analysis, which is often used to distill useful information from complex spectra. We have designed a TOF-SIMS experiment that consists of various mixtures of three biomolecules as a model for more complicated biomarker discovery. The peak picking algorithm is applied to the collected spectra and detects peaks in them repeatably and accurately. We also show that there are patterns in the spectra of pure biomolecule samples. Furthermore, we show it is possible to infer the concentration ratios in the mixture samples by checking the strength of the patterns.

    Topics in learning sparse and low-rank models of non-negative data

    Advances in information and measurement technology have led to a surge in prevalence of high-dimensional data. Sparse and low-rank modeling can both be seen as techniques of dimensionality reduction, which is essential for obtaining compact and interpretable representations of such data. In this thesis, we investigate aspects of sparse and low-rank modeling in conjunction with non-negative data or non-negativity constraints. The first part is devoted to the problem of learning sparse non-negative representations, with a focus on how non-negativity can be taken advantage of. We work out a detailed analysis of non-negative least squares regression, showing that under certain conditions sparsity-promoting regularization, the approach advocated paradigmatically over the past years, is not required. Our results have implications for problems in signal processing such as compressed sensing and spike train deconvolution. In the second part, we consider the problem of factorizing a given matrix into two factors of low rank, out of which one is binary. We devise a provably correct algorithm computing such factorization whose running time is exponential only in the rank of the factorization, but linear in the dimensions of the input matrix. Our approach is extended to noisy settings and applied to an unmixing problem in DNA methylation array analysis. On the theoretical side, we relate the uniqueness of the factorization to Littlewood-Offord theory in combinatorics.

    Acoustic Waves

    The concept of an acoustic wave is a pervasive one, which emerges in any type of medium, from solids to plasmas, at length and time scales ranging from sub-micrometric layers in microdevices to seismic waves in the Sun's interior. This book presents several aspects of the active research ongoing in this field. Theoretical efforts are leading to a deeper understanding of phenomena, including in complicated environments like the solar surface boundary. Acoustic waves are a flexible probe to investigate the properties of very different systems, from thin inorganic layers to ripening cheese to biological systems. Acoustic waves are also a tool to manipulate matter, from the delicate evaporation of biomolecules to be analysed, to the phase transitions induced by intense shock waves. And a whole class of widespread microdevices, including filters and sensors, is based on the behaviour of acoustic waves propagating in thin layers. The search for better performance is driving the development of new materials for these devices, and of more refined tools for their analysis.

    Statistical methods for differential proteomics at peptide and protein level
