Memory Efficient Principal Component Analysis for the Dimensionality Reduction of Large Mass Spectrometry Imaging Data Sets
A memory-efficient algorithm for the computation of principal component
analysis (PCA) of large mass spectrometry imaging data sets is presented.
Mass spectrometry imaging (MSI) enables two- and three-dimensional
overviews of hundreds of unlabeled molecular species in complex samples
such as intact tissue. PCA, in combination with data binning or other
reduction algorithms, has been widely used in the unsupervised processing
of MSI data and as a dimensionality reduction method prior to clustering
and spatial segmentation. Standard implementations of PCA require
the data to be stored in random access memory. This imposes an upper
limit on the amount of data that can be processed, necessitating a
compromise between the number of pixels and the number of peaks to
include. With increasing interest in multivariate analysis of large
3D multislice data sets and ongoing improvements in instrumentation,
the ability to retain all pixels and many more peaks is increasingly
important. We present a new method which has no limitation on the
number of pixels and allows an increased number of peaks to be retained.
The new technique was validated against the MATLAB (The MathWorks
Inc., Natick, Massachusetts) implementation of PCA (princomp) and then
used to reduce, without discarding peaks or pixels, multiple
serial sections acquired from a single mouse brain which was too large
to be analyzed with princomp. Then, k-means clustering was performed
on the reduced data set. We further
demonstrate with simulated data of 83 slices, comprising 20 535
pixels per slice and equaling 44 GB of data, that the new method can
be used in combination with existing tools to process an entire organ.
MATLAB code implementing the memory efficient PCA algorithm is provided.
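
The abstract does not spell out the algorithm, so the following is only a
minimal sketch of one common way to keep PCA out of RAM, not the authors'
MATLAB implementation. It assumes the spectra arrive as blocks of shape
(pixels, peaks) and accumulates only a running sum and a peaks-by-peaks
cross-product matrix, so memory use does not grow with the number of pixels.
The names chunked_pca and project are hypothetical.

```python
import numpy as np


def chunked_pca(chunks, n_components):
    """PCA over an iterable of (pixels, peaks) blocks (illustrative sketch).

    Only a (peaks,) sum and a (peaks, peaks) cross-product matrix are kept
    in memory, regardless of how many pixels are streamed through.
    """
    n = 0
    total = None  # running sum of all spectra
    gram = None   # running sum of X^T X
    for block in chunks:
        block = np.asarray(block, dtype=np.float64)
        if total is None:
            n_peaks = block.shape[1]
            total = np.zeros(n_peaks)
            gram = np.zeros((n_peaks, n_peaks))
        n += block.shape[0]
        total += block.sum(axis=0)
        gram += block.T @ block
    mean = total / n
    # Sample covariance recovered from the accumulated sums.
    cov = (gram - n * np.outer(mean, mean)) / (n - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order], eigvals[order]


def project(block, mean, components):
    """Reduce one block of spectra to its principal component scores."""
    return (np.asarray(block, dtype=np.float64) - mean) @ components
```

Because project also works block by block, the reduced scores can be
written out one slice at a time and then passed to a clustering step such
as k-means.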
Two-Phase and Graph-Based Clustering Methods for Accurate and Efficient Segmentation of Large Mass Spectrometry Images
Clustering is widely used in MSI to segment anatomical features
and differentiate tissue types, but existing approaches are both
CPU- and memory-intensive, limiting their application to small, single
data sets. We propose a new approach that uses a graph-based algorithm
with a two-phase sampling method that overcomes this limitation. We
demonstrate the algorithm on a range of sample types and show that
it can segment anatomical features that are not identified using commonly
employed algorithms in MSI, and we validate our results on synthetic
MSI data. We show that the algorithm is robust to fluctuations in
data quality by successfully clustering data with a designed-in variance
using data acquired with varying laser fluence. Finally, we show that
this method is capable of generating accurate segmentations of large
MSI data sets acquired on the newest generation of MSI instruments
and evaluate these results by comparison with histopathology.
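
The abstract gives no algorithmic detail, so the sketch below only
illustrates the general idea of two-phase sampling: cluster a small random
sample first, then label every pixel against the result in chunks. It
deliberately swaps in plain k-means for the paper's graph-based algorithm,
and all function and parameter names are hypothetical.

```python
import numpy as np


def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; a stand-in for the graph-based step."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids


def two_phase_segment(spectra, k, sample_size, chunk=10000, seed=0):
    """Phase 1: cluster a random sample. Phase 2: label all pixels."""
    spectra = np.asarray(spectra, dtype=np.float64)
    rng = np.random.default_rng(seed)
    size = min(sample_size, len(spectra))
    idx = rng.choice(len(spectra), size=size, replace=False)
    centroids = kmeans(spectra[idx], k, seed=seed)

    labels = np.empty(len(spectra), dtype=np.int64)
    # Assign in chunks so the distance matrix stays small.
    for start in range(0, len(spectra), chunk):
        block = spectra[start:start + chunk]
        dist = ((block[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels[start:start + chunk] = dist.argmin(axis=1)
    return labels
```

The expensive clustering step only ever sees sample_size spectra, while the
second phase is a single pass whose memory use is bounded by the chunk size.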
Hyperspectral Visualization of Mass Spectrometry Imaging Data
The acquisition of localized molecular spectra with mass spectrometry
imaging (MSI) has great, but as yet not fully realized, potential
for biomedical diagnostics and research. The methodology generates
a series of mass spectra from discrete sample locations, which is
often analyzed by visually interpreting specifically selected images
of individual masses. We developed an intuitive color-coding scheme
based on hyperspectral imaging methods to generate a single overview
image of this complex data set. The image color-coding is based on
spectral characteristics, such that pixels with similar molecular
profiles are displayed with similar colors. This visualization strategy
was applied to results of principal component analysis, self-organizing
maps and t-distributed stochastic neighbor embedding. Our approach
for MSI data analysis, combining automated data processing, modeling
and display, is user-friendly and allows both the spatial and molecular
information to be visualized intuitively and effectively.
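
One simple way to obtain the property described above, that pixels with
similar molecular profiles receive similar colors, is to map a
three-dimensional embedding of each pixel (for example three PCA scores)
onto the red, green and blue channels. The sketch below assumes exactly
that mapping; the color scheme in the paper may differ, and scores_to_rgb
is a hypothetical name.

```python
import numpy as np


def scores_to_rgb(scores, height, width):
    """Turn per-pixel 3-D embedding scores into an RGB overview image.

    scores: array of shape (height * width, 3), one row per pixel in
    row-major order. Each column is rescaled independently to [0, 1].
    """
    scores = np.asarray(scores, dtype=np.float64)
    low = scores.min(axis=0)
    span = scores.max(axis=0) - low
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    rgb = (scores - low) / span
    return rgb.reshape(height, width, 3)
```

The returned array can be passed directly to an image display routine that
accepts floating-point RGB values in the range 0 to 1.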