Feature Detection Techniques for Preprocessing Proteomic Data
Numerous gel-based and nongel-based technologies are used to detect protein changes potentially
associated with disease. The raw data, however, are rife with technical and structural complexities, making statistical analysis a difficult task. Low-level analysis issues (including normalization, background correction, gel and/or spectral alignment, feature detection, and image registration) are substantial problems that need to be addressed, because any higher-level data analyses
are contingent on appropriate and statistically sound low-level procedures. Feature detection approaches are particularly interesting because they speed up subsequent calculations. The summary data corresponding to image features substantially reduce overall data size and structural complexity while retaining key information. In this paper, we focus
on recent advances in feature detection as a tool for preprocessing proteomic data.
This work highlights existing and newly developed feature detection algorithms for proteomic
datasets, particularly relating to time-of-flight mass spectrometry and two-dimensional gel electrophoresis. Note, however, that the associated data structures (i.e., spectral data and images
containing spots) used as input for these methods are obtained via all gel-based and nongel-based
methods discussed in this manuscript, and thus the discussed methods are likewise applicable
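As a rough illustration of what feature detection on spectral data can look like, the sketch below flags local maxima whose height above a median baseline exceeds a signal-to-noise threshold. It is a minimal example and not one of the algorithms surveyed in the paper; the function name `detect_peaks`, the MAD-based noise estimate, and the default `snr` value are all assumptions made for illustration.

```python
from statistics import median

def detect_peaks(intensities, snr=3.0):
    """Return indices of local maxima that stand out from the noise.

    Hypothetical helper: the baseline is the median intensity and the noise
    level is a robust estimate (1.4826 * median absolute deviation).
    """
    baseline = median(intensities)
    mad = median([abs(x - baseline) for x in intensities])
    noise = 1.4826 * mad if mad > 0 else 1e-12  # avoid division by zero
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        is_local_max = y > intensities[i - 1] and y >= intensities[i + 1]
        if is_local_max and (y - baseline) / noise >= snr:
            peaks.append(i)
    return peaks

# A toy spectrum: low-level noise with two clear peaks.
spectrum = [5, 6, 5, 4, 5, 30, 6, 5, 4, 5, 6, 5, 25, 5, 4]
peak_indices = detect_peaks(spectrum)
```

The returned indices are the kind of summary data the abstract describes: a short feature list in place of the full spectrum.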
Algorithm engineering for optimal alignment of protein structure distance matrices
Protein structural alignment is an important problem in computational
biology. In this paper, we present first successes on provably optimal pairwise
alignment of protein inter-residue distance matrices, using the popular Dali
scoring function. We introduce the structural alignment problem formally, which
enables us to express a variety of scoring functions used in previous work as
special cases in a unified framework. Further, we propose the first
mathematical model for computing optimal structural alignments based on dense
inter-residue distance matrices. We therefore reformulate the problem as a
special graph problem and give a tight integer linear programming model. We
then present algorithm engineering techniques to handle the huge integer linear
programs of real-life distance matrix alignment problems. Applying these
techniques, we can compute provably optimal Dali alignments for the very first
time.
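To make the objective concrete, the sketch below evaluates a Dali-style score for a given alignment of two inter-residue distance matrices. The abstract does not state the formula; the elastic similarity form used here and the constants `THETA = 0.2` and `ALPHA = 20.0` (angstroms) follow the commonly cited definition by Holm and Sander and should be read as assumptions. The function only scores a fixed alignment and does not implement the integer linear programming model the paper describes.

```python
import math

THETA = 0.2   # assumed similarity threshold
ALPHA = 20.0  # assumed distance scale in angstroms

def dali_score(dist_a, dist_b, alignment):
    """Score an alignment given two square distance matrices.

    alignment is a list of (i, j) pairs matching residue i of structure A
    to residue j of structure B. The name and signature are hypothetical.
    """
    total = 0.0
    for ia, ib in alignment:
        for ja, jb in alignment:
            if ia == ja:
                total += THETA  # diagonal term: a residue paired with itself
                continue
            da, db = dist_a[ia][ja], dist_b[ib][jb]
            mean = 0.5 * (da + db)
            if mean == 0.0:
                total += THETA  # degenerate case: both distances are zero
                continue
            # Relative deviation, down-weighted for distant residue pairs.
            total += (THETA - abs(da - db) / mean) * math.exp(-(mean / ALPHA) ** 2)
    return total

a = [[0.0, 3.8, 6.0], [3.8, 0.0, 3.8], [6.0, 3.8, 0.0]]
b = [[0.0, 3.8, 9.0], [3.8, 0.0, 3.8], [9.0, 3.8, 0.0]]
identity = [(0, 0), (1, 1), (2, 2)]
score_same = dali_score(a, a, identity)
score_diff = dali_score(a, b, identity)
```

Aligning a structure with itself scores higher than aligning it with the perturbed copy, because every distance deviation subtracts from the threshold term.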
Topological network alignment uncovers biological function and phylogeny
Sequence comparison and alignment has had an enormous impact on our
understanding of evolution, biology, and disease. Comparison and alignment of
biological networks will likely have a similar impact. Existing network
alignments use information external to the networks, such as sequence, because
no good algorithm for purely topological alignment has yet been devised. In
this paper, we present a novel algorithm, based solely on network topology, that
can be used to align any two networks. We apply it to biological networks to
produce by far the most complete topological alignments of biological networks
to date. We demonstrate that both species phylogeny and detailed biological
function of individual proteins can be extracted from our alignments.
Topology-based alignments have the potential to provide a completely new,
independent source of phylogenetic information. Our alignment of the
protein-protein interaction networks of two very different species--yeast and
human--indicates that even distant species share a surprising amount of network
topology with each other, suggesting broad similarities in internal cellular
wiring across all life on Earth.
Peak Alignment of Gas Chromatography-Mass Spectrometry Data with Deep Learning
We present ChromAlignNet, a deep learning model for alignment of peaks in Gas
Chromatography-Mass Spectrometry (GC-MS) data. In GC-MS data, a compound's
retention time (RT) may not stay fixed across multiple chromatograms. Using
GC-MS data for biomarker discovery requires aligning the RTs of identical
analytes across samples. Current methods of alignment are all based on a set of
formal, mathematical rules. We present a solution to GC-MS alignment using deep
learning neural networks, which are more adept at handling complex, fuzzy data sets. We
tested our model on several GC-MS data sets of various complexities and
analysed the alignment results quantitatively. We show the model has very good
performance (AUC for simple data sets and AUC for very
complex data sets). Further, our model easily outperforms existing algorithms
on complex data sets. Compared with existing methods, ChromAlignNet is very
easy to use as it requires no user input of reference chromatograms and
parameters. This method can easily be adapted to other similar data such as
those from liquid chromatography. The source code is written in Python and
available online.
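For contrast with the learned approach, the sketch below shows the kind of simple rule-based alignment the abstract alludes to: peaks from different samples are grouped when their retention times chain together within a fixed tolerance. This is not ChromAlignNet and is not taken from the paper; the function name `group_peaks_by_rt`, the `(sample_id, retention_time)` tuple layout, and the tolerance value are assumptions for illustration.

```python
def group_peaks_by_rt(peaks, tolerance=0.05):
    """Group peaks whose retention times lie within `tolerance` of a neighbour.

    peaks is a list of (sample_id, retention_time) tuples. Peaks are sorted
    by retention time and a new group starts whenever the gap to the
    previous peak exceeds the tolerance.
    """
    groups = []
    for peak in sorted(peaks, key=lambda p: p[1]):
        if groups and peak[1] - groups[-1][-1][1] <= tolerance:
            groups[-1].append(peak)
        else:
            groups.append([peak])
    return groups

peaks = [("s1", 5.01), ("s2", 5.03), ("s1", 7.50), ("s2", 7.58), ("s3", 5.02)]
groups = group_peaks_by_rt(peaks)
```

A fixed tolerance like this splits the two peaks near 7.5 into separate groups even if they come from the same analyte, which illustrates why hand-set rules struggle when retention times drift.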