We present ChromAlignNet, a deep learning model for alignment of peaks in Gas
Chromatography-Mass Spectrometry (GC-MS) data. In GC-MS data, a compound's
retention time (RT) may not stay fixed across multiple chromatograms. Using
GC-MS data for biomarker discovery requires aligning the RTs of identical
analytes across different samples. Current alignment methods are all based on a
set of formal, mathematical rules. We present a solution to GC-MS alignment
using deep neural networks, which are better suited to complex, fuzzy data
sets. We
tested our model on several GC-MS data sets of various complexities and
analysed the alignment results quantitatively. The model performs very well
(AUC ∼1 for simple data sets and AUC ∼0.85 for very complex data sets).
Furthermore, it easily outperforms existing algorithms
on complex data sets. Compared with existing methods, ChromAlignNet is very
easy to use, as it requires no user input of reference chromatograms or
parameters. The method can readily be adapted to other similar data, such as
those from liquid chromatography. The source code is written in Python and is
available online.