Nanopore resistive pulse sensors are emerging technologies for
single-molecule protein sequencing, but they can hardly detect small
post-translational modifications (PTMs) such as hydroxylation at the
single-molecule level. While a combination of surface-enhanced Raman
spectroscopy (SERS) with plasmonic nanopores can detect these small PTMs, the
blinking Raman peaks in single-molecule SERS spectra pose a major
challenge for data analysis and PTM identification. Herein, we developed and
validated a one-dimensional convolutional neural network (1D-CNN), named Single
Amino acid and Peptide Network (SAPNet), that distinguishes amino acids and
peptides from their modified forms, including hydroxylated and phosphorylated
forms, using their single-molecule SERS spectra. Our work combines cutting-edge plasmonic nanopore
technology for SERS signal acquisition and deep learning for fully automated
extraction of information from the SERS signals. The SAPNet model achieved an
overall accuracy of 99.66% for distinguishing amino acids from their
modified forms, and 98.38% for distinguishing peptides from their
post-translationally modified forms. The model also performed well when
evaluated on out-of-sample examples. Our work could benefit the early detection of diseases such as
cancers and Alzheimer's disease.

Comment: 20 pages, 5 figures, 2 tables