SAPNet: a deep learning model for identification of single-molecule peptide post-translational modifications with surface enhanced Raman spectroscopy

Abstract

Nanopore resistive pulse sensors are an emerging technology for single-molecule protein sequencing, but they can hardly detect small post-translational modifications (PTMs) such as hydroxylation at the single-molecule level. Combining surface-enhanced Raman spectroscopy (SERS) with plasmonic nanopores makes these small PTMs detectable, but the blinking Raman peaks in single-molecule SERS spectra pose a major challenge for data analysis and PTM identification. Herein, we developed and validated a one-dimensional convolutional neural network (1D-CNN), named Single Amino acid and Peptide Network (SAPNet), that distinguishes amino acids and peptides from their post-translationally modified counterparts, including hydroxylated and phosphorylated forms, using their single-molecule SERS spectra. Our work combines cutting-edge plasmonic nanopore technology for SERS signal acquisition with deep learning for fully automated extraction of information from the SERS signals. The SAPNet model achieved an overall accuracy of 99.66% in distinguishing amino acids from their modified forms, and 98.38% in distinguishing peptides from their post-translationally modified forms. The model also performed well on out-of-sample examples. Our work could benefit the early detection of diseases such as cancers and Alzheimer's disease.

Comment: 20 pages, 5 figures, 2 tables
