Phosphorylation is central to numerous fundamental cellular processes,
influencing the onset and progression of a variety of diseases. Accurate
identification of phosphorylation sites is essential for unraveling
the intricate molecular mechanisms within cells and during viral infections,
potentially leading to the discovery of new therapeutic targets. In this study,
we introduce PTransIPs, a novel deep learning model for the identification of
phosphorylation sites. PTransIPs treats amino acids within protein sequences as
words, extracting unique encodings based on their type and sequential position.
The model also incorporates embeddings from large pretrained protein models as
additional data inputs. PTransIPs is then trained on a combined model
consisting of a convolutional neural network with residual connections and a
Transformer equipped with multi-head attention mechanisms. Finally, the model
outputs classification results through a fully connected layer. The results of
independent testing reveal that PTransIPs outperforms existing
state-of-the-art (SOTA) methods, achieving AUROCs of 0.9232 and 0.9660 for
identifying phosphorylated S/T and Y sites, respectively. In addition, ablation
studies show that pretrained model embeddings contribute to the performance of
PTransIPs. Furthermore, PTransIPs exhibits interpretable amino acid preferences
and a transparent training process, and it generalizes to other bioactivity
classification tasks. To facilitate usage, our code and data are publicly
accessible at \url{https://github.com/StatXzy7/PTransIPs}.
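
As a rough illustration of treating amino acids as words, the sketch below maps each residue in a peptide window to an integer token (its type) and an integer index (its sequential position). This is not the released PTransIPs code: the vocabulary order, the reserved index 0 for unknown residues, the function name `encode_window`, and the example peptide are all assumptions made for illustration. In a model of this kind, such indices would typically be passed to learned embedding layers.

```python
# Hypothetical vocabulary: the 20 standard amino acids, with index 0 reserved
# for unknown residues. The index order is an assumption, not the paper's.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOKEN_ID = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}
UNKNOWN_ID = 0


def encode_window(sequence):
    """Return (token_ids, position_ids) for a peptide window.

    token_ids encode the residue type; position_ids encode where each
    residue sits in the window, counting from 0.
    """
    token_ids = [TOKEN_ID.get(aa, UNKNOWN_ID) for aa in sequence.upper()]
    position_ids = list(range(len(token_ids)))
    return token_ids, position_ids


# Illustrative peptide only; not taken from the paper's datasets.
tokens, positions = encode_window("RRASVA")
print(tokens)     # residue-type indices
print(positions)  # sequential positions
```

Keeping type and position as separate index lists mirrors the description above, where each residue's encoding depends on both its identity and its place in the sequence.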