24,660 research outputs found
Convolutional LSTM Networks for Subcellular Localization of Proteins
Machine learning is widely used to analyze biological sequence data.
Non-sequential models such as SVMs or feed-forward neural networks are often
used although they have no natural way of handling sequences of varying length.
Recurrent neural networks such as the long short term memory (LSTM) model on
the other hand are designed to handle sequences. In this study we demonstrate
that LSTM networks predict the subcellular location of proteins given only the
protein sequence with high accuracy (0.902) outperforming current state of the
art algorithms. We further improve the performance by introducing convolutional
filters and experiment with an attention mechanism which lets the LSTM focus on
specific parts of the protein. Lastly we introduce new visualizations of both
the convolutional filters and the attention mechanisms and show how they can be
used to extract biological relevant knowledge from the LSTM networks
- …