152 research outputs found

    FaaPred: A SVM-Based Prediction Method for Fungal Adhesins and Adhesin-Like Proteins

    Adhesion is one of the initial stages of infection in microbial diseases and is mediated by adhesins. Hence, identifying and comprehensively characterizing adhesins and adhesin-like proteins is essential to understanding adhesin-mediated pathogenesis and exploiting its therapeutic potential. However, knowledge of fungal adhesins is rudimentary compared with that of bacterial adhesins. In addition to host cell attachment and mating, fungal adhesins play a significant role in homotypic and xenotypic aggregation, foraging, and biofilm formation. Experimental identification of fungal adhesins is labor- and time-intensive. In this work, we present a Support Vector Machine (SVM) based method for the prediction of fungal adhesins and adhesin-like proteins. The SVM models were trained with different compositional features, namely amino acid, dipeptide, and multiplet fractions, and charge and hydrophobic compositions, as well as PSI-BLAST-derived position-specific scoring matrices (PSSMs). The best classifiers are based on compositional properties as well as PSSMs and yield an overall accuracy of 86%. The prediction method based on the best classifiers is freely accessible as a web server at http://bioinfo.icgeb.res.in/faap. This work will aid rapid and rational identification of fungal adhesins, expedite the experimental characterization of novel fungal adhesins, and enhance our knowledge of the role of adhesins in fungal infections.
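
The compositional features named in this abstract can be illustrated with a minimal sketch. It computes only amino acid and dipeptide fractions. The function names, the 20-letter alphabet constant, and the choice to skip non-standard residues are assumptions made for illustration, not details taken from FaaPred.

```python
from itertools import product

# The 20 standard amino acids (assumed alphabet; other letters are skipped).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return 20 fractions, one per standard amino acid."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def dipeptide_composition(seq):
    """Return 400 fractions, one per ordered pair of adjacent residues."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # ignore pairs containing non-standard letters
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


def composition_features(seq):
    """Concatenate both compositions into a single 420-value vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

A vector like this could then be passed to any SVM implementation. The kernel, parameters, and remaining feature types used by the authors are not stated in the abstract.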

    Convolutional LSTM Networks for Subcellular Localization of Proteins

    Machine learning is widely used to analyze biological sequence data. Non-sequential models such as SVMs or feed-forward neural networks are often used, although they have no natural way of handling sequences of varying length. Recurrent neural networks such as the long short-term memory (LSTM) model, on the other hand, are designed to handle sequences. In this study, we demonstrate that LSTM networks predict the subcellular location of proteins from the protein sequence alone with high accuracy (0.902), outperforming current state-of-the-art algorithms. We further improve performance by introducing convolutional filters, and we experiment with an attention mechanism that lets the LSTM focus on specific parts of the protein. Lastly, we introduce new visualizations of both the convolutional filters and the attention mechanism and show how they can be used to extract biologically relevant knowledge from the LSTM networks.

    Selection of T cell epitopes from S. mansoni Sm23 protein as a vaccine construct, using Immunoinformatics approach

    Schistosomiasis, a neglected tropical disease and the most prevalent one after malaria, has been a threat to people living in endemic areas. Given possible resistance to praziquantel, the drug commonly used to treat schistosomiasis, the need for a permanent vaccination approach is justified. This study uses an in silico approach to identify potential vaccine candidates, namely T cell epitopes (epitopes that activate a T cell response), against schistosomiasis. Candidate T cell epitopes were identified from the Sm23 protein of Schistosoma mansoni using immunoinformatics tools. Nonameric epitopes such as 85YMYAFFLVV93, 83MLYMYAFFL91, 8MRCLKSCVF16, 41SQYGDNLHK49 and 104VAVVYKDRI112 were found to exhibit strong binding affinity with some human leukocyte antigen (HLA) molecules. The predicted epitopes were found to have no similarity with the human proteome, a desirable attribute of any good vaccine candidate. The predicted epitopes provide promising drug candidates and could be tested in the wet laboratory as a targeted vaccine against S. mansoni infection.
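
The epitope labels in this abstract (for example 85YMYAFFLVV93) follow a start-position, peptide, end-position pattern. The sketch below shows how every nonamer window of a sequence could be enumerated and labeled that way before being scored by a binding predictor. The function name and the toy sequence are hypothetical, and the sketch does not predict HLA binding.

```python
def labeled_windows(seq, k=9):
    """List every k-residue window as '<start><peptide><end>' (1-based positions)."""
    labels = []
    for i in range(len(seq) - k + 1):
        start = i + 1        # 1-based position of the first residue
        end = i + k          # 1-based position of the last residue
        labels.append(f"{start}{seq[i:i + k]}{end}")
    return labels


# Toy 11-residue sequence (not Sm23), giving three overlapping nonamers.
toy = "MKTAYIAKQRL"
windows = labeled_windows(toy)
```

Here `windows` holds `1MKTAYIAKQ9`, `2KTAYIAKQR10` and `3TAYIAKQRL11`. A real pipeline would pass each peptide to a dedicated binding-prediction tool, which the abstract does not name.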

    Incorporating significant amino acid pairs to identify O-linked glycosylation sites on transmembrane proteins and non-transmembrane proteins

    Background: While occurring enzymatically in biological systems, O-linked glycosylation affects protein folding, localization and trafficking, protein solubility, antigenicity, biological activity, as well as cell-cell interactions on membrane proteins. Catalytic enzymes involve glycotransferases, sugar-transferring enzymes and glycosidases which trim specific monosaccharides from precursors to form intermediate structures. Because experimental identification is difficult, several works have used computational methods to identify glycosylation sites. Results: By investigating glycosylated sites that contain various motifs in transmembrane (TM) and non-transmembrane (non-TM) proteins, this work presents a novel method, GlycoRBF, that implements radial basis function (RBF) networks with significant amino acid pairs (SAAPs) for identifying O-linked glycosylated serine and threonine on TM and non-TM proteins. Additionally, membrane topology is considered to reduce false positives on glycosylated TM proteins. In an evaluation using five-fold cross-validation, considering membrane topology reduced false positives by 31.4% when identifying O-linked glycosylation sites on TM proteins. In an independent test, GlycoRBF outperformed previous O-linked glycosylation site prediction schemes. Conclusion: A case study of cyclic AMP-dependent transcription factor ATF-6 alpha was presented to demonstrate the effectiveness of GlycoRBF. Web-based GlycoRBF, which can be accessed at http://GlycoRBF.bioinfo.tw, can identify O-linked glycosylated serine and threonine effectively and efficiently. Moreover, the structural topology of TM proteins with glycosylation sites is provided to users. A stand-alone version of GlycoRBF is also available for high-throughput data analysis.

    Identification of potential tissue-specific cancer biomarkers and development of cancer versus normal genomic classifiers

    Machine learning techniques for cancer prediction and biomarker discovery can hasten cancer detection and significantly improve prognosis. Recent “OMICS” studies, which include a variety of cancer and normal tissue samples along with machine learning approaches, have the potential to further accelerate such discovery. To demonstrate this potential, 2,175 gene expression samples from nine tissue types were obtained to identify gene sets whose expression is characteristic of each cancer class. Using random forests classification and ten-fold cross-validation, we developed nine single-tissue classifiers, two multi-tissue cancer-versus-normal classifiers, and one multi-tissue normal classifier. Given a sample of a specified tissue type, the single-tissue models classified samples as cancer or normal with a testing accuracy between 85.29% and 100%. Given a sample of non-specific tissue type, the multi-tissue bi-class model classified the sample as cancer versus normal with a testing accuracy of 97.89%. Given a sample of non-specific tissue type, the multi-tissue multi-class model classified the sample as cancer versus normal and as a specific tissue type with a testing accuracy of 97.43%. Given a normal sample of any of the nine tissue types, the multi-tissue normal model classified the sample as a particular tissue type with a testing accuracy of 97.35%. The machine learning classifiers developed in this study identify potential cancer biomarkers with sensitivity and specificity that exceed those of existing biomarkers and point to pathways that are critical to tissue-specific tumor development. This study demonstrates the feasibility of predicting the tissue origin of carcinoma in the context of multiple cancer classes.
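
Ten-fold cross-validation, as mentioned in this abstract, partitions the samples into ten folds and holds each fold out once for testing. A minimal sketch of the index bookkeeping follows. The function name, the fixed seed, and the plain shuffled (unstratified) split are assumptions, since the abstract does not say how the folds were built.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)      # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        yield train, test
```

Each sample lands in exactly one test fold, so averaging the per-fold accuracy uses every sample once for testing.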

    Classification of patients with parkinsonian syndromes using medical imaging and artificial intelligence algorithms

    Distinguishing Parkinsonian Syndromes (PS) is challenging because their symptoms and signs are similar at early stages of disease, which creates a need for accurate methods of differential diagnosis at those stages. Artificial intelligence is a useful tool for improving the evaluation of medical images. Parkinson’s Disease (PD), the most common PS, is characterized by the degeneration of dopamine neurons in the substantia nigra, which is detected by the dopamine transporter scan (DaTscan™), a single-photon emission computed tomography (SPECT) exam that uses a radiotracer that binds dopamine transporters. This exam made it possible to identify a sub-group of PD patients known as “Scans without evidence of dopaminergic deficit” (SWEDD), who present a normal exam, unlike other PD patients. In this study, an approach based on Convolutional Neural Networks (CNNs) was proposed for classifying PD patients, SWEDD patients and healthy subjects using SPECT and Magnetic Resonance Imaging (MRI) images. These images were divided into subsets of axial slices containing particular regions of interest, since 2D images are the norm in clinical practice. The classifiers were evaluated with Cohen’s Kappa and the Receiver Operating Characteristic (ROC) curve. The results show that a CNN using imaging information from the basal ganglia and the mesencephalon was able to distinguish PD patients from healthy subjects, achieving 97.4% accuracy using MRI and 92.4% using SPECT, and PD from SWEDD patients, with 97.3% accuracy using MRI and 93.3% using SPECT. However, the same approach could not discriminate SWEDD patients from healthy subjects (60% accuracy) using DaTscan™ and MRI. These results suggest that this approach may be a useful tool to aid PD diagnosis in the future.
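
Cohen's Kappa, one of the evaluation measures named in this abstract, corrects observed agreement for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a plain-Python illustration. The function name and the choice to return NaN when chance agreement equals 1 are assumptions, not details from the study.

```python
import math
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa between two equal-length label sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of positions where the labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[label] * pred_counts[label] for label in true_counts) / (n * n)
    if p_e == 1.0:
        return math.nan  # undefined when both raters use a single identical label
    return (p_o - p_e) / (1.0 - p_e)
```

Perfect agreement gives 1.0, and agreement no better than chance gives 0.0, which is why kappa is more informative than raw accuracy when the classes are imbalanced.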

    Identification of membrane protein types via deep residual hypergraph neural network

    A membrane protein's functions are significantly associated with its type, so it is crucial to identify the types of membrane proteins. Conventional computational methods for identifying the types of membrane proteins tend to ignore two issues, high-order correlation among membrane proteins and multi-modal representations of membrane proteins, which leads to information loss. To tackle these two issues, in this paper we propose a deep residual hypergraph neural network (DRHGNN), which enhances the hypergraph neural network (HGNN) with initial residual and identity mapping. We carried out extensive experiments on four benchmark datasets of membrane proteins and compared DRHGNN with recently developed advanced methods. Experimental results showed that DRHGNN performed better on the membrane protein classification task on all four datasets. Experiments also showed that, compared with HGNN, DRHGNN can handle the over-smoothing issue as the number of model layers increases. The code is available at https://github.com/yunfighting/Identification-of-Membrane-Protein-Types-via-deep-residual-hypergraph-neural-network